Thursday, September 22, 2022, 7:20:32

I just downloaded fdupes and was giving it a try. I am curious how the software determines which file it lists first when it finds several duplicates of the same file. I am running:



Distributor ID: Ubuntu
Description: Ubuntu 12.04.3 LTS
Release: 12.04
Codename: precise


Here is the command that I ran.



fdupes -Nrd /backup/local/fileserver_backup/home


In that "home" directory there are two directories with identical content (I used cp -r ./sam ./sam1):



sam/...



sam1/...



With the command above, I found that all the files were left in sam. But when I tried to run the same command with the following directory structure:



sa/...



sam/...



I found that all the files were still left in sam, not sa as I expected.



My questions are:




  • Does fdupes always keep the oldest file?

  • How does it sort the files when finding the first and all subsequent duplicates?

  • Is this OS dependent?

  • Is this something that the user can control?



I have some 300000 lines of duplicate files. Being able to give the software some guidance, such as "always keep files in this directory when given the choice, skip if not available", would be a great addition.



 Answers

Here's a test I performed:



$ ls -lt -u -r */*.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample0.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample2.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample3.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 002/sample2.mp3
$ ls -lt -c -r */*.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 9 23:39 001/sample0.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 00:14 001/sample2.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 00:20 002/sample2.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 01:02 001/sample3.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 01:08 001/sample.mp3
$ ls -t -1r */*.mp3
001/sample0.mp3
001/sample3.mp3
001/sample2.mp3
002/sample2.mp3
001/sample.mp3
$ fdupes -r . | grep mp3
./001/sample0.mp3
./001/sample3.mp3
./001/sample2.mp3
./002/sample2.mp3
./001/sample.mp3
$ touch -a 001/sample2.mp3
$ ls -lt -u -r */*.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample0.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample3.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 002/sample2.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 22:29 001/sample2.mp3
$ ls -lt -c -r */*.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 9 23:39 001/sample0.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 00:20 002/sample2.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 01:02 001/sample3.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 01:08 001/sample.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 22:29 001/sample2.mp3
$ ls -t -1r */*.mp3
001/sample0.mp3
001/sample3.mp3
001/sample2.mp3
002/sample2.mp3
001/sample.mp3
$ fdupes -r . | grep mp3
./001/sample0.mp3
./001/sample3.mp3
./001/sample2.mp3
./002/sample2.mp3
./001/sample.mp3
$ touch -m 001/sample3.mp3
$ ls -lt -u -r */*.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample0.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 001/sample3.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 11:49 002/sample2.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 22:32 001/sample2.mp3
$ ls -lt -c -r */*.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 9 23:39 001/sample0.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 00:20 002/sample2.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 10 01:08 001/sample.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 22:29 001/sample2.mp3
-rwxrwxr-x 1 hash hash 3416208 Jan 11 22:34 001/sample3.mp3
$ ls -t -1r */*.mp3
001/sample0.mp3
001/sample2.mp3
002/sample2.mp3
001/sample.mp3
001/sample3.mp3
$ fdupes -r . | grep mp3
./001/sample0.mp3
./001/sample2.mp3
./002/sample2.mp3
./001/sample.mp3
./001/sample3.mp3
$ fdupes -rd ./001/ ./002/
[1] ./001/sample0.mp3
[2] ./001/sample2.mp3
[3] ./002/sample2.mp3
[4] ./001/sample.mp3
[5] ./001/sample3.mp3

Set 1 of 1, preserve files [1 - 5, all]: 4

[-] ./001/sample0.mp3
[-] ./001/sample2.mp3
[-] ./002/sample2.mp3
[+] ./001/sample.mp3
[-] ./001/sample3.mp3


Conclusion:



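In this test, fdupes listed each set of duplicates in ascending order of modification time (mtime), matching the output of ls -t -1r. Changing only the access time (touch -a) left the order unchanged, while changing the modification time (touch -m) moved that file to the end of the list. So the first file in a set of duplicates is the one with the oldest mtime.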
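One way to cross-check this ordering without fdupes is to sort the files by mtime directly. This is a minimal sketch, assuming GNU stat (as shipped on Ubuntu) and file names without newlines; oldest_first is a hypothetical helper name, not part of fdupes.

```shell
# Print the given files one per line, oldest mtime first.
# Assumes GNU stat and file names that contain no newline characters.
oldest_first() {
    # %Y = mtime in seconds since the epoch, %n = file name.
    # Sort numerically on the timestamp (stable, so ties keep argument
    # order), then strip the timestamp column again.
    stat -c '%Y %n' -- "$@" | sort -s -k1,1n | cut -d' ' -f2-
}
```

Running oldest_first */*.mp3 in the test directory should give the same order as ls -t -1r, apart from how files with equal mtimes are ordered.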



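That means that if you run fdupes -rdN [directory] ..., the file with the oldest mtime in each set of duplicates is preserved and the rest are deleted. This would also explain the result in the question: cp -r without -p gives the copies a newer mtime than the originals, so the originals (presumably the files in sam) are kept regardless of how the directories are named.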
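Before deleting anything, it may help to preview which file of each set would survive. The sketch below reads plain fdupes -r output (one path per line, sets separated by blank lines) and labels each file. It assumes, as concluded above, that -N preserves the first file listed in a set. The optional preferred-directory argument is not an fdupes feature but a hypothetical illustration of the "always keep files in this directory" idea from the question. The name choose_keep and the keep/delete labels are made up for this example, and file names containing newlines are not handled.

```shell
# choose_keep [PREFERRED_PREFIX]
# Reads fdupes output on stdin and prints "keep" or "delete" per file.
# Without an argument it keeps the first file of every set; with one it
# keeps the first file whose path starts with PREFERRED_PREFIX and falls
# back to the first file when no path in the set matches.
choose_keep() {
    awk -v pref="$1" '
        function flush(   i, k) {
            if (n == 0) return
            k = 1                          # default: first file listed
            for (i = 1; i <= n; i++)
                if (pref != "" && index(set[i], pref) == 1) { k = i; break }
            for (i = 1; i <= n; i++)
                print (i == k ? "keep   " : "delete ") set[i]
            n = 0
        }
        /^$/ { flush(); next }             # blank line ends the current set
             { set[++n] = $0 }             # collect the paths of one set
        END  { flush() }                   # last set may lack a blank line
    '
}
```

For example, fdupes -r . | choose_keep ./sam/ only prints the labels; nothing is deleted. Pass the prefix with a trailing slash so that ./sam/ does not also match ./sam1/.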







Saturday, September 24, 2022