fdupes – A Command Line Tool to Find and Delete Duplicate Files in Linux

It is a common requirement to find and replace duplicate files for most of the computer users. Finding and removing duplicate files is a tiresome job that demands time and patience. Finding duplicate files can be very easy if your machine is powered by GNU/Linux, thanks to ‘fdupes‘ utility.

Find and Delete Duplicate Files in Linux
Fdupes – Find and Delete Duplicate Files in Linux

What is fdupes?

Fdupes is a Linux utility written by Adrian Lopez in C programming Language released under MIT License. The application is able to find duplicate files in the given set of directories and sub-directories. Fdupes recognize duplicates by comparing MD5 signature of files followed by a byte-to-byte comparison. A lots of options can be passed with Fdupes to list, delete and replace the files with hardlinks to duplicates.

The comparison starts in the order:

size comparison > Partial MD5 Signature Comparison > Full MD5 Signature Comparison > Byte-to-Byte Comparison.

Install fdupes on a Linux

Installation of latest version of fdupes (fdupes version 1.51) as easy as running following command on Debian based systems such as Ubuntu and Linux Mint.

$ sudo apt-get install fdupes

On CentOS/RHEL and Fedora based systems, you need to turn on epel repository to install fdupes package.

# yum install fdupes
# dnf install fdupes    [On Fedora 22 onwards]

Note: The default package manager yum is replaced by dnf from Fedora 22 onwards…

How to use fdupes command?

1. For demonstration purpose, let’s a create few duplicate files under a directory (say tecmint) simply as:

$ mkdir /home/"$USER"/Desktop/tecmint && cd /home/"$USER"/Desktop/tecmint && for i in {1..15}; do echo "I Love Tecmint. Tecmint is a very nice community of Linux Users." > tecmint${i}.txt ; done

After running above command, let’s verify the duplicates files are created or not using ls command.

$ ls -l

total 60
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint10.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint11.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint12.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint13.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint14.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint15.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint1.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint2.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint3.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint4.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint5.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint6.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint7.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint8.txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint9.txt

The above script create 15 files namely tecmint1.txt, tecmint2.txt…tecmint15.txt and every files contains the same data i.e.,

"I Love Tecmint. Tecmint is a very nice community of Linux Users."

2. Now search for duplicate files within the folder tecmint.

$ fdupes /home/$USER/Desktop/tecmint 

/home/tecmint/Desktop/tecmint/tecmint13.txt
/home/tecmint/Desktop/tecmint/tecmint8.txt
/home/tecmint/Desktop/tecmint/tecmint11.txt
/home/tecmint/Desktop/tecmint/tecmint3.txt
/home/tecmint/Desktop/tecmint/tecmint4.txt
/home/tecmint/Desktop/tecmint/tecmint6.txt
/home/tecmint/Desktop/tecmint/tecmint7.txt
/home/tecmint/Desktop/tecmint/tecmint9.txt
/home/tecmint/Desktop/tecmint/tecmint10.txt
/home/tecmint/Desktop/tecmint/tecmint2.txt
/home/tecmint/Desktop/tecmint/tecmint5.txt
/home/tecmint/Desktop/tecmint/tecmint14.txt
/home/tecmint/Desktop/tecmint/tecmint1.txt
/home/tecmint/Desktop/tecmint/tecmint15.txt
/home/tecmint/Desktop/tecmint/tecmint12.txt

3. Search for duplicates recursively under every directory including it’s sub-directories using the -r option.

It search across all the files and folder recursively, depending upon the number of files and folders it will take some time to scan duplicates. In that mean time, you will be presented with the total progress in terminal, something like this.

$ fdupes -r /home

Progress [37780/54747] 69%

4. See the size of duplicates found within a folder using the -S option.

$ fdupes -S /home/$USER/Desktop/tecmint

65 bytes each:                          
/home/tecmint/Desktop/tecmint/tecmint13.txt
/home/tecmint/Desktop/tecmint/tecmint8.txt
/home/tecmint/Desktop/tecmint/tecmint11.txt
/home/tecmint/Desktop/tecmint/tecmint3.txt
/home/tecmint/Desktop/tecmint/tecmint4.txt
/home/tecmint/Desktop/tecmint/tecmint6.txt
/home/tecmint/Desktop/tecmint/tecmint7.txt
/home/tecmint/Desktop/tecmint/tecmint9.txt
/home/tecmint/Desktop/tecmint/tecmint10.txt
/home/tecmint/Desktop/tecmint/tecmint2.txt
/home/tecmint/Desktop/tecmint/tecmint5.txt
/home/tecmint/Desktop/tecmint/tecmint14.txt
/home/tecmint/Desktop/tecmint/tecmint1.txt
/home/tecmint/Desktop/tecmint/tecmint15.txt
/home/tecmint/Desktop/tecmint/tecmint12.txt

5. You can see the size of duplicate files for every directory and subdirectories encountered within using the -S and -r options at the same time, as:

$ fdupes -Sr /home/avi/Desktop/

65 bytes each:                          
/home/tecmint/Desktop/tecmint/tecmint13.txt
/home/tecmint/Desktop/tecmint/tecmint8.txt
/home/tecmint/Desktop/tecmint/tecmint11.txt
/home/tecmint/Desktop/tecmint/tecmint3.txt
/home/tecmint/Desktop/tecmint/tecmint4.txt
/home/tecmint/Desktop/tecmint/tecmint6.txt
/home/tecmint/Desktop/tecmint/tecmint7.txt
/home/tecmint/Desktop/tecmint/tecmint9.txt
/home/tecmint/Desktop/tecmint/tecmint10.txt
/home/tecmint/Desktop/tecmint/tecmint2.txt
/home/tecmint/Desktop/tecmint/tecmint5.txt
/home/tecmint/Desktop/tecmint/tecmint14.txt
/home/tecmint/Desktop/tecmint/tecmint1.txt
/home/tecmint/Desktop/tecmint/tecmint15.txt
/home/tecmint/Desktop/tecmint/tecmint12.txt

107 bytes each:
/home/tecmint/Desktop/resume_files/r-csc.html
/home/tecmint/Desktop/resume_files/fc.html

6. Other than searching in one folder or all the folders recursively, you may choose to choose in two folders or three folders as required. Not to mention you can use option -S and/or -r if required.

$ fdupes /home/avi/Desktop/ /home/avi/Templates/

7. To delete the duplicate files while preserving a copy you can use the option ‘-d’. Extra care should be taken while using this option else you might end up loosing necessary files/data and mind it the process is unrecoverable.

$ fdupes -d /home/$USER/Desktop/tecmint

[1] /home/tecmint/Desktop/tecmint/tecmint13.txt
[2] /home/tecmint/Desktop/tecmint/tecmint8.txt
[3] /home/tecmint/Desktop/tecmint/tecmint11.txt
[4] /home/tecmint/Desktop/tecmint/tecmint3.txt
[5] /home/tecmint/Desktop/tecmint/tecmint4.txt
[6] /home/tecmint/Desktop/tecmint/tecmint6.txt
[7] /home/tecmint/Desktop/tecmint/tecmint7.txt
[8] /home/tecmint/Desktop/tecmint/tecmint9.txt
[9] /home/tecmint/Desktop/tecmint/tecmint10.txt
[10] /home/tecmint/Desktop/tecmint/tecmint2.txt
[11] /home/tecmint/Desktop/tecmint/tecmint5.txt
[12] /home/tecmint/Desktop/tecmint/tecmint14.txt
[13] /home/tecmint/Desktop/tecmint/tecmint1.txt
[14] /home/tecmint/Desktop/tecmint/tecmint15.txt
[15] /home/tecmint/Desktop/tecmint/tecmint12.txt

Set 1 of 1, preserve files [1 - 15, all]: 

You may notice that all the duplicates are listed and you are prompted to delete, either one by one or certain range or all in one go. You may select a range something like below to delete files files of specific range.

Set 1 of 1, preserve files [1 - 15, all]: 2-15

   [-] /home/tecmint/Desktop/tecmint/tecmint13.txt
   [+] /home/tecmint/Desktop/tecmint/tecmint8.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint11.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint3.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint4.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint6.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint7.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint9.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint10.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint2.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint5.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint14.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint1.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint15.txt
   [-] /home/tecmint/Desktop/tecmint/tecmint12.txt

8. From safety point of view, you may like to print the output of ‘fdupes’ to file and then check text file to decide what file to delete. This decrease chances of getting your file deleted accidentally. You may do:

$ fdupes -Sr /home > /home/fdupes.txt

Note: You may replace ‘/home’ with the your desired folder. Also use option ‘-r’ and ‘-S’ if you want to search recursively and Print Size, respectively.

9. You may omit the first file from each set of matches by using option ‘-f’.

First List files of the directory.

$ ls -l /home/$USER/Desktop/tecmint

total 20
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint9 (3rd copy).txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint9 (4th copy).txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint9 (another copy).txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint9 (copy).txt
-rw-r--r-- 1 tecmint tecmint 65 Aug  8 11:22 tecmint9.txt

and then omit the first file from each set of matches.

$ fdupes -f /home/$USER/Desktop/tecmint

/home/tecmint/Desktop/tecmint9 (copy).txt
/home/tecmint/Desktop/tecmint9 (3rd copy).txt
/home/tecmint/Desktop/tecmint9 (another copy).txt
/home/tecmint/Desktop/tecmint9 (4th copy).txt

10. Check installed version of fdupes.

$ fdupes --version

fdupes 1.51

11. If you need any help on fdupes you may use switch ‘-h’.

$ fdupes -h

Usage: fdupes [options] DIRECTORY...

 -r --recurse     	for every directory given follow subdirectories
                  	encountered within
 -R --recurse:    	for each directory given after this option follow
                  	subdirectories encountered within (note the ':' at
                  	the end of the option, manpage for more details)
 -s --symlinks    	follow symlinks
 -H --hardlinks   	normally, when two or more files point to the same
                  	disk area they are treated as non-duplicates; this
                  	option will change this behavior
 -n --noempty     	exclude zero-length files from consideration
 -A --nohidden    	exclude hidden files from consideration
 -f --omitfirst   	omit the first file in each set of matches
 -1 --sameline    	list each set of matches on a single line
 -S --size        	show size of duplicate files
 -m --summarize   	summarize dupe information
 -q --quiet       	hide progress indicator
 -d --delete      	prompt user for files to preserve and delete all
                  	others; important: under particular circumstances,
                  	data may be lost when using this option together
                  	with -s or --symlinks, or when specifying a
                  	particular directory more than once; refer to the
                  	fdupes documentation for additional information
 -N --noprompt    	together with --delete, preserve the first file in
                  	each set of duplicates and delete the rest without
                  	prompting the user
 -v --version     	display fdupes version
 -h --help        	display this help message

That’s for all now. Let me know how you were finding and deleting duplicates files till now in Linux? and also tell me your opinion about this utility. Put your valuable feedback in the comment section below and don’t forget to like/share us and help us get spread.

I am working on another utility called fslint to remove duplicate files, will soon post and you people will love to read.

Hey TecMint readers,

Exciting news! Every month, our top blog commenters will have the chance to win fantastic rewards, like free Linux eBooks such as RHCE, RHCSA, LFCS, Learn Linux, and Awk, each worth $20!

Learn more about the contest and stand a chance to win by sharing your thoughts below!

Avishek
A Passionate GNU/Linux Enthusiast and Software Developer with over a decade in the field of Linux and Open Source technologies.

Each tutorial at TecMint is created by a team of experienced Linux system administrators so that it meets our high-quality standards.

Join the TecMint Weekly Newsletter (More Than 156,129 Linux Enthusiasts Have Subscribed)
Was this article helpful? Please add a comment or buy me a coffee to show your appreciation.

28 Comments

Leave a Reply
  1. I am James. I have been looking at this and copying and pasting a lot of items, but I am not getting anywhere. After going through this, can you help?

    > Note: Please use the saved script below for removal, not the above output.

    ==> In total, 15,085 files were scanned, of which 9,221 are duplicates in 1,627 groups.
    ==> This equals 343.72 GB of duplicates which could be removed.
    ==> 2,828 other suspicious items found, which may vary in size.
    ==> Scanning took a total of 4h 32m 33.995s.
    Wrote a JSON file to: /home/james/rmlint.json
    Wrote a SH file to: /home/james/rmlint.sh
    

    I think this is the 2nd part that I have copied and pasted. May 22, 2024:

    /home/james/.mozilla/firefox/63dm9l4n.default-release/storage/default/https+++go.games4grandma.com/ls/usage
    /home/james/.mozilla/firefox/63dm9l4n.default-release/storage/default/https+++support.google.com/ls/usage
    
    /home/james/rmlint.2024-05-13T00:46:55Z.sh
    /home/james/rmlint.2024-05-13T01:29:40Z.sh
    /home/james/rmlint.2024-05-13T01:32:53Z.sh
    /home/james/rmlint.2024-05-13T01:36:44Z.sh
    
    /home/james/.thunderbird/74y4qawf.default-release/crashes/store.json.mozlz4
    /home/james/.mozilla/firefox/63dm9l4n.default-release/crashes/store.json.mozlz4
    
    /home/james/USB STICK/.mozilla/firefox/bjbrigvv.default-release/Telemetry.FailedProfileLocks.txt
    /home/james/.config/libreoffice/4/user/extensions/shared/lastsynchronized
    /home/james/.config/libreoffice/4/user/extensions/bundled/lastsynchronized
    
    

    I hope this helps!

    Reply
  2. Hello, and thank you for the insightful guide on using fdupes for managing duplicate files. I’ve found the information extremely helpful. However, I’ve encountered some unexpected behavior that I’m struggling to understand.

    I recently executed the command:

    bash
    fdupes -Sr /path1/to/directory_1 /path2/to/directory_2
    

    My expectation was that the output would prioritize listing duplicate files in /path1/to/directory_1 first, followed by duplicates in /path2/to/directory_2, based on the order in which the directories are specified in the command. But this isn’t what’s happening.

    Instead, the order of the output doesn’t seem to align with the order of the directory paths provided in the command. This is intriguing, as I was under the impression that fdupes would process the directories sequentially based on their order in the command line.

    Reply
    • @Antonio,

      It seems there might be a misunderstanding about how fdupes processes the directories. The order of the output isn’t necessarily tied to the order of the directories specified in the command. Instead, fdupes scans all the directories simultaneously and then lists the duplicates collectively, regardless of their original directory.

      Reply
  3. fdupes: Supplied with the serious version of Linux I have (and I suppose others) and has numerous contributors, therefore very well tested. There are lots of forks (see GitHub), which may or may not introduce as many bugs as enhancements to these versions :)

    Thanks to Tecmint for this page.

    Reply
  4. There are many special programs for removing duplicate files. I am using Manyprog Find Duplicate Files. It is very easy to use.

    Reply
  5. It might be a great option to tell preserve newer only when needed to mirror a directory structure while backing up a source to target with rsync and we have moved files in another folder on the source before a second backup.

    Reply
  6. I was always trying the best way to delete my duplicate files, but always something was piling the storage again. So I find efficient software (DuplicateFilesDeleter) to complete remove the duplicate data. I suggest it to anyone with this problem.

    Reply
  7. Great program! Wish it had a filter to exclude files under a certain size. I’m using this to clean up disc space, so its great if it would only report size > 1MB, etc.

    Reply
  8. Fdupes is a great tool.

    Is it possible to specify which of 2 folders is sacrificial?

    So, If I have 2 folders A & B and run Fdupes on them, I want any duplicate to be automatically deleted from B.

    Reply
  9. Please note it is not safe to remove all the duplicates CCleaner finds. The Duplicate Finder can search for files with the same File Name, Size, Modified Date and Content; however it isn’t able to determine which files are needed and which can be safely deleted.

    Reply
  10. This is great for systems I ssh into, thank you. Only shortcoming is that listing file sizes in bytes only is difficult to manage with many files of large size.

    Reply
  11. I miss the -L (–linkhard) Option, which generates a hardlink instead of deleting files. In Version 1.50 Pre2 I had it, but on newer Versions, this Option is gone.

    Reply
  12. I have tried this software known as Duplicate Files Deleter. It works really well. It is fast and easy and clears up a lot of space.

    Reply
  13. In section 7, you say -d specifies what to delete, when in fact, the prompt is to specify what to preserve. What is not preserved is deleted. In your example, there are 15 files that are identical. If I select 1, it is preserved and the other 14 are deleted.

    Reply
  14. A tool to find duplicates will be quite useful for us. Of course, it should not delete any files, but replace identical files with hard links to the same inode. (the files to be compared already have their MD5 checksums computed).

    Reply
    • Thanks @Dan the Man, for finding this post useful. You have the option to find the duplicates with hard link and then to delete or not-to-delete. This is a very nice utility and most importantly, the approach to find duplicates is very good. Thinking to write a tool of myself for this purpose.
      Keep connected for more such posts.

      Reply
  15. This is a great tool. It would be neat to have an option to replace the copy with links to the original. I suppose this could be scripted from the log file that is generated.

    Thanks!

    Reply
    • Dear JC,
      You can use the flag ‘-d’ to delete the duplicates. I will suggest you to go through point – 7.
      Keep connected for more such posts.

      Reply
    • Ah! that’s a conversion error. Thanks for your feedback. Asking admin to correct “Commandline” in heading as “command line” or “command-line”.
      Keep connected @xhark.

      Reply

Got Something to Say? Join the Discussion...

Thank you for taking the time to share your thoughts with us. We appreciate your decision to leave a comment and value your contribution to the discussion. It's important to note that we moderate all comments in accordance with our comment policy to ensure a respectful and constructive conversation.

Rest assured that your email address will remain private and will not be published or shared with anyone. We prioritize the privacy and security of our users.