fixdupes.py - what is it?

This is a simple Python program that takes a list of directories as command line arguments. It goes through the directories, collecting the md5 checksum of each file. It then lists the files that are determined to be the same, prompts for which one in each list is to be kept, and deletes the rest. There is the option to skip deletion.
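As a rough illustration of the checksum-grouping step described above, here is a minimal sketch. It is not the code from fixdupes.py: it assumes a modern Python 3 and uses the standard hashlib module in place of the md5sum program, and the function names file_md5 and find_duplicates are made up for this example. It only reports groups of identical files; the prompting and deletion are left out.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directories):
    """Map each checksum to the files sharing it (groups of two or more only)."""
    by_checksum = defaultdict(list)
    seen = set()  # real paths already hashed, so overlapping arguments are not counted twice
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                real = os.path.realpath(path)
                if real in seen:
                    continue
                seen.add(real)
                by_checksum[file_md5(path)].append(path)
    return {md5: sorted(paths) for md5, paths in by_checksum.items() if len(paths) > 1}


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python sketch.py dir1 dir2 ...
    for checksum, paths in find_duplicates(sys.argv[1:]).items():
        print(checksum)
        for path in paths:
            print("    " + path)
```

Matching on MD5 alone treats two files as the same whenever their checksums agree, which is the approach described above; a more cautious version could compare the file contents byte by byte before deleting anything.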

This is handy for sorting through large collections of mp3s, jpgs (for instance from a digicam) etc., and removing duplicates. The program is released under the GPL 'as is'.

The program was developed with Python 1.5.2 on RedHat Linux 7.0. It requires the snack module and the md5sum program to be installed (both should be on most recent GNU/Linux distributions).

On Debian, the python-newt package is what is needed to get it all up and running.

17/10/2004

Minor update - added the version that uses the Python md5 module, which I have been using for a while, plus verbose and text-report-only modes.

download