jamdupes

###Introduction jamdupes is a tool for finding duplicate files, inspired by fdupes, with added features and speed improvements.

###Speed improvements? Yes. Speed improvements. On my Raspberry Pi with USB hard drives, I observed the following:

$ time fdupes -R -S -n ~/files/ > /dev/null

real    2557m7.013s
user    1413m21.272s
sys     355m14.684s

$ time jamdupes -R -S -n ~/files/ > /dev/null

real    330m23.645s
user    133m30.824s
sys     37m45.276s

It took 5.5 hours instead of 42.6 hours. That's a 87% reduction!

###Okay, but how?

Sort files into buckets based on their size. The OS can calculate the size of files quickly, or maybe the data might be in RAM already!
Since any file that is unique in its filesize is clearly not a duplicate, you can safely ignore it and not hash it.
For really big files, hash the first few megabytes. Only hash the whole file if the first few megabytes match. This reflects the reality that headers and metadata are often at the beginning of a file. (Except for ZIP archives)

###Added features? Coming soon. See Issue tracker.

###Prerequisites python 3 (tested on Python 3.4.2 on armhf Linux and Python 3.5.1 on Darwin x86_64)

###Licence BSD-3

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
test_dir		test_dir
LICENSE		LICENSE
README.md		README.md
jamdupes		jamdupes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

jamdupes

About

Releases

Packages

Languages

License

jimmale/jamdupes

Folders and files

Latest commit

History

Repository files navigation

jamdupes

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages