Consider ways to make the listing operations (ls, dups, uniques) more efficient #46

@lispstudent

Description

Hello,

I am very glad to have found dupd, as it offers the best workflow for my use case.

I ran the following command on about 150 TB of data. It took about 70 hours:

# dupd scan --path /path1 --path /path2
Files:  2420698                           0 errors                       1354 s
                                                                              
Total duplicates: 2108486 files in 690968 groups in    238110 s
Run 'dupd report' to list duplicates

Then, I did:

cd /path2
dupd uniques

dupd started listing the files which are unique to /path2, but it is taking a very long time, with the CPU pegged at about 50%.

Is this normal? I thought that since the files have already been listed in a SQLite db, printing such a list would be fast?
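For context on why this may be slower than a plain db query: unique files are, by definition, the ones *not* recorded in the duplicates table, so listing them plausibly requires walking the filesystem again and checking each file against the database. A minimal sketch of that idea in Python follows; the `duplicates` table, its `paths` column, and the separator character are hypothetical illustrations, not dupd's actual schema.

```python
import os
import sqlite3

# Hypothetical schema: one row per duplicate group, with the member
# paths stored in a single separator-joined TEXT column. This is an
# illustration only, not dupd's real database layout.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE duplicates (id INTEGER PRIMARY KEY, paths TEXT)")
db.execute("INSERT INTO duplicates (paths) VALUES (?)",
           ("/path2/a.txt\x1c/path1/a.txt",))
db.commit()

def known_duplicates(conn):
    """Collect every path that appears in any duplicate group."""
    dupes = set()
    for (paths,) in conn.execute("SELECT paths FROM duplicates"):
        dupes.update(paths.split("\x1c"))
    return dupes

def uniques_under(conn, root):
    """List files under root that belong to no duplicate group.

    The database alone cannot enumerate unique files, since only
    duplicates are stored -- hence the full filesystem walk, which is
    I/O-bound and much slower than a pure SQL query.
    """
    dupes = known_duplicates(conn)
    result = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if full not in dupes:
                result.append(full)
    return result
```

Under this assumption, the 50% CPU with long wall-clock time would be consistent with the command waiting on disk reads rather than on SQLite.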
