The new fastdup V1.0 API keeps much of the existing interface but simplifies usage by removing the need to pass paths and parameters repeatedly.
In the V1.0 API, the input and work directories are set once at initialization. Parameters of the fastdup.run function are passed to the .run() method, following the same naming. Galleries and visualizations are grouped under .vis.
```python
import fastdup

fd = fastdup.create(work_dir="out", input_dir="/path/to/your/folder")
fd.run(nearest_neighbors_k=5, ccthreshold=0.96)

fd.vis.duplicates_gallery()  # create a visual gallery of found duplicates
fd.vis.outliers_gallery()    # create a visual gallery of anomalies
fd.vis.components_gallery()  # create a visualization of connected components
fd.vis.stats_gallery()       # create a visualization of image statistics (for example blur)
```
The previous (V0.2xx) API is still fully supported and no breaking changes were made.
To work with webdataset, tar, or zip files containing images, please use the V0.2 API.
```python
import fastdup

# Main running function.
fastdup.run(
    input_dir="/path/to/your/folder",
    work_dir="out",
    nearest_neighbors_k=5,
    turi_param="ccthreshold=0.96",
)

fastdup.create_duplicates_gallery("out/similarity.csv", save_path=".")  # create a visual gallery of found duplicates
fastdup.create_outliers_gallery("out/outliers.csv", save_path=".")      # create a visual gallery of anomalies
fastdup.create_components_gallery("out", save_path=".")                 # create a visualization of connected components
fastdup.create_stats_gallery("out", save_path=".", metric="blur")       # create a visualization of image statistics (for example blur)
```