Learn what outliers are, why they affect dataset quality, and how to detect and manage them using Visual Layer.
Outlier images are visuals that differ significantly from the majority of your dataset. These anomalies may affect model performance, skew results, or reveal labeling issues.
Outliers typically fall into one of the following categories:
| Category | Description |
| --- | --- |
| Domain or Content Outliers | Images that feel “out of place” compared to the rest of your data: from a different source, domain, or modality (e.g., a drawing in a photo dataset); captured in unexpected conditions (e.g., extreme lighting, occlusions); or showing objects in rare or ambiguous contexts. |
| Quality Outliers | Images that are technically flawed or inconsistent with the rest of the set: blurry, overexposed, or too dark; containing heavy compression artifacts or noise; or corrupted or visually incomplete. |
| Class (Label) Outliers | Images that don’t semantically match any class in your label vocabulary. True outliers: e.g., an image of a sheep in a dog/cat dataset, which doesn’t belong to any existing class. Out-of-distribution examples: e.g., a sketch of a dog in a dataset of real dog photos. These are different from mislabels, where an image carries the wrong label but still belongs to a known class. |
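To make the quality category above more concrete, here is a minimal sketch of how "too dark" and "overexposed" images could be flagged from average brightness. It is an illustration only, not how Visual Layer detects outliers: the function names, the nested-list image representation, and the two cutoff values are assumptions chosen for the example.

```python
from statistics import mean

# Hypothetical cutoffs on a 0-255 grayscale scale; tune them for your own data.
DARK_BELOW = 40
BRIGHT_ABOVE = 215


def mean_brightness(pixels):
    """Average intensity of a grayscale image given as rows of 0-255 values."""
    return mean(value for row in pixels for value in row)


def classify_exposure(pixels, dark_below=DARK_BELOW, bright_above=BRIGHT_ABOVE):
    """Label an image as 'too dark', 'overexposed', or 'ok' by mean brightness."""
    brightness = mean_brightness(pixels)
    if brightness < dark_below:
        return "too dark"
    if brightness > bright_above:
        return "overexposed"
    return "ok"


# Tiny 2x2 stand-ins for real images.
night_shot = [[0, 10], [20, 30]]
washed_out = [[250, 255], [240, 245]]
normal = [[100, 120], [140, 160]]

for name, image in [("night_shot", night_shot),
                    ("washed_out", washed_out),
                    ("normal", normal)]:
    print(name, classify_exposure(image))
```

Fixed cutoffs like these only catch the most obvious exposure problems. Blur, noise, and compression artifacts need other measurements.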
Visual Layer provides a one-click method for detecting and correcting outliers using automated issue detection.
Detect Outliers:
Go to “Add Filter” → “Outliers” → select “IS” as the logic operator → set the desired confidence threshold (default is 0.5).
Export the results using “Matching the applied filter.”
Correct Outliers:
Go to “Add Filter” → “Outliers” → select “IS NOT” as the logic operator → set the desired confidence threshold (default is 0.5).
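Export the results using “Matching the applied filter.” Because the filter is set to “IS NOT,” the export contains only the images that were not flagged as outliers.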
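The two filters above are complementary: “IS” isolates the flagged images for review, and “IS NOT” keeps everything else. The sketch below mimics that split on a small in-memory list. It assumes each image has an outlier confidence score between 0 and 1 and that a score at or above the threshold counts as an outlier. The record layout, the `outlier_confidence` field name, and the `>=` comparison are hypothetical and do not describe Visual Layer's export format or its exact threshold behavior.

```python
def split_by_outlier_confidence(records, threshold=0.5):
    """Split records into (outliers, non_outliers) using a confidence cutoff.

    Assumption: a score >= threshold means the image is treated as an outlier.
    """
    outliers = [r for r in records if r["outlier_confidence"] >= threshold]
    non_outliers = [r for r in records if r["outlier_confidence"] < threshold]
    return outliers, non_outliers


# Hypothetical records; the field names are illustrative only.
records = [
    {"image": "img_001.jpg", "outlier_confidence": 0.92},
    {"image": "img_002.jpg", "outlier_confidence": 0.10},
    {"image": "img_003.jpg", "outlier_confidence": 0.50},
    {"image": "img_004.jpg", "outlier_confidence": 0.31},
]

flagged, kept = split_by_outlier_confidence(records, threshold=0.5)
print("Review (IS):", [r["image"] for r in flagged])
print("Keep (IS NOT):", [r["image"] for r in kept])
```

Raising the threshold flags fewer images with higher confidence, while lowering it flags more images at the cost of more false positives.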
Managing outliers is an essential step in building reliable, balanced models.