Outliers
Learn what outliers are, why they affect dataset quality, and how to detect and manage them using Visual Layer.
What Are Outlier Images and Objects?
Outlier images and objects are samples that differ significantly from the majority of your dataset. They may look unusual because of their content, quality, or context, and they can make your data less representative.
These may include:
- Images from the wrong source or domain
- Objects captured in rare or unexpected ways
- Files with artifacts like blur, noise, or extreme lighting
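To make "differs significantly from the majority" concrete, here is a minimal sketch that scores each sample by its mean distance to its nearest neighbors and flags the ones that sit far from everything else. This is a generic illustration, not Visual Layer's detection method, which this page does not describe. The function names, the toy 2-D vectors standing in for image embeddings, and the `factor` threshold are all assumptions made for the example.

```python
import math
import statistics

def knn_outlier_scores(vectors, k=2):
    """Score each vector by its mean distance to its k nearest neighbours."""
    if not 1 <= k < len(vectors):
        raise ValueError("k must be between 1 and len(vectors) - 1")
    scores = []
    for i, v in enumerate(vectors):
        # Distances from v to every other vector, smallest first.
        dists = sorted(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def flag_outliers(vectors, k=2, factor=3.0):
    """Return indices whose score exceeds `factor` times the median score.

    The median-based threshold is an illustrative choice, not a standard.
    """
    scores = knn_outlier_scores(vectors, k)
    median = statistics.median(scores)
    return [i for i, s in enumerate(scores) if s > factor * median]

# Hypothetical toy "embeddings": four similar samples and one far away.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(flag_outliers(embeddings))  # [4]
```

In practice the vectors would come from an image embedding model, and the threshold would need tuning per dataset.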
Common Causes of Outliers
- Data collection errors: Samples from unrelated categories or domains may be incorrectly included.
- Artifacts and anomalies: Distortions like blur, noise, or overexposure can make an image an outlier.
- Rare instances: Rare objects, edge-case events, or unconventional perspectives may introduce visual outliers.
Why It Matters
| Problem | Impact |
|---|---|
| Reduced data quality | Outliers introduce noise and reduce consistency across your dataset. |
| Weaker model performance | Models trained on unfiltered outliers may generalize poorly or become unstable in production. |
| Hidden skew | Outliers may distort validation results or inflate perceived class diversity. |
How to Detect Outliers in Visual Layer
Visual Layer detects outliers as part of its automated issue detection. You can then filter for them and isolate them directly from the dataset interface.
Steps to Apply the Outlier Filter
1. Go to the Dataset Inventory and open the dataset you want to inspect.
2. Choose either Images view or Objects view.
3. In the top filter bar, open the Issues filter.
4. Select “Outliers” and apply the filter.
What You Can Do Next
Once the outliers are visible, you can choose how to proceed:
- Organize: Add the outliers to the Selected Items list for review or isolation.
- Export: Use the Export tool to send either:
  - Only the outliers, for data cleaning, or
  - All data except the outliers (by using exclusion logic), to keep your training dataset clean.
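The two export options are complementary subsets of the same dataset. As a rough sketch of the exclusion logic, assuming you already have a list of all item names and a list of the items flagged as outliers, the split could look like this. The function name and file names are hypothetical, and the code does not call Visual Layer's Export tool or any of its APIs.

```python
def split_by_outliers(all_items, outlier_items):
    """Return (outliers, clean), preserving the order of all_items."""
    outlier_set = set(outlier_items)  # set lookup keeps membership tests fast
    outliers = [item for item in all_items if item in outlier_set]
    clean = [item for item in all_items if item not in outlier_set]
    return outliers, clean

# Hypothetical file names, for illustration only.
all_files = ["img_001.jpg", "img_002.jpg", "img_003.jpg", "img_004.jpg"]
flagged = ["img_003.jpg"]

outliers, clean = split_by_outliers(all_files, flagged)
print(outliers)  # ['img_003.jpg']
print(clean)     # ['img_001.jpg', 'img_002.jpg', 'img_004.jpg']
```

Every item lands in exactly one of the two lists, so the clean set can be used for training while the outlier set is reviewed separately.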
Managing outliers is an essential step in building reliable, balanced models.