How This Helps
Identify duplicate media in your dataset using internal media IDs. This helps streamline cleanup, reduce redundancy, and improve data quality before training or export.
To begin, retrieve the internal media ID for a file based on its original_media_uri.
GET /api/v1/dataset/{dataset_id}/search_media_metadata/id?original_media_uri={url_encoded_original_media_uri}
Headers: Authorization: Bearer <jwt>
Example
curl -H "Authorization: Bearer <jwt>" \
"https://app.visual-layer.com/api/v1/dataset/{dataset_id}/media/id?original_media_uri={url_encoded_original_media_uri}"
Response:
✅ 200 OK: Returns the internal media ID as text
❌ 404: Media not found in the dataset
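The original_media_uri value has to be URL-encoded before it goes into the query string. The following is a minimal sketch of that step, not an official client. The helper names (urlencode, media_id_url, get_media_id) are hypothetical, the token is assumed to be in a JWT environment variable, and the encoder assumes ASCII input. The URL path is the one used in the curl example above.

```shell
#!/bin/sh
# Hypothetical helpers: the function names and the JWT variable are assumptions.

# Percent-encode a string for use as a query parameter value.
# Assumes ASCII input. The ( ) body runs in a subshell so variables do not leak.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;             # unreserved characters stay as-is
      *) out=$out$(printf '%%%02X' "'$c") ;;     # everything else becomes %XX
    esac
    s=$rest
  done
  printf '%s' "$out"
)

# Build the lookup URL. $1 = dataset ID, $2 = raw (unencoded) original_media_uri.
media_id_url() {
  printf 'https://app.visual-layer.com/api/v1/dataset/%s/media/id?original_media_uri=%s' \
    "$1" "$(urlencode "$2")"
}

# Fetch the internal media ID. --fail makes curl exit non-zero on a 404,
# so a missing file can be detected from the exit status.
get_media_id() {
  curl --fail --silent --show-error \
    -H "Authorization: Bearer ${JWT}" \
    "$(media_id_url "$1" "$2")"
}
```

A call such as get_media_id "$DATASET_ID" "s3://my-bucket/images/cat 1.jpg" would then request the ID with the URI encoded as s3%3A%2F%2Fmy-bucket%2Fimages%2Fcat%201.jpg (the bucket path here is only an illustration).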
Once you have the media ID, use this endpoint to find duplicates:
GET /api/v1/dataset/{dataset_id}/media/{media_id}/duplicates
Headers: Authorization: Bearer <jwt>
Example
curl -H "Authorization: Bearer <jwt>" \
"https://app.visual-layer.com/api/v1/dataset/{dataset_id}/media/{media_id}/duplicates"
Response:
Returns a JSON array containing 0 or more duplicate media IDs.
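The duplicates request can be wrapped in a small script in the same way. This is a hedged sketch only: duplicates_url and list_duplicates are hypothetical helper names, and the token is assumed to be in a JWT environment variable.

```shell
#!/bin/sh
# Hypothetical helpers: the function names and the JWT variable are assumptions.

# Build the duplicates URL. $1 = dataset ID, $2 = internal media ID.
duplicates_url() {
  printf 'https://app.visual-layer.com/api/v1/dataset/%s/media/%s/duplicates' "$1" "$2"
}

# Request the duplicates of one media item and print the raw JSON array.
# --fail makes curl exit non-zero on HTTP error responses.
list_duplicates() {
  curl --fail --silent --show-error \
    -H "Authorization: Bearer ${JWT}" \
    "$(duplicates_url "$1" "$2")"
}
```

If jq is installed, list_duplicates "$DATASET_ID" "$MEDIA_ID" | jq -r '.[]' prints one duplicate ID per line. An empty array prints nothing, which corresponds to the case of zero duplicates.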
This feature is currently under development and subject to change.