How This Helps

Identify duplicate media in your dataset using internal media IDs. This helps streamline cleanup, reduce redundancy, and improve data quality before training or export.


Step 1: Retrieve Internal Media ID

To begin, retrieve the internal media ID for a file based on its original_media_uri. URL-encode the URI before passing it as the query parameter.

GET /api/v1/dataset/{dataset_id}/search_media_metadata/id?original_media_uri={url_encoded_original_media_uri}
Headers: Authorization: Bearer <jwt>

Example

curl -H "Authorization: Bearer <jwt>" \
     "https://app.visual-layer.com/api/v1/dataset/{dataset_id}/search_media_metadata/id?original_media_uri={url_encoded_original_media_uri}"

Response:

  • 200 OK: Returns the internal media ID as text
  • 404: Media not found in the dataset
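
Because the URI travels as a query parameter, characters such as ":" and "/" need percent-encoding. The sketch below shows one way to build the Step 1 request URL in Python. The helper name build_media_id_url and the sample S3 URI are hypothetical, and the path is taken from the endpoint line above, so check it against your deployment.

```python
from urllib.parse import quote

# Host taken from the curl example in this guide.
BASE_URL = "https://app.visual-layer.com"


def build_media_id_url(dataset_id: str, original_media_uri: str) -> str:
    """Build the Step 1 URL (hypothetical helper, not part of any SDK)."""
    # safe="" percent-encodes every reserved character, including "/" and ":".
    encoded_uri = quote(original_media_uri, safe="")
    return (
        f"{BASE_URL}/api/v1/dataset/{dataset_id}"
        f"/search_media_metadata/id?original_media_uri={encoded_uri}"
    )


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(build_media_id_url("my-dataset", "s3://bucket/images/cat 1.jpg"))
```

Passing safe="" matters here: by default quote() leaves "/" unencoded, which would leave the slashes in the URI unescaped inside the query string.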

Step 2: Retrieve Duplicates Using Media ID

Once you have the media ID, use this endpoint to find duplicates:

GET /api/v1/dataset/{dataset_id}/media/{media_id}/duplicates
Headers: Authorization: Bearer <jwt>

Example

curl -H "Authorization: Bearer <jwt>" \
     https://app.visual-layer.com/api/v1/dataset/{dataset_id}/media/{media_id}/duplicates

Response:

Returns a JSON array containing zero or more duplicate media IDs. An empty array means no duplicates were found.
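
The two steps can be chained in a small script. The following is a minimal sketch using only the Python standard library. The function names http_get and find_duplicates are hypothetical, the Step 1 path follows the endpoint line in Step 1, and the code assumes the media ID comes back as plain text (possibly wrapped in quotes). Since the feature is still under development, treat the response handling as an assumption too.

```python
import json
import urllib.request
from urllib.error import HTTPError
from urllib.parse import quote

BASE_URL = "https://app.visual-layer.com"


def http_get(url, jwt):
    """Send an authenticated GET; return the body as text, or None on 404."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {jwt}"}
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.read().decode("utf-8")
    except HTTPError as error:
        if error.code == 404:
            return None
        raise


def find_duplicates(dataset_id, original_media_uri, jwt, get=http_get):
    """Return a list of duplicate media IDs, or None if the media is not found."""
    api = f"{BASE_URL}/api/v1/dataset/{dataset_id}"

    # Step 1: look up the internal media ID from the original URI.
    encoded_uri = quote(original_media_uri, safe="")
    body = get(f"{api}/search_media_metadata/id?original_media_uri={encoded_uri}", jwt)
    if body is None:
        return None
    # Assumption: the ID is returned as text; trim whitespace and any quotes.
    media_id = body.strip().strip('"')

    # Step 2: fetch the duplicates for that media ID.
    body = get(f"{api}/media/{media_id}/duplicates", jwt)
    if body is None:
        return None
    return json.loads(body)
```

A call such as find_duplicates("my-dataset", "s3://bucket/images/cat.jpg", jwt) would then return a Python list, which is empty when the media has no duplicates. The get parameter exists only so the HTTP layer can be swapped out, for example in tests.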


This feature is currently under development and subject to change.