This tutorial shows you how to work with metadata.json files exported from Visual Layer. You’ll learn how to extract data using the API, parse the exported JSON files, and convert them to different formats for analysis.
Replace `url`, `dataset_id`, and `file_name` with your own values:
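As a rough sketch of what the download step could look like in Python: the helper names `build_export_url` and `download_export` are hypothetical, the bearer-token header is an assumption, and `path_template` is deliberately left for you to fill in from the API reference of your Visual Layer deployment, since the export route is not given here. The response is saved byte for byte, so it works whether the endpoint returns plain JSON or an archive.

```python
import urllib.request
from pathlib import Path


def build_export_url(url, dataset_id, path_template):
    """Join the base URL with an export path that contains a {dataset_id} slot.

    path_template is NOT a documented route; copy the real one from your
    API reference, e.g. something shaped like "/.../{dataset_id}/...".
    """
    path = path_template.lstrip("/").format(dataset_id=dataset_id)
    return url.rstrip("/") + "/" + path


def download_export(url, dataset_id, file_name, path_template, token=None):
    """Download the export and save it as file_name; returns the saved path."""
    request = urllib.request.Request(build_export_url(url, dataset_id, path_template))
    if token:
        # Assumed bearer-token auth; adjust to whatever your deployment uses.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        Path(file_name).write_bytes(response.read())
    return Path(file_name)
```

Only `download_export` touches the network; `build_export_url` can be used on its own to check the final URL before sending a request.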
After the export completes, you will have a `metadata.json` file. Let’s learn how to parse and analyze it.
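A minimal loading sketch using only the standard library. It assumes the file has a top-level `info` object (dataset-level fields such as `schema_version` and `total_media_items`) and a `media_items` list (one entry per image); those two key names are an assumption based on the tables in this tutorial, so adjust them if your export differs. `load_export` and `preview` are illustrative helper names.

```python
import json


def load_export(path):
    """Return (info, media_items) from an exported metadata.json file."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumed layout: {"info": {...}, "media_items": [{...}, ...]}
    info = data.get("info", {})
    media_items = data.get("media_items", [])
    return info, media_items


def preview(media_items, columns, limit=3):
    """Return the first few items as flat rows restricted to the given columns."""
    return [{c: item.get(c) for c in columns} for item in media_items[:limit]]
```

The rows returned by `preview` are plain dictionaries, so they can be printed directly or passed to a table library such as pandas if you prefer a DataFrame view.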
schema_version | dataset | description | dataset_url | export_time | dataset_creation_time | exported_by | total_media_items |
---|---|---|---|---|---|---|---|
1.1 | food101 | Exported from food101 at Visual Layer | Link | 2025-02-10T14:29:53.740569 | 2024-12-05T05:55:27.725598 | Dickson Neoh | 118 |
media_id | media_type | file_name | file_path | file_size | uniqueness_score | height | width | url | cluster_id | metadata_items |
---|---|---|---|---|---|---|---|---|---|---|
d5227901-22c9-4744-a264-407d9671aa4a | image | 548938.jpg | 548938.jpg | 32.00KB | 0.004178 | 512 | 512 | Link | fbcad8ef-d863-46c9-83b7-1a3bd85e2e2b | [{'type': 'issue', 'properties': {'issue_type'… |
2546c70a-e0a4-4bfb-ac59-e2895bb96456 | image | 548231.jpg | 548231.jpg | 32.00KB | 0.006069 | 512 | 512 | Link | fbcad8ef-d863-46c9-83b7-1a3bd85e2e2b | [{'type': 'issue', 'properties': {'issue_type'… |
45c226e0-daba-4ca8-8eac-fec9d490ea36 | image | 835953.jpg | 835953.jpg | 28.18KB | 0.003515 | 512 | 384 | Link | 5bc63415-76d8-49ec-975a-9a021bf98770 | [{'type': 'issue', 'properties': {'issue_type'… |
The `metadata_items` column contains a list of issues for each image. We can filter for images with duplicate issues above a certain confidence threshold:
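One way to write that filter, as a sketch. It assumes each entry in `metadata_items` looks like `{"type": "issue", "properties": {"issue_type": ..., "confidence": ...}}`, that duplicate issues carry the value `"duplicates"` in `issue_type`, and that 0.8 is a sensible starting threshold. All three are assumptions, so both the issue name and the threshold are parameters.

```python
def find_issue_images(media_items, issue_type="duplicates", min_confidence=0.8):
    """Return one row per image that has a matching issue at or above the threshold."""
    matches = []
    for item in media_items:
        for meta in item.get("metadata_items") or []:
            props = meta.get("properties", {})
            confidence = props.get("confidence") or 0
            if (
                meta.get("type") == "issue"
                and props.get("issue_type") == issue_type  # assumed value name
                and confidence >= min_confidence
            ):
                matches.append({"file_name": item.get("file_name"),
                                "confidence": confidence})
                break  # one row per image is enough
    return matches
```

Passing a different `issue_type` reuses the same filter for other issue categories present in your export.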
Filename | X | Y | W | H | Label | User Tag |
---|---|---|---|---|---|---|
000000046252.jpg | 148 | 177 | 127 | 188 | Person | test-tags |
000000046252.jpg | 192 | 194 | 75 | 65 | Shirt | test-tags |
000000012639.jpg | 28 | 368 | 80 | 118 | Vest | test-tags |
Use the `cluster_id` field to find similar frames:
video_a | video_b | average_similarity | number_of_similar_frames |
---|---|---|---|
video1.mp4 | video2.mp4 | 0.9850 | 15 |
video3.mp4 | video4.mp4 | 0.9720 | 8 |
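A cluster-based comparison can be sketched as follows: group frames by `cluster_id`, then count, for every pair of videos, how many clusters contain frames from both. How a frame maps back to its source video depends on your export, so `video_of` is a hypothetical callable you supply (for example, one that reads the parent folder from `file_path`). Here the frame counts include only frames that sit in shared clusters, which is an interpretation rather than a documented definition.

```python
from collections import defaultdict
from itertools import combinations


def shared_cluster_summary(media_items, video_of):
    """Summarize, per video pair, the clusters and frames they have in common."""
    # cluster_id -> video name -> number of frames from that video in the cluster
    frames = defaultdict(lambda: defaultdict(int))
    for item in media_items:
        cluster_id = item.get("cluster_id")
        if cluster_id is not None:
            frames[cluster_id][video_of(item)] += 1

    pairs = {}
    for per_video in frames.values():
        # Every pair of videos present in the same cluster shares that cluster.
        for a, b in combinations(sorted(per_video), 2):
            row = pairs.setdefault((a, b), {
                "video_a": a, "video_b": b, "shared_clusters": 0,
                "frames_from_video_a": 0, "frames_from_video_b": 0,
            })
            row["shared_clusters"] += 1
            row["frames_from_video_a"] += per_video[a]
            row["frames_from_video_b"] += per_video[b]
    return sorted(pairs.values(), key=lambda r: r["shared_clusters"], reverse=True)
```

If frames are stored under one folder per video, `video_of=lambda item: item["file_path"].split("/")[0]` would be one possible mapping.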
video_a | video_b | shared_clusters | frames_from_video_a | frames_from_video_b |
---|---|---|---|---|
video1.mp4 | video2.mp4 | 5 | 12 | 18 |
- `issues_description`: The suggested correct label
- `confidence`: How confident the system is about the mislabel (0-1)
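These two fields can be collected into a flat table and written out as CSV for review. The sketch below assumes they live under each issue's `properties` object and that mislabel issues use the value `"mislabels"` in `issue_type`; both are assumptions, and the helper names and output column names are illustrative.

```python
import csv
import io


def mislabel_rows(media_items, issue_type="mislabels"):
    """Collect suggested labels for mislabel issues, highest confidence first."""
    rows = []
    for item in media_items:
        for meta in item.get("metadata_items") or []:
            props = meta.get("properties", {})
            if meta.get("type") == "issue" and props.get("issue_type") == issue_type:
                rows.append({
                    "file_name": item.get("file_name"),
                    "suggested_label": props.get("issues_description"),
                    "confidence": props.get("confidence"),
                })
    return sorted(rows, key=lambda r: r["confidence"] or 0, reverse=True)


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["file_name", "suggested_label", "confidence"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Writing the returned text to a file, for example with `Path("mislabels.csv").write_text(...)`, gives a spreadsheet-friendly list of the labels most likely to need correction.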