What This Helps With

Automate the process of uploading custom metadata from a folder containing images and their metadata files to Visual Layer datasets, with automatic field discovery and type detection.
This page provides an example Python script that demonstrates how to automate custom metadata uploads to Visual Layer. The script scans a folder for images and their corresponding metadata files, automatically discovers all fields, and handles the complete upload workflow.
This is an example script for a specific use case; you can modify the field detection logic and processing to suit your needs. For advanced scenarios, custom metadata deletion, or additional support, contact Visual Layer for assistance.

Working with DICOM Medical Imaging Data?

For medical imaging workflows, check out our specialized DICOM converter and upload scripts that handle DICOM-specific field types, date/time formats, and medical metadata standards.

View Complete Script Code

Access the full custom metadata upload Python script with complete implementation, ready to copy and use.

How the Script Works

The automation script demonstrates the complete Visual Layer custom metadata API workflow:
  1. Automatically exports dataset - Calls Visual Layer API to get filename-to-media_id mapping (no manual export needed!)
  2. Scans folder for metadata - Finds images and their corresponding .metadata.json files
  3. Discovers all fields - Auto-detects all metadata fields across your files (or uses manual specification)
  4. Analyzes field types - Intelligently detects data types (float, datetime, enum, multi-enum, link, string)
  5. Creates custom metadata fields - Creates fields in Visual Layer via API calls
  6. Uploads metadata values - Uploads values for each field using the proper JSON format
  7. Monitors upload progress - Waits for completion and reports status

Field Type Detection Logic

The script uses intelligent heuristics to detect field types in auto-discovery mode:
Field Type   | Detection Criteria                               | Examples
float        | Contains a decimal point or scientific notation  | 23.5, 1.2e-3, 0.95
datetime     | Parseable by pandas as a date/time               | 2024-01-15, 2024-01-15T10:30:00Z
link         | Starts with http://, https://, or ftp://         | https://example.com, ftp://server.com
multi-enum   | Array with ≤20 unique values across all files    | ["tag1", "tag2"]
enum         | ≤100 unique string values across all files       | "approved", "pending", "rejected"
string       | Default for everything else                      | Any text, nested objects (JSON-serialized)
The script samples up to 100 values per field for accurate enum detection. If a field has more than 100 unique values, it’s treated as a string type instead of enum.
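For illustration, a minimal, self-contained version of comparable detection logic is sketched below. It follows the thresholds from the table above; the helper names (detect_field_type, _looks_like_float) are illustrative and not the script's actual _auto_detect_field_type() implementation.
from typing import Any, List

import pandas as pd


def _looks_like_float(s: str) -> bool:
    # Float per the table above: contains a decimal point or scientific notation
    if "." not in s and "e" not in s.lower():
        return False
    try:
        float(s)
        return True
    except ValueError:
        return False


def detect_field_type(sample_values: List[Any]) -> str:
    # Guess a field type from up to 100 sampled values, mirroring the table above
    values = [v for v in sample_values if v is not None][:100]
    if not values:
        return "string"

    # Arrays become multi-enum when the combined unique values stay small (<= 20)
    if all(isinstance(v, list) for v in values):
        unique = {str(item) for v in values for item in v}
        return "multi-enum" if len(unique) <= 20 else "string"

    # JSON numbers map to float
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "float"

    strings = [str(v) for v in values]
    if all(_looks_like_float(s) for s in strings):
        return "float"

    # URLs map to link
    if all(s.startswith(("http://", "https://", "ftp://")) for s in strings):
        return "link"

    # Values parseable by pandas as dates/times map to datetime (skip purely numeric strings)
    if not all(s.isdigit() for s in strings):
        try:
            for s in strings:
                pd.Timestamp(s)
            return "datetime"
        except (ValueError, TypeError):
            pass

    # A small set of unique string values (<= 100) maps to enum; otherwise string
    return "enum" if len(set(strings)) <= 100 else "string"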

Key Features

Automatic Dataset Export:
  • No manual export step required - the script automatically fetches the filename-to-media_id mapping
Auto-Discovery Mode:
  • Automatically finds and processes ALL metadata fields without configuration
  • Ideal when you want to upload everything
Manual Override:
  • Use --field flag to specify only certain fields
  • Useful when you want selective upload or need to force specific types
Intelligent Type Detection:
  • Auto-detects 6 field types (float, datetime, enum, multi-enum, link, string)
Robust Error Handling:
  • Falls back to string type if conversion fails
  • Automatically truncates strings to Visual Layer’s 255-character limit
  • Handles nested objects by serializing them to JSON (a sketch of this conversion logic appears after this list)
Adding vs. Updating Metadata:
  • Adding new fields: If custom metadata has already been uploaded to a dataset, the script can still add new fields.
  • Updating existing fields: Updating existing metadata values is not currently supported; contact Visual Layer for assistance.
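The conversion-and-fallback behavior described under Robust Error Handling can be pictured with the following minimal sketch. The 255-character truncation and JSON serialization come from the list above; the function name (convert_value) and the exact output formats the Visual Layer API expects (for example, the datetime representation and the multi-enum payload shape) are assumptions here and are handled by the full script's _convert_value().
import json
from typing import Any

import pandas as pd

MAX_STRING_LENGTH = 255  # Visual Layer's limit for string fields


def convert_value(value: Any, field_type: str) -> Any:
    # Convert a raw metadata value to the requested type, falling back to string
    try:
        if field_type == "float":
            return float(value)
        if field_type == "datetime":
            # ISO 8601 output is an assumption for this sketch
            return pd.Timestamp(value).isoformat()
        if field_type == "multi-enum":
            items = value if isinstance(value, list) else [value]
            return [str(item) for item in items]
        # enum, link, and string values are sent as strings;
        # nested objects are serialized to JSON first
        if isinstance(value, (dict, list)):
            value = json.dumps(value)
        return str(value)[:MAX_STRING_LENGTH]
    except (ValueError, TypeError):
        # Conversion failed: fall back to a truncated string representation
        return str(value)[:MAX_STRING_LENGTH]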

Prerequisites

Before using the script, ensure you have:
  1. Visual Layer on-premises installation (script designed for no-authentication environments)
  2. Dataset in READY state in Visual Layer with images already uploaded
  3. Python environment with required packages: pandas, requests
  4. Folder structure with images and .metadata.json files

Expected Folder Structure

The script expects images with corresponding metadata files:
your-folder/
β”œβ”€β”€ image1.jpg
β”œβ”€β”€ image1.jpg.metadata.json
β”œβ”€β”€ image2.png
β”œβ”€β”€ image2.png.metadata.json
β”œβ”€β”€ image3.jpg
└── image3.jpg.metadata.json
The .metadata.json file naming convention can be customized in the script’s scan_folder() function to match your specific metadata storage format and naming patterns.
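A minimal version of this pairing logic, assuming the default <image>.metadata.json naming shown above, could look like this. The function names match those mentioned on this page, but the bodies are a simplified sketch (and the IMAGE_EXTENSIONS list is an assumption), not the script's actual implementation.
import json
from pathlib import Path
from typing import Dict

# Illustrative extension list; extend it to match your data
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".webp"}


def scan_folder(folder: str) -> Dict[str, Path]:
    # Return {image filename: path to its .metadata.json} for every matched pair
    pairs = {}
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        metadata_path = image_path.with_name(image_path.name + ".metadata.json")
        if metadata_path.exists():
            pairs[image_path.name] = metadata_path
    return pairs


def load_metadata_files(pairs: Dict[str, Path]) -> Dict[str, dict]:
    # Read each sidecar file into {image filename: metadata dict}
    return {name: json.loads(path.read_text()) for name, path in pairs.items()}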

Alternative: Single JSON File

If your metadata is stored in a single JSON file instead of one file per image, the script can be adapted to read it. Example JSON format:
{
  "image1.jpg": {
    "confidence": 0.95,
    "category": "approved"
  },
  "image2.jpg": {
    "confidence": 0.87,
    "category": "pending"
  }
}
To use this format: Add the load_single_json_metadata() method to the FolderMetadataProcessor class and call it instead of scan_folder() + load_metadata_files() in the workflow.
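As a minimal sketch, such a loader could simply read the combined file and keep well-formed entries. The real version is a method on FolderMetadataProcessor; the complete, ready-to-use code is linked below.
import json
from typing import Dict


def load_single_json_metadata(json_path: str) -> Dict[str, dict]:
    # Load {image filename: metadata dict} from one combined JSON file
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Keep only well-formed entries: filename mapped to a dict of field/value pairs
    return {name: fields for name, fields in data.items() if isinstance(fields, dict)}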

View Single JSON Implementation

See the complete code example for handling single JSON files at the bottom of the page.

Metadata File Format

Each .metadata.json file should contain field-value pairs:
{
  "annotated_at": "2024-01-15T10:30:00Z",
  "confidence": 0.95,
  "category": "approved",
  "reviewer": "john_doe",
  "tags": ["quality", "verified"],
  "notes": "High quality sample"
}
Supported Value Types:
  • Strings: "approved", "warehouse_a"
  • Numbers: 23.5, 0.95, 100
  • Dates/Times: "2024-01-15", "2024-01-15T10:30:00Z"
  • Arrays: ["tag1", "tag2"] (for multi-enum fields)
  • Objects: {"key": "value"} (serialized to JSON string)
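If you generate these sidecar files yourself, a snippet like the following writes a valid file next to its image, using the naming convention shown earlier (the path and field values are just examples):
import json
from pathlib import Path

image_path = Path("your-folder/image1.jpg")
metadata = {
    "annotated_at": "2024-01-15T10:30:00Z",
    "confidence": 0.95,
    "category": "approved",
    "tags": ["quality", "verified"],
}

# Follow the <image>.metadata.json convention, e.g. image1.jpg.metadata.json
sidecar = image_path.with_name(image_path.name + ".metadata.json")
sidecar.write_text(json.dumps(metadata, indent=2))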

Expected Result in Visual Layer

After running the script, your custom metadata fields will be visible in Visual Layer’s interface for each image:
[Screenshot: custom metadata displayed in Visual Layer, showing fields such as name, float value, list of tags, condition, and date]
These fields become searchable and filterable, enabling you to query and analyze your dataset based on your custom metadata.

Installation and Usage

Install Dependencies

pip install pandas requests

Run with Auto-Discovery (Default)

The script automatically discovers and uploads ALL metadata fields:
python upload_metadata_from_folder.py \
  --folder /path/to/folder \
  --dataset-id your-dataset-id \
  --base-url http://localhost:2080

Manual Field Specification

Specify only certain fields with explicit types:
python upload_metadata_from_folder.py \
  --folder /path/to/folder \
  --dataset-id your-dataset-id \
  --base-url http://localhost:2080 \
  --field annotated_at datetime \
  --field confidence float \
  --field category enum \
  --field tags multi-enum

Command Line Parameters

Parameter    | Required | Description                                                                | Default
--folder     | Yes      | Path to folder containing images and .metadata.json files                  | -
--dataset-id | Yes      | Visual Layer dataset identifier                                            | -
--base-url   | No       | Visual Layer installation URL                                              | http://localhost:2080
--field      | No       | Field name and type (repeatable). If omitted, all fields are auto-discovered. Types: string, float, datetime, enum, multi-enum, link | -

Example Commands

Auto-discover all fields (recommended):
python upload_metadata_from_folder.py \
  --folder ./annotated_images \
  --dataset-id abc123 \
  --base-url http://localhost:2080
Upload only specific fields:
python upload_metadata_from_folder.py \
  --folder ./data \
  --dataset-id my-dataset \
  --base-url http://visual-layer:3000 \
  --field quality_score float \
  --field status enum

Script Output

The script provides detailed progress information:

Auto-Discovery Mode Output

🚀 Starting Folder Metadata Upload Workflow
📁 Folder: /path/to/folder

📤 Exporting dataset to get media_id mappings...
   ✅ Exported 150 media items

🔍 Scanning folder: /path/to/folder
   ✅ Found 150 image + metadata pairs

📖 Loading metadata files...
   ✅ Loaded 150 metadata files

🤖 Auto-detection mode
🔍 Discovering and analyzing all fields...
   📋 Found 8 unique fields
      annotated_at: datetime
      confidence: float
      category: enum
      reviewer: string
      tags: multi-enum
      priority: enum
      notes: string
      source_url: link
   ✅ Analyzed 8 fields

🎯 Processing 8 custom fields...

🔄 Processing field: annotated_at (datetime)
🔧 Creating custom field: annotated_at (datetime)
   ✅ Created field with task ID: task-123
   📤 Uploading data for field: annotated_at
   📊 Processed 150 files, skipped 0
   ✅ Upload completed successfully
   ⏳ Waiting for task completion... (Press Ctrl+C to stop)
   ✅ Task completed successfully, 150 rows inserted
   ✅ Task completed after 3s
   ✅ Field annotated_at completed

🔄 Processing field: confidence (float)
...

🎉 Workflow completed!
✅ Successfully processed 8/8 fields

Manual Mode Output

🚀 Starting Folder Metadata Upload Workflow
📁 Folder: /path/to/folder

📤 Exporting dataset to get media_id mappings...
   ✅ Exported 150 media items

🔍 Scanning folder: /path/to/folder
   ✅ Found 150 image + metadata pairs

📖 Loading metadata files...
   ✅ Loaded 150 metadata files

🎯 User-specified fields: 2
🔍 Validating user-specified fields...
   Checking field: quality_score (float)
      ✅ Valid
   Checking field: status (enum)
      ✅ Valid
   ✅ Validated 2/2 fields

🎯 Processing 2 custom fields...
...

Customizing the Script

The script is designed to be modified for your specific use case. Here are common customizations:

Modifying Field Detection

You can customize how the script detects field types by modifying the _auto_detect_field_type() function:
def _auto_detect_field_type(self, field_name: str, sample_values: List[Any]) -> str:
    # Force specific fields to certain types
    if field_name in ['product_id', 'batch_number']:
        return 'string'  # Never treat as enum

    if field_name.endswith('_url'):
        return 'link'  # Force URL fields

    # Adjust the enum threshold (unique_values is built from sample_values
    # earlier in the full function; this excerpt omits that part)
    if len(unique_values) <= 50:  # Changed from 100
        return 'enum'

    # ... rest of detection logic

Skipping Certain Fields

Modify the discover_and_analyze_fields() function to skip fields:
def discover_and_analyze_fields(self, metadata_by_file):
    # Skip internal fields
    skip_fields = {'_internal_id', '_batch_number', 'temp_field'}

    for field_name in sorted(all_fields):
        if field_name in skip_fields:
            continue
        # ... rest of logic

Custom Value Conversion

Modify the _convert_value() function for custom data transformations:
def _convert_value(self, value: Any, field_name: str, field_type: str) -> Any:
    # Custom preprocessing
    if field_name == 'temperature':
        # Convert Fahrenheit to Celsius
        return (float(value) - 32) * 5/9

    if field_name.startswith('date_'):
        # Custom date format handling
        return custom_date_parser(value)

    # ... rest of conversion logic

Troubleshooting

Common Issues

No image + metadata pairs found:
  • Verify your folder contains images with corresponding .metadata.json files
  • Check filename format: image.jpg should have image.jpg.metadata.json
  • Ensure proper file permissions
Export dataset failed:
  • Confirm your Visual Layer instance is accessible at the specified URL
  • Check that the dataset ID is correct and the dataset is in READY state
  • Verify the /api/v1/dataset/{dataset_id}/export_media_id endpoint is accessible (a quick connectivity check is sketched after this list)
Field validation failed:
  • Check that specified field names exist in your metadata files
  • Verify field types match your data (e.g., datetime fields have valid dates)
  • Confirm that enum fields do not exceed 20 unique values (API limit)
Upload timeouts:
  • Large datasets may take time to process
  • The script polls indefinitely - you can interrupt with Ctrl+C if needed
  • Check Visual Layer logs for any processing errors
Type conversion errors:
  • Review the automatic field type detection results
  • Use manual mode (--field) to specify correct types
  • Modify the _convert_value() function for custom data formats
String truncation warnings:
  • Visual Layer has a 255-character limit for string fields
  • The script automatically truncates - check if this affects your data
  • Consider using link type for long URLs
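For the "Export dataset failed" case above, a quick connectivity check against the export endpoint documented earlier can help narrow things down. This sketch assumes the endpoint accepts a plain GET with no authentication (as in the on-premises setups this script targets); it only verifies that the endpoint responds and leaves response parsing to the script.
import requests

base_url = "http://localhost:2080"
dataset_id = "your-dataset-id"

url = f"{base_url}/api/v1/dataset/{dataset_id}/export_media_id"
response = requests.get(url, timeout=30)
print(f"{url} -> HTTP {response.status_code}")
# A 2xx status suggests the endpoint is reachable; 4xx/5xx usually points at a
# wrong dataset ID, a dataset that is not in READY state, or a Visual Layer
# instance that is not reachable at base_url.
response.raise_for_status()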

Getting Help

For advanced use cases, custom metadata deletion, or technical support:

Contact Visual Layer Support

Reach out to Visual Layer’s support team for assistance with complex metadata workflows, custom field requirements, or troubleshooting upload issues.

Advantages Over Manual Upload

Automatic Export:
  • ❌ Manual: Export dataset → download metadata.json → run script
  • ✅ Automatic: Just run the script; it exports automatically
Field Discovery:
  • ❌ Manual: List all fields and types in configuration
  • ✅ Auto-discovery: Script finds all fields automatically
Type Detection:
  • ❌ Manual: Guess field types or trial-and-error
  • ✅ Intelligent: Script detects types from actual data
Error Handling:
  • ❌ Manual: Script fails on type conversion errors
  • ✅ Robust: Falls back to string, truncates long values
Flexibility:
  • ✅ Both modes: Auto-discovery for convenience, manual for control