Import pre-existing annotation files into Visual Layer using supported formats including CSV, Parquet, JSON, COCO, and YOLO.
Annotations are metadata labels that describe and categorize images or objects within images. They structure visual datasets and enable search, analysis, and AI training workflows.
Annotations must be uploaded when creating a dataset and cannot be added later.
For full-image class labels, create a file named image_annotations.csv or image_annotations.parquet. Each row represents an image and its corresponding label.

Format:
| filename | label |
| --- | --- |
| IDX_DF_SIG21341_PlasmasNeg.png | IDX_DF |
| IDX_DF_ALM00324_PlasmasPos.png | IDX_DF |
| folder/IDX_RC_ALM04559_PlasmasNeg.png | IDX_RC |
Requirements:

- The filename column must contain relative paths
- The label column assigns a class to the entire image
- Multiple labels can be stored as a list: ['t-shirt', 'SKU12345']
- You may include a caption column for textual metadata (see the example below)
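For instance, an image_annotations.csv combining a multi-label entry and an optional caption column might look like this (filenames and captions are illustrative, not from a real dataset):

```csv
filename,label,caption
products/shirt_001.jpg,"['t-shirt', 'SKU12345']",A red cotton t-shirt on a white background
products/shirt_002.jpg,t-shirt,A blue t-shirt folded on a table
```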
For object-level annotations, create a file named object_annotations.csv or object_annotations.parquet. Each row represents a detected object with bounding box coordinates and class label.

Format:
| filename | col_x | row_y | width | height | label |
| --- | --- | --- | --- | --- | --- |
| Kitti/raw/training/image_2/006149.png | 0 | 240 | 135 | 133 | Car |
| Kitti/raw/training/image_2/006149.png | 608 | 169 | 59 | 43 | Car |
Requirements:

- col_x and row_y define the top-left corner of the bounding box
- width and height must be greater than zero (checked in the validation sketch below)
- Each row corresponds to a single object within an image
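Before uploading, you can sanity-check these requirements with a few lines of pandas. This is a minimal sketch, assuming the annotation file lives at the illustrative path "object_annotations.csv":

```python
import pandas as pd

# Load the annotation file to validate (illustrative path)
df = pd.read_csv("object_annotations.csv")

# Paths must be relative, not absolute
assert not df["filename"].str.startswith("/").any(), "absolute paths found"

# width and height must be greater than zero
assert (df["width"] > 0).all() and (df["height"] > 0).all(), "non-positive box size"

print(f"{len(df)} object rows look valid")
```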
Different annotation tools and datasets use different bounding box coordinate systems. This section compares common formats and shows you how to convert them to Visual Layer’s format.
Each format is optimized for different use cases. COCO and VOC are widely used in academic datasets, YOLO for real-time detection, TFRecord for TensorFlow-based training, and LabelMe for manual annotations.
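For example, VOC-style corner coordinates and YOLO's normalized center coordinates can be mapped to Visual Layer's top-left pixel format with a few lines of arithmetic. The helpers below are a minimal sketch (the function names are illustrative); the YOLO conversion assumes you know each image's pixel dimensions, since YOLO stores coordinates as fractions of image width and height:

```python
def voc_to_vl(xmin: float, ymin: float, xmax: float, ymax: float):
    """VOC corners (xmin, ymin, xmax, ymax) -> (col_x, row_y, width, height)."""
    return xmin, ymin, xmax - xmin, ymax - ymin

def yolo_to_vl(cx: float, cy: float, w: float, h: float,
               img_width: int, img_height: int):
    """YOLO normalized center box -> (col_x, row_y, width, height) in pixels."""
    box_w = w * img_width
    box_h = h * img_height
    col_x = cx * img_width - box_w / 2   # center x -> left edge
    row_y = cy * img_height - box_h / 2  # center y -> top edge
    return col_x, row_y, box_w, box_h

# COCO already uses (x, y, width, height) with a top-left origin,
# so its boxes map directly to col_x, row_y, width, height.
print(yolo_to_vl(0.5, 0.5, 0.2, 0.4, 1000, 500))  # (400.0, 150.0, 200.0, 200.0)
```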
Segmentation masks use polygon coordinates to define object boundaries. You can convert these to bounding boxes by finding the minimum and maximum x and y values across all polygon points.

Conversion:
```text
min_x = minimum x-coordinate from all points
max_x = maximum x-coordinate from all points
min_y = minimum y-coordinate from all points
max_y = maximum y-coordinate from all points

col_x  = min_x           (left edge)
row_y  = min_y           (top edge)
width  = max_x - min_x
height = max_y - min_y
```
Python Conversion Script:
```python
import json
import csv
from pathlib import Path
from typing import List, Tuple, Dict, Any

def extract_polygon_points(shape: Dict[str, Any]) -> List[Tuple[int, int]]:
    """Extract polygon points from a shape annotation."""
    if 'points' not in shape:
        return []
    points = []
    for point in shape['points']:
        if len(point) >= 2:
            x, y = int(point[0]), int(point[1])
            points.append((x, y))
    return points

def polygon_to_bbox(points: List[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Convert polygon points to bounding box coordinates."""
    if not points:
        return (0, 0, 0, 0)
    x_coords = [point[0] for point in points]
    y_coords = [point[1] for point in points]
    min_x = min(x_coords)
    max_x = max(x_coords)
    min_y = min(y_coords)
    max_y = max(y_coords)
    col_x = min_x
    row_y = min_y
    width = max_x - min_x
    height = max_y - min_y
    return (col_x, row_y, width, height)

def process_json_file(json_path: Path) -> List[Dict[str, Any]]:
    """Process a JSON annotation file and extract bounding boxes."""
    try:
        with open(json_path, 'r', encoding='utf-8') as f:
            data = json.load(f)
    except (json.JSONDecodeError, FileNotFoundError) as e:
        print(f"Error reading {json_path}: {e}")
        return []
    if 'shapes' not in data:
        print(f"No 'shapes' key found in {json_path}")
        return []
    image_filename = data.get('imagePath', json_path.stem + '.png')
    bboxes = []
    for shape in data['shapes']:
        if shape.get('shape_type') != 'polygon':
            continue
        label = shape.get('label', 'unknown')
        points = extract_polygon_points(shape)
        if not points:
            continue
        col_x, row_y, width, height = polygon_to_bbox(points)
        if width <= 0 or height <= 0:
            continue
        bboxes.append({
            'filename': image_filename,
            'col_x': col_x,
            'row_y': row_y,
            'width': width,
            'height': height,
            'label': label
        })
    return bboxes

# Paths to your folders
annotations_folder = "annotations"
output_csv = "segmentation_annotations.csv"

# Process all JSON files and convert to CSV
all_bboxes = []
json_files = list(Path(annotations_folder).glob('*.json'))
for json_file in json_files:
    print(f"Processing: {json_file.name}")
    bboxes = process_json_file(json_file)
    all_bboxes.extend(bboxes)
    print(f"  Found {len(bboxes)} bounding boxes")

# Write to CSV
fieldnames = ['filename', 'col_x', 'row_y', 'width', 'height', 'label']
with open(output_csv, 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(all_bboxes)

print(f"Conversion complete! Output saved to: {output_csv}")
print(f"Total bounding boxes: {len(all_bboxes)}")
```
If your images are organized in folders where each subfolder name represents the class label, you can generate annotation files automatically.

Folder Structure:
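A layout like the following is assumed (class names and filenames are illustrative):

```text
dataset/
├── cat/
│   ├── img001.jpg
│   └── img002.jpg
└── dog/
    ├── img003.jpg
    └── img004.jpg
```

The script below is a minimal sketch that walks such a tree and writes an image_annotations.csv, using each subfolder name as the label:

```python
import csv
from pathlib import Path

dataset_root = Path("dataset")  # illustrative path to your dataset root

rows = []
for class_dir in sorted(p for p in dataset_root.iterdir() if p.is_dir()):
    for image_path in sorted(class_dir.glob("*")):
        if image_path.suffix.lower() in {".jpg", ".jpeg", ".png"}:
            # Paths must stay relative to the dataset root
            rows.append({
                "filename": str(image_path.relative_to(dataset_root)),
                "label": class_dir.name,
            })

with open(dataset_root / "image_annotations.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["filename", "label"])
    writer.writeheader()
    writer.writerows(rows)

print(f"Wrote {len(rows)} rows")
```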
Once your annotation file is properly formatted, you can import it during dataset creation.

Steps:
1. Upload your annotation file during dataset creation.
2. Files can be uploaded from your local machine or an S3 bucket.
3. Ensure your file follows the required format and has the correct name.
The annotation file must be uploaded at the same time as your images. Visual Layer will process the annotations and make them available for filtering, search, and analysis.
Caption generation is one of the most time-consuming operations in Visual Layer’s dataset pipeline. When creating multiple datasets with the same images, you can extract and reuse caption data from previous pipeline runs.
Step 1: Create Initial Dataset

Create your first dataset with captioning enabled. After the pipeline completes, locate the parquet file:
```bash
# List recent datasets
ls -lt /.vl/tmp/

# Navigate to your dataset's metadata
cd /.vl/tmp/[your-dataset-id]/input/metadata/

# Verify the file exists
ls image_annotations.parquet
```
Step 2: Run the Extraction Script

Process the parquet file to extract captions:
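The extraction script itself is not reproduced here; a minimal pandas sketch might look like the following. It assumes the parquet file contains filename and caption columns, and that the output name matches the image_annotations_processed.parquet file that Step 3 copies:

```python
import pandas as pd

# Read the metadata produced by the previous pipeline run
df = pd.read_parquet("image_annotations.parquet")

# Keep only what is needed to reuse captions (column names are assumptions)
captions = df[["filename", "caption"]].dropna(subset=["caption"])

captions.to_parquet("image_annotations_processed.parquet", index=False)
print(f"Extracted {len(captions)} captions")
```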
Step 3: Copy to New Dataset Directory

Place the extracted parquet file in your new dataset directory alongside the images:
```bash
# Copy to new dataset location
cp image_annotations_processed.parquet /path/to/new-dataset/image_annotations.parquet
```
The parquet file must be named exactly image_annotations.parquet for Visual Layer to recognize it.
Step 4: Create New Dataset

Create your new dataset. Visual Layer will detect the existing image_annotations.parquet file and use the provided captions, completing much faster.
Filenames in the parquet file must be relative to the dataset directory location. Visual Layer looks for images relative to where the image_annotations.parquet file is located.

Correct - Relative Paths:
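For example (paths are illustrative), if image_annotations.parquet sits in /datasets/my-dataset/, filenames such as these resolve correctly:

```text
images/photo1.jpg        -> /datasets/my-dataset/images/photo1.jpg
subfolder/photo2.png     -> /datasets/my-dataset/subfolder/photo2.png
```

Absolute paths such as /datasets/my-dataset/images/photo1.jpg will not resolve.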