# Import Annotations

> Import pre-existing annotation files into Visual Layer using supported formats including CSV, Parquet, JSON, COCO, and YOLO.

Annotations are metadata labels that describe and categorize images or objects within images. They structure visual datasets and enable search, analysis, and AI training workflows.

Annotations must be uploaded when creating a dataset and cannot be added later.

## Why Use Annotations

Annotations improve data organization and model accuracy by providing structured labels for images and objects. They enable:

* Efficient data retrieval through filtering and search.
* Enhanced search capabilities within large datasets.
* Object detection and classification tasks.
* Improved model accuracy with high-quality labeled data.

## Common Annotation Types

Visual Layer supports two primary annotation types:

* **Image Annotations**: Assign class labels to entire images and categorize datasets by content.
* **Object Annotations**: Label individual objects within images using bounding boxes and improve model accuracy.

## Supported Formats

Visual Layer accepts annotation files in the following formats:

**File Formats:**

* Parquet and CSV for structured image and object annotations
* JSON for COCO-format annotations
* YOLO format with conversion
* Segmentation masks with conversion
* Custom folder-based structures with conversion

**File Naming Requirements:**

Your annotation file must be named exactly as one of the following:

* `annotations.json`
* `image_annotations.csv`
* `object_annotations.csv`
* `image_annotations.parquet`
* `object_annotations.parquet`

Visual Layer cannot process files that do not match these exact names.

## Preparing Annotation Files

This section explains how to structure your annotation files for Visual Layer import.
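Before preparing the contents, it can help to confirm that the file name is one Visual Layer accepts. The following is a minimal sketch, not part of Visual Layer itself: the helper name `is_accepted_annotation_name` is hypothetical, and the accepted names are copied from the File Naming Requirements list.

```python
from pathlib import Path

# Accepted names, copied from the File Naming Requirements list.
ACCEPTED_NAMES = {
    "annotations.json",
    "image_annotations.csv",
    "object_annotations.csv",
    "image_annotations.parquet",
    "object_annotations.parquet",
}


def is_accepted_annotation_name(path: str) -> bool:
    """Return True if the file name (ignoring its directory) matches exactly.

    Hypothetical helper for illustration; the comparison is case-sensitive.
    """
    return Path(path).name in ACCEPTED_NAMES


# Example paths are made up for the demonstration.
for candidate in ["data/object_annotations.csv", "data/Annotations.json", "labels.csv"]:
    status = "ok" if is_accepted_annotation_name(candidate) else "rename before upload"
    print(f"{candidate}: {status}")
```

The check looks only at the final path component, so the file can live in any directory.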
### Image Annotations

For full-image class labels, create a file named `image_annotations.csv` or `image_annotations.parquet`. Each row represents an image and its corresponding label.

**Format:**

| filename                                 | label   |
| ---------------------------------------- | ------- |
| IDX\_DF\_SIG21341\_PlasmasNeg.png        | IDX\_DF |
| IDX\_DF\_ALM00324\_PlasmasPos.png        | IDX\_DF |
| folder/IDX\_RC\_ALM04559\_PlasmasNeg.png | IDX\_RC |

**Requirements:**

* The `filename` column must contain relative paths
* The `label` column assigns a class to the entire image
* Multiple labels can be stored as a list: `['t-shirt', 'SKU12345']`
* You may include a `caption` column for textual metadata

**Example with Multiple Labels:**

| filename        | label                    |
| --------------- | ------------------------ |
| cool-tshirt.png | \["t-shirt", "SKU12345"] |
| cool-pants.jpg  | \["pants", "SKU231312"]  |

### Object Annotations

For object-level annotations, create a file named `object_annotations.csv` or `object_annotations.parquet`. Each row represents a detected object with bounding box coordinates and class label.

**Format:**

| filename                               | col\_x | row\_y | width | height | label |
| -------------------------------------- | ------ | ------ | ----- | ------ | ----- |
| Kitti/raw/training/image\_2/006149.png | 0      | 240    | 135   | 133    | Car   |
| Kitti/raw/training/image\_2/006149.png | 608    | 169    | 59    | 43     | Car   |

**Requirements:**

* `col_x` and `row_y` define the top-left corner of the bounding box
* `width` and `height` must be greater than zero
* Each row corresponds to a single object within an image

### JSON Annotations

Visual Layer supports COCO-format JSON annotations. Ensure the file is named `annotations.json`.
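A quick structural check before upload can catch dangling references and empty boxes in a COCO file. The sketch below is illustrative only: `check_coco` is a hypothetical helper, it relies on the standard COCO layout (`images`, `categories`, `annotations`, with `bbox` as `[x, y, width, height]`), and it skips annotations without a `bbox` on the assumption that they are image-level labels.

```python
import json
from typing import Any, Dict, List


def check_coco(data: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a COCO-style dict (hypothetical helper)."""
    problems: List[str] = []
    image_ids = {img["id"] for img in data.get("images", [])}
    category_ids = {cat["id"] for cat in data.get("categories", [])}

    for ann in data.get("annotations", []):
        ann_id = ann.get("id")
        if ann.get("image_id") not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id {ann.get('image_id')}")
        if ann.get("category_id") not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id {ann.get('category_id')}")

        bbox = ann.get("bbox")
        if bbox is None:
            continue  # assumed to be an image-level label: no box to validate
        if len(bbox) != 4:
            problems.append(f"annotation {ann_id}: bbox must have 4 values")
            continue
        _, _, width, height = bbox
        if width <= 0 or height <= 0:
            problems.append(f"annotation {ann_id}: width and height must be greater than zero")
    return problems


# Made-up sample: one valid box, one zero-width box, one dangling image reference.
sample = json.loads("""
{
  "images": [{"id": 1, "width": 640, "height": 480, "file_name": "image1.jpg"}],
  "categories": [{"id": 1, "name": "cat"}],
  "annotations": [
    {"id": 1, "image_id": 1, "category_id": 1, "bbox": [100, 100, 200, 200]},
    {"id": 2, "image_id": 1, "category_id": 1, "bbox": [10, 10, 0, 50]},
    {"id": 3, "image_id": 9, "category_id": 1}
  ]
}
""")

for problem in check_coco(sample):
    print(problem)
```

To check a real file, load it with `json.load` and pass the resulting dict to the helper.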
**Example Format:**

```json
{
  "images": [
    { "id": 1, "width": 640, "height": 480, "file_name": "image1.jpg" },
    { "id": 2, "width": 800, "height": 600, "file_name": "image2.jpg" }
  ],
  "categories": [
    { "id": 1, "name": "cat" },
    { "id": 2, "name": "dog" },
    { "id": 3, "name": "t-rex" }
  ],
  "annotations": [
    { "id": 1, "image_id": 1, "category_id": 1, "bbox": [100, 100, 200, 200] },
    { "id": 2, "image_id": 2, "category_id": 2, "bbox": [50, 50, 150, 150] },
    { "id": 3, "image_id": 1, "category_id": 3 },
    { "id": 4, "image_id": 2, "category_id": 3 }
  ]
}
```

**Requirements:**

* Bounding boxes follow the format `[col_x, row_y, width, height]`
* `col_x` and `row_y` define the top-left corner
* `width` and `height` must be greater than zero
* Remove all comments before uploading

## Understanding Bounding Box Formats

Different annotation tools and datasets use different bounding box coordinate systems. This section compares common formats and shows you how to convert them to Visual Layer's format.

### Visual Layer Format

Visual Layer uses CSV format with the following structure:

| Column Name | Description                                                   |
| ----------- | ------------------------------------------------------------- |
| `filename`  | The name of the image file containing the object              |
| `col_x`     | The x-coordinate (horizontal position) of the top-left corner |
| `row_y`     | The y-coordinate (vertical position) of the top-left corner   |
| `width`     | The width of the bounding box, extending from `col_x`         |
| `height`    | The height of the bounding box, extending from `row_y`        |
| `label`     | The class or category of the detected object                  |

**Example:**

```csv
filename,col_x,row_y,width,height,label
image1.jpg,50,30,200,150,car
image2.jpg,120,60,80,100,person
image3.jpg,15,10,50,70,dog
```

### Common Format Comparison

| Format           | Representation                        | Normalized? | File Type   |
| ---------------- | ------------------------------------- | ----------- | ----------- |
| **Visual Layer** | `[col_x, row_y, width, height]`       | No          | `.csv`      |
| **COCO**         | `[x_min, y_min, width, height]`       | No          | `.json`     |
| **VOC**          | `(x_min, y_min, x_max, y_max)`        | No          | `.xml`      |
| **YOLO**         | `[x_center, y_center, width, height]` | Yes         | `.txt`      |
| **TFRecord**     | `(y_min, x_min, y_max, x_max)`        | Yes         | `.tfrecord` |
| **LabelMe**      | `[[x_min, y_min], [x_max, y_max]]`    | No          | `.json`     |

Each format is optimized for different use cases. COCO and VOC are widely used in academic datasets, YOLO for real-time detection, TFRecord for TensorFlow-based training, and LabelMe for manual annotations.

## Converting from Other Formats

If your annotations use a different format, you can convert them to Visual Layer's format using the scripts and guides below.

### Converting YOLO Annotations

YOLO format stores annotations as normalized center coordinates. Each text file corresponds to an image and contains lines in the format:

```
class_id x_center y_center width height
```

**Example:**

```
0 0.5869140625 0.2412109375 0.021484375 0.044921875
0 0.8974609375 0.185546875 0.044921875 0.1015625
```

**Conversion Process:**

The script converts normalized coordinates to absolute pixel values and calculates top-left coordinates:

```
top_left_x = (center_x - width/2) * image_width
top_left_y = (center_y - height/2) * image_height
```

**Python Conversion Script:**

```python
import os
import csv
import cv2

# Paths to your folders
labels_folder = "output/labels"
images_folder = "output/images"
# Visual Layer requires this exact name for object annotations
output_csv = "object_annotations.csv"

# Open CSV for writing
with open(output_csv, "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["filename", "col_x", "row_y", "width", "height", "label"])

    # Process each label file
    for label_file in os.listdir(labels_folder):
        if label_file.endswith(".txt"):
            image_base = os.path.splitext(label_file)[0]
            # Assumes PNG images; change the extension to match your dataset
            image_filename = image_base + ".png"
            image_path = os.path.join(images_folder, image_filename)
            label_file_path = os.path.join(labels_folder, label_file)

            if not os.path.exists(image_path):
                print(f"Image not found for {label_file}")
                continue

            # Read image to get dimensions
            image = cv2.imread(image_path)
            if image is None:
                print(f"Failed to load image {image_path}")
                continue
            h_img, w_img = image.shape[:2]

            # Open and read the label file
            with open(label_file_path, "r") as f:
                lines = f.readlines()

            for line in lines:
                parts = line.strip().split()
                if len(parts) != 5:
                    continue
                class_id, norm_cx, norm_cy, norm_w, norm_h = map(float, parts)

                # Convert normalized values to absolute pixel values
                abs_cx = norm_cx * w_img
                abs_cy = norm_cy * h_img
                abs_w = norm_w * w_img
                abs_h = norm_h * h_img

                # Calculate the top-left corner coordinates
                top_left_x = abs_cx - abs_w / 2
                top_left_y = abs_cy - abs_h / 2

                # Example mapping: class 0 is "ship"; replace with your own class names
                label_str = "ship" if class_id == 0 else str(int(class_id))

                writer.writerow([
                    image_filename,
                    int(top_left_x),
                    int(top_left_y),
                    int(abs_w),
                    int(abs_h),
                    label_str
                ])
```

### Converting Segmentation Masks

Segmentation masks use polygon coordinates to define object boundaries. You can convert these to bounding boxes by finding the minimum and maximum x and y values across all polygon points.
**Example Segmentation Mask Format:**

```json
{
  "version": "4.5.6",
  "shapes": [
    {
      "label": "QSBD",
      "points": [
        [64, 10],
        [64, 15],
        [67, 15],
        [68, 14],
        [68, 10]
      ],
      "shape_type": "polygon"
    }
  ],
  "imagePath": "example_image.png"
}
```

**Conversion Logic:**

```
min_x = minimum x-coordinate from all points
max_x = maximum x-coordinate from all points
min_y = minimum y-coordinate from all points
max_y = maximum y-coordinate from all points

col_x  = min_x (left edge)
row_y  = min_y (top edge)
width  = max_x - min_x
height = max_y - min_y
```

**Python Conversion Script:**

```python
import json
import csv
from pathlib import Path
from typing import List, Tuple, Dict, Any


def extract_polygon_points(shape: Dict[str, Any]) -> List[Tuple[int, int]]:
    """Extract polygon points from a shape annotation."""
    if 'points' not in shape:
        return []
    points = []
    for point in shape['points']:
        if len(point) >= 2:
            # Coordinates are truncated to whole pixels
            x, y = int(point[0]), int(point[1])
            points.append((x, y))
    return points


def polygon_to_bbox(points: List[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Convert polygon points to bounding box coordinates."""
    if not points:
        return (0, 0, 0, 0)
    x_coords = [point[0] for point in points]
    y_coords = [point[1] for point in points]
    min_x = min(x_coords)
    max_x = max(x_coords)
    min_y = min(y_coords)
    max_y = max(y_coords)
    col_x = min_x
    row_y = min_y
    width = max_x - min_x
    height = max_y - min_y
    return (col_x, row_y, width, height)


def process_json_file(json_path: Path) -> List[Dict[str, Any]]:
    """Process a JSON annotation file and extract bounding boxes."""
    try:
        with open(json_path, 'r', encoding='utf-8') as f:
            data = json.load(f)
    except (json.JSONDecodeError, FileNotFoundError) as e:
        print(f"Error reading {json_path}: {e}")
        return []

    if 'shapes' not in data:
        print(f"No 'shapes' key found in {json_path}")
        return []

    # Fall back to the JSON file's base name if imagePath is missing
    image_filename = data.get('imagePath', json_path.stem + '.png')
    bboxes = []
    for shape in data['shapes']:
        if shape.get('shape_type') != 'polygon':
            continue
        label = shape.get('label', 'unknown')
        points = extract_polygon_points(shape)
        if not points:
            continue
        col_x, row_y, width, height = polygon_to_bbox(points)
        # Skip degenerate boxes: width and height must be greater than zero
        if width <= 0 or height <= 0:
            continue
        bbox_info = {
            'filename': image_filename,
            'col_x': col_x,
            'row_y': row_y,
            'width': width,
            'height': height,
            'label': label
        }
        bboxes.append(bbox_info)
    return bboxes


# Paths to your folders
annotations_folder = "annotations"
# Rename the output to object_annotations.csv before uploading to Visual Layer
output_csv = "segmentation_annotations.csv"

# Process all JSON files and convert to CSV
all_bboxes = []
json_files = list(Path(annotations_folder).glob('*.json'))
for json_file in json_files:
    print(f"Processing: {json_file.name}")
    bboxes = process_json_file(json_file)
    all_bboxes.extend(bboxes)
    print(f"  Found {len(bboxes)} bounding boxes")

# Write to CSV
fieldnames = ['filename', 'col_x', 'row_y', 'width', 'height', 'label']
with open(output_csv, 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(all_bboxes)

print(f"Conversion complete! Output saved to: {output_csv}")
print(f"Total bounding boxes: {len(all_bboxes)}")
```

### Creating Annotations from Folder Structure

If your images are organized in folders where each subfolder name represents the class label, you can generate annotation files automatically.
**Folder Structure:**

```
dataset/
├── Ulcer/
│   ├── image1.bmp
│   └── image2.bmp
├── Normal/
│   ├── image3.bmp
│   └── image4.bmp
└── AVM/
    ├── image5.bmp
    └── image6.bmp
```

**Python Script:**

```python
import os
import csv


def create_annotation_csv(root_dir, output_csv):
    rows = []
    # Each subfolder name is used as the class label for the files inside it
    for subdir in os.listdir(root_dir):
        subdir_path = os.path.join(root_dir, subdir)
        if os.path.isdir(subdir_path):
            label = subdir
            for filename in os.listdir(subdir_path):
                file_path = os.path.join(subdir_path, filename)
                if os.path.isfile(file_path):
                    # Store the path relative to the dataset root
                    relative_path = os.path.join(subdir, filename)
                    rows.append([relative_path, label])

    with open(output_csv, mode='w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(['filename', 'label'])
        writer.writerows(rows)

    print(f"CSV file '{output_csv}' created successfully.")


if __name__ == "__main__":
    # Run from the dataset root so the subfolders are found in "."
    root_directory = "."
    output_csv_file = "image_annotations.csv"
    create_annotation_csv(root_directory, output_csv_file)
```

**Example Output:**

```csv
filename,label
Ulcer/Ulcer_2024-08-07-08-28-10_81061.bmp,Ulcer
Ulcer/Ulcer_2024-08-07-08-29-37_82025.bmp,Ulcer
Normal/Normal_2024-08-07-08-30-15_12345.bmp,Normal
AVM/AVM_2024-08-07-08-31-22_54321.bmp,AVM
```

## Importing Annotations into Visual Layer

Once your annotation file is properly formatted, you can import it during dataset creation.

**Steps:**

1. Upload your annotation file during dataset creation.
2. Files can be uploaded from your local machine or an S3 bucket.
3. Ensure your file follows the required format and has the correct name.

The annotation file must be uploaded at the same time as your images. Visual Layer will process the annotations and make them available for filtering, search, and analysis.

## Reusing Caption Data

Caption generation is one of the most time-consuming operations in Visual Layer's dataset pipeline. When creating multiple datasets with the same images, you can extract and reuse caption data from previous pipeline runs.

### Benefits

Reusing caption data allows you to:

* Skip caption generation on subsequent dataset creations.
* Maintain consistent captions across multiple datasets.
* Reduce processing time significantly.

This approach is ideal when you need to create multiple datasets or dataset versions using the same images but with different configurations.

### How Caption Data Is Stored

After running a dataset pipeline, Visual Layer stores processed data in:

```
/.vl/tmp/[dataset-id]/input/metadata/image_annotations.parquet
```

This parquet file contains all the caption data you need to reuse.

### Extraction Process

The extraction script processes Visual Layer's internal parquet files to create a clean annotation file:

1. Extracts the relevant columns: `filename` and `caption`
2. Removes system path prefixes such as `/hostfs` and `/mnt`
3. Converts absolute paths to relative filenames
4. Outputs a clean parquet file named `image_annotations.parquet`

**Script Location:**

The complete Python script is available in the Useful Scripts guide.

### Workflow

**Step 1: Create Initial Dataset**

Create your first dataset with captioning enabled. After the pipeline completes, locate the parquet file:

```bash
# List recent datasets
ls -lt /.vl/tmp/

# Navigate to the dataset metadata directory
cd /.vl/tmp/[your-dataset-id]/input/metadata/

# Verify the file exists
ls image_annotations.parquet
```

**Step 2: Run the Extraction Script**

Process the parquet file to extract captions:

```bash
# Basic usage
python3 process_annotations.py /.vl/tmp/[dataset-id]/input/metadata/image_annotations.parquet

# Specify custom output location
python3 process_annotations.py /.vl/tmp/[dataset-id]/input/metadata/image_annotations.parquet \
  -o /path/to/new-dataset/image_annotations.parquet

# Custom prefix removal
python3 process_annotations.py input.parquet --prefix /custom/prefix/to/remove
```

**Step 3: Copy to New Dataset Directory**

Place the extracted parquet file in your new dataset directory alongside the images:

```bash
# Copy to new dataset location
cp image_annotations_processed.parquet /path/to/new-dataset/image_annotations.parquet
```

The parquet file must be named exactly `image_annotations.parquet` for Visual Layer to recognize it.

**Step 4: Create New Dataset**

Create your new dataset. Visual Layer will detect the existing `image_annotations.parquet` file and use the provided captions instead of generating new ones, so the pipeline completes much faster.

### Understanding Relative Paths

Filenames in the parquet file must be relative to the dataset directory location. Visual Layer looks for images relative to where the `image_annotations.parquet` file is located.
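The relative-path rule can be checked mechanically before creating the new dataset. The sketch below is an illustration under stated assumptions: `check_filenames` is a hypothetical helper, it takes a plain list of filenames (reading the `filename` column out of the parquet file is left to whatever parquet reader you use), and it treats any name starting with `/` as absolute.

```python
import tempfile
from pathlib import Path, PurePosixPath
from typing import Dict, Iterable, List


def check_filenames(dataset_dir: str, filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Report absolute paths and relative paths with no matching file on disk.

    Hypothetical helper: paths are resolved against dataset_dir, the directory
    that would hold image_annotations.parquet.
    """
    root = Path(dataset_dir)
    report: Dict[str, List[str]] = {"absolute": [], "missing": []}
    for name in filenames:
        if PurePosixPath(name).is_absolute():
            report["absolute"].append(name)
        elif not (root / name).is_file():
            report["missing"].append(name)
    return report


# Demo on a throwaway directory containing a single image file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "images").mkdir()
    (Path(tmp) / "images" / "dog_1.jpg").touch()
    report = check_filenames(
        tmp,
        ["images/dog_1.jpg", "images/dog_2.jpg", "/hostfs/data/dog_1.jpg"],
    )
    print(report)
```

An empty report means every filename is relative and resolves to a file under the dataset directory.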
**Correct - Relative Paths:**

```
Dataset directory: /any/path/dataset/
├── image_annotations.parquet
├── dog_1.jpg
├── dog_2.jpg
└── dog_3.jpg

Filenames in parquet:
- dog_1.jpg
- dog_2.jpg
- dog_3.jpg
```

**With Subdirectory:**

```
Dataset directory: /any/path/dataset/
├── image_annotations.parquet
└── images/
    ├── dog_1.jpg
    ├── dog_2.jpg
    └── dog_3.jpg

Filenames in parquet:
- images/dog_1.jpg
- images/dog_2.jpg
- images/dog_3.jpg
```

### Troubleshooting

**Images Not Found:**

If Visual Layer cannot find your images, verify:

1. Parquet file is in the same directory as images
2. Filenames match exactly (case-sensitive)
3. Paths in parquet are relative, not absolute

**Captions Not Being Used:**

If Visual Layer is still generating captions:

1. Verify filename is exactly `image_annotations.parquet`
2. Ensure file is in the correct location relative to images
3. Check that parquet file has both `filename` and `caption` columns

## Next Steps

Now that you understand how to import annotations, you can create and explore datasets with rich metadata.

* Create your first dataset with annotations
* Use annotations to filter and analyze your data
* Export annotated datasets for downstream use