Overview
The Visual Layer bounding box annotation format is a CSV file, which makes it easy to process and integrate with machine learning workflows. Each row represents a single bounding box annotation for one object within an image.
CSV Format Structure
| Column Name | Description |
|---|---|
| filename | The name of the image file containing the object. |
| col_x | The x-coordinate (horizontal position) of the top-left corner of the bounding box. |
| row_y | The y-coordinate (vertical position) of the top-left corner of the bounding box. |
| width | The width of the bounding box, extending from col_x. |
| height | The height of the bounding box, extending from row_y. |
| label | The class or category of the detected object. |
Example Annotation
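
As a minimal sketch, a file in this format could be read with Python's standard `csv` module. The rows below are made-up values for illustration only, the column order is assumed to follow the table above, and `read_annotations` is a hypothetical helper name rather than part of any Visual Layer API.

```python
import csv
import io

# Hypothetical sample data: file names, coordinates, and labels are invented.
# Assumption: the header uses the column names from the table, in that order.
SAMPLE_CSV = """filename,col_x,row_y,width,height,label
image_001.jpg,50,30,200,150,cat
image_001.jpg,300,120,80,60,dog
"""


def read_annotations(stream):
    """Yield one dict per bounding box, with numeric fields parsed as floats."""
    for row in csv.DictReader(stream):
        yield {
            "filename": row["filename"],
            "col_x": float(row["col_x"]),
            "row_y": float(row["row_y"]),
            "width": float(row["width"]),
            "height": float(row["height"]),
            "label": row["label"],
        }


annotations = list(read_annotations(io.StringIO(SAMPLE_CSV)))
```

Each image can appear on several rows, one per object, so grouping by `filename` recovers all boxes for an image.
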
1. COCO (Common Objects in Context) Format
- Bounding Box Format: `[x_min, y_min, width, height]`
- Details: Uses JSON format with a dictionary structure. The bounding box is represented by the top-left `(x_min, y_min)` coordinate, along with `width` and `height`.
- Example:
- Used In: Microsoft COCO dataset.
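
Because COCO already stores the top-left corner plus width and height, mapping a COCO box onto the Visual Layer columns is a direct copy. The sketch below illustrates this with a small, invented COCO-style document; `coco_to_rows` is a hypothetical helper, and only the standard `images`, `categories`, and `annotations` keys are assumed.

```python
import json

# Hypothetical, minimal COCO-style document; all values are invented.
COCO_JSON = """
{
  "images": [{"id": 1, "file_name": "image_001.jpg", "width": 640, "height": 480}],
  "categories": [{"id": 3, "name": "cat"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 3, "bbox": [50, 30, 200, 150]}
  ]
}
"""


def coco_to_rows(coco):
    """Map COCO annotations to dicts keyed by the Visual Layer column names."""
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}
    labels = {cat["id"]: cat["name"] for cat in coco["categories"]}
    rows = []
    for ann in coco["annotations"]:
        # COCO bbox order is [x_min, y_min, width, height], in pixels.
        x_min, y_min, width, height = ann["bbox"]
        rows.append({
            "filename": file_names[ann["image_id"]],
            "col_x": x_min,
            "row_y": y_min,
            "width": width,
            "height": height,
            "label": labels[ann["category_id"]],
        })
    return rows


coco_rows = coco_to_rows(json.loads(COCO_JSON))
```
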
2. PASCAL VOC (Visual Object Classes) Format
- Bounding Box Format: `(x_min, y_min, x_max, y_max)`
- Details: Stored in XML files using a `<bndbox>` tag. The bounding box is represented using absolute pixel values for the top-left `(x_min, y_min)` and bottom-right `(x_max, y_max)` coordinates.
- Example:
- Used In: VOC 2012 dataset.
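
A VOC box stores two corners, so width and height have to be derived as `x_max - x_min` and `y_max - y_min`. The following is a rough sketch using the standard library XML parser; the XML snippet is an invented, trimmed-down annotation and `voc_to_rows` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down VOC annotation; values are invented.
VOC_XML = """
<annotation>
  <filename>image_001.jpg</filename>
  <object>
    <name>cat</name>
    <bndbox>
      <xmin>50</xmin><ymin>30</ymin><xmax>250</xmax><ymax>180</ymax>
    </bndbox>
  </object>
</annotation>
"""


def voc_to_rows(xml_text):
    """Convert every <object> in a VOC annotation to a Visual Layer style row."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        x_min = float(box.findtext("xmin"))
        y_min = float(box.findtext("ymin"))
        x_max = float(box.findtext("xmax"))
        y_max = float(box.findtext("ymax"))
        rows.append({
            "filename": filename,
            "col_x": x_min,
            "row_y": y_min,
            "width": x_max - x_min,    # corner pair -> extent
            "height": y_max - y_min,
            "label": obj.findtext("name"),
        })
    return rows


voc_rows = voc_to_rows(VOC_XML)
```
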
3. YOLO (You Only Look Once) Format
- Bounding Box Format: `[x_center, y_center, width, height]`, normalized (0 to 1 scale)
- Details: Uses plain text `.txt` files where each row represents an object as `class_id x_center y_center width height`, with values normalized relative to the image width and height.
- Example:
- Used In: YOLO models and datasets.
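
Converting a YOLO row to pixel coordinates takes two steps: scale the normalized values by the image size, then shift from the box center to its top-left corner. A minimal sketch, assuming the image dimensions are known from elsewhere (YOLO label files do not carry them); `yolo_to_box` is a hypothetical helper name.

```python
def yolo_to_box(line, image_width, image_height):
    """Convert one YOLO label line to (class_id, col_x, row_y, width, height) in pixels."""
    class_id, x_center, y_center, width, height = line.split()
    # Scale normalized extents to pixels.
    box_width = float(width) * image_width
    box_height = float(height) * image_height
    # Shift from the box center to the top-left corner.
    col_x = float(x_center) * image_width - box_width / 2
    row_y = float(y_center) * image_height - box_height / 2
    return int(class_id), col_x, row_y, box_width, box_height


# Invented example line for a 640x480 image.
yolo_box = yolo_to_box("0 0.5 0.5 0.25 0.5", image_width=640, image_height=480)
```

The returned `class_id` is an integer index; mapping it to a label string requires the dataset's class list, which is stored separately.
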
4. TFRecord / TensorFlow Object Detection API Format
- Bounding Box Format: Normalized `(y_min, x_min, y_max, x_max)` (0 to 1 scale)
- Details: Uses TFRecord format to store images and annotations in protocol buffers (Protobuf). Bounding boxes are stored as normalized float values within `.tfrecord` files.
- Example (within a `.record` file, usually not human-readable):
- Used In: TensorFlow Object Detection API.
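
The main pitfall with this representation is the axis order: y comes before x. The sketch below covers only the coordinate arithmetic and assumes the box has already been decoded from the record into a plain tuple; reading the TFRecord itself is out of scope here, and `tf_box_to_pixels` is a hypothetical helper name.

```python
def tf_box_to_pixels(box, image_width, image_height):
    """Convert a normalized (y_min, x_min, y_max, x_max) box to pixel (col_x, row_y, width, height)."""
    y_min, x_min, y_max, x_max = box  # note: y first, then x
    col_x = x_min * image_width
    row_y = y_min * image_height
    width = (x_max - x_min) * image_width
    height = (y_max - y_min) * image_height
    return col_x, row_y, width, height


# Invented example box for a 640x480 image.
tf_box = tf_box_to_pixels((0.25, 0.5, 0.75, 1.0), image_width=640, image_height=480)
```
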
5. LabelMe Format
- Bounding Box Format: `(x_min, y_min, x_max, y_max)`
- Details: Uses JSON files, similar to COCO but with a different structure that also supports polygon annotations. A bounding box is stored as two corner points, `[[x_min, y_min], [x_max, y_max]]`.
- Example:
- Used In: LabelMe annotation tool.
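
A LabelMe rectangle can be reduced to the Visual Layer columns by taking the minimum and maximum of its two corner points. The sketch below assumes the `shapes`, `points`, `shape_type`, and `imagePath` keys written by the LabelMe tool, skips non-rectangle shapes such as polygons, and does not rely on the corners being stored in any particular order. The JSON content is invented and `labelme_to_rows` is a hypothetical helper name.

```python
import json

# Hypothetical LabelMe-style document; values are invented.
# The rectangle's corners are deliberately listed out of order.
LABELME_JSON = """
{
  "imagePath": "image_001.jpg",
  "shapes": [
    {"label": "cat", "shape_type": "rectangle", "points": [[250, 30], [50, 180]]},
    {"label": "dog", "shape_type": "polygon", "points": [[0, 0], [10, 0], [5, 8]]}
  ]
}
"""


def labelme_to_rows(doc):
    """Convert rectangle shapes to Visual Layer style rows; other shapes are skipped."""
    rows = []
    for shape in doc["shapes"]:
        if shape.get("shape_type") != "rectangle":
            continue
        (x1, y1), (x2, y2) = shape["points"]
        x_min, x_max = sorted((x1, x2))
        y_min, y_max = sorted((y1, y2))
        rows.append({
            "filename": doc["imagePath"],
            "col_x": x_min,
            "row_y": y_min,
            "width": x_max - x_min,
            "height": y_max - y_min,
            "label": shape["label"],
        })
    return rows


labelme_rows = labelme_to_rows(json.loads(LABELME_JSON))
```
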
### Summary of Bounding Box Formats
| Format | Representation | Normalized? | File Type |
|---|---|---|---|
| COCO | [x_min, y_min, width, height] | No | .json |
| VOC | (x_min, y_min, x_max, y_max) | No | .xml |
| YOLO | [x_center, y_center, width, height] | Yes | .txt |
| TFRecord | (y_min, x_min, y_max, x_max) | Yes | .tfrecord |
| LabelMe | [[x_min, y_min], [x_max, y_max]] | No | .json |
Converting to Visual Layer format
- Convert from VOC2012 to Visual Layer Object Detection
- Convert from COCO to Visual Layer Object Detection
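
Whatever the source format, the final step of a conversion is writing rows with the six columns described in the CSV Format Structure table. A minimal sketch using Python's standard `csv` module follows; the row values are invented, the column order is assumed to match that table, and `write_annotations` is a hypothetical helper rather than part of the linked conversion guides.

```python
import csv
import io

# Assumption: header names and order follow the CSV Format Structure table.
FIELDNAMES = ["filename", "col_x", "row_y", "width", "height", "label"]


def write_annotations(rows, stream):
    """Write bounding box rows (dicts keyed by FIELDNAMES) as CSV with a header."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Invented row; in practice this would come from a format-specific converter.
buffer = io.StringIO()
write_annotations(
    [{"filename": "image_001.jpg", "col_x": 50, "row_y": 30,
      "width": 200, "height": 150, "label": "cat"}],
    buffer,
)
csv_text = buffer.getvalue()
```

When writing to a real file instead of an in-memory buffer, open it with `newline=""` so the `csv` module controls line endings.
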