Overview
The Visual Layer bounding box annotation format is stored in CSV format, making it easy to process and integrate with various machine learning workflows. Each row represents a single bounding box annotation for an object within an image.CSV Format Structure
Column Name | Description |
---|---|
filename | The name of the image file containing the object. |
col_x | The x-coordinate (horizontal position) of the top-left corner of the bounding box. |
row_y | The y-coordinate (vertical position) of the top-left corner of the bounding box. |
width | The width of the bounding box, extending from col_x . |
height | The height of the bounding box, extending from row_y . |
label | The class or category of the detected object. |
Example Annotation
1.COCO (Common Objects in Context) Format
- Bounding Box Format:
[x_min, y_min, width, height]
- Details: Uses JSON format with a dictionary structure. The bounding box is represented by the top-left
(x_min, y_min)
coordinate, along withwidth
andheight
. - Example:
- Used In: Microsoft COCO dataset.
2.PASCAL VOC (Visual Object Classes) Format
- Bounding Box Format:
(x_min, y_min, x_max, y_max)
- Details: Stored in XML files using a
<bndbox>
tag. The bounding box is represented using absolute pixel values for top-left (x_min, y_min
) and bottom-right (x_max, y_max
) coordinates. - Example:
- Used In: VOC 2012 dataset.
3.YOLO (You Only Look Once) Format
- Bounding Box Format:
[x_center, y_center, width, height]
, normalized (0 to 1 scale) - Details: Uses plain text
.txt
files where each row represents an object asclass_id x_center y_center width height
, with values normalized relative to the image width and height. - Example:
- Used In: YOLO models and datasets.
4.TFRecord / TensorFlow Object Detection API Format
- Bounding Box Format: Normalized
(y_min, x_min, y_max, x_max)
(0 to 1 scale) - Details: Uses TFRecord format to store images and annotations in protocol buffers (Protobuf). Bounding boxes are stored as normalized float values within
.tfrecord
files. - Example(within a
.record
file, usually not human-readable): - Used In: TensorFlow Object Detection API.
5.LabelMe Format
- Bounding Box Format:
(x_min, y_min, x_max, y_max)
- Details: Uses JSON files similar to COCO but with a different structure, including polygon annotations. Bounding boxes are stored as
[x_min, y_min, x_max, y_max]
. - Example:
- Used In: LabelMe annotation tool.
###Summary of Bounding Box Formats
Format | Representation | Normalized? | File Type |
---|---|---|---|
COCO | [x_min, y_min, width, height] | No | .json |
VOC | (x_min, y_min, x_max, y_max) | No | .xml |
YOLO | [x_center, y_center, width, height] | Yes | .txt |
TFRecord | (y_min, x_min, y_max, x_max) | Yes | .tfrecord |
LabelMe | [[x_min, y_min], [x_max, y_max]] | No | .json |
Converting to Visual Layer format
- Convert from rom VOC2012 to Visual Layer Object Detection
- Convert from rom Coco to Visual Layer Object Detection