Column Name | Description |
---|---|
filename | The name of the image file containing the object. |
col_x | The x-coordinate (horizontal position) of the top-left corner of the bounding box. |
row_y | The y-coordinate (vertical position) of the top-left corner of the bounding box. |
width | The width of the bounding box, extending from col_x. |
height | The height of the bounding box, extending from row_y. |
label | The class or category of the detected object. |
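
As a rough illustration of how these columns fit together, the sketch below parses a few annotation rows and derives the bottom-right corner of each box. It assumes the annotations live in a comma-separated file with a header row and integer pixel values; the sample rows, file names, and the `read_boxes` helper are hypothetical and not taken from any real dataset.

```python
import csv
import io

# Hypothetical sample rows; file names and values are made up for illustration.
SAMPLE_CSV = """filename,col_x,row_y,width,height,label
img_001.jpg,10,20,100,50,cat
img_001.jpg,200,40,30,60,dog
"""


def read_boxes(csv_text):
    """Parse annotation rows and add the bottom-right corner of each box."""
    boxes = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: coordinates are integer pixel values.
        x, y = int(row["col_x"]), int(row["row_y"])
        w, h = int(row["width"]), int(row["height"])
        boxes.append({
            "filename": row["filename"],
            "label": row["label"],
            "x_min": x,
            "y_min": y,
            # width extends from col_x, height extends from row_y
            "x_max": x + w,
            "y_max": y + h,
        })
    return boxes


boxes = read_boxes(SAMPLE_CSV)
for box in boxes:
    print(box)
```

Keeping both corners alongside the original columns makes it easier to move between width/height-style and corner-style representations later on.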

COCO: [x_min, y_min, width, height]. Each box is given by its top-left (x_min, y_min) coordinate, along with width and height.

Pascal VOC: (x_min, y_min, x_max, y_max), stored in the <bndbox> tag. The bounding box is represented using absolute pixel values for top-left (x_min, y_min) and bottom-right (x_max, y_max) coordinates.

YOLO: [x_center, y_center, width, height], normalized (0 to 1 scale). Annotations are stored in .txt files where each row represents an object as class_id x_center y_center width height, with values normalized relative to the image width and height.

TFRecord: (y_min, x_min, y_max, x_max), normalized (0 to 1 scale). Annotations are stored in .tfrecord files (sometimes a .record file, usually not human-readable).

LabelMe: (x_min, y_min, x_max, y_max) in absolute pixels, written as [x_min, y_min, x_max, y_max].

Format | Representation | Normalized? | File Type |
---|---|---|---|
COCO | [x_min, y_min, width, height] | No | .json |
VOC | (x_min, y_min, x_max, y_max) | No | .xml |
YOLO | [x_center, y_center, width, height] | Yes | .txt |
TFRecord | (y_min, x_min, y_max, x_max) | Yes | .tfrecord |
LabelMe | [[x_min, y_min], [x_max, y_max]] | No | .json |
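
The representations in the table differ only in which corner or center they store and whether values are divided by the image size, so converting between them is a few lines of arithmetic. The sketch below uses Pascal VOC corners as the common intermediate form; the function names and the example box are hypothetical, and `img_w` and `img_h` are assumed to be the image width and height in pixels.

```python
def coco_to_voc(box):
    """[x_min, y_min, width, height] -> (x_min, y_min, x_max, y_max)."""
    x_min, y_min, w, h = box
    return (x_min, y_min, x_min + w, y_min + h)


def voc_to_yolo(box, img_w, img_h):
    """Absolute corners -> normalized [x_center, y_center, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [
        (x_min + x_max) / 2 / img_w,
        (y_min + y_max) / 2 / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    ]


def yolo_to_voc(box, img_w, img_h):
    """Normalized [x_center, y_center, width, height] -> absolute corners."""
    xc, yc, w, h = box
    return (
        (xc - w / 2) * img_w,
        (yc - h / 2) * img_h,
        (xc + w / 2) * img_w,
        (yc + h / 2) * img_h,
    )


def voc_to_tfrecord(box, img_w, img_h):
    """Absolute corners -> normalized (y_min, x_min, y_max, x_max)."""
    x_min, y_min, x_max, y_max = box
    # Note the y-first ordering used by the TFRecord row of the table.
    return (y_min / img_h, x_min / img_w, y_max / img_h, x_max / img_w)


def voc_to_labelme(box):
    """Absolute corners -> [[x_min, y_min], [x_max, y_max]]."""
    x_min, y_min, x_max, y_max = box
    return [[x_min, y_min], [x_max, y_max]]


# Hypothetical example: a 200x100 box at (100, 50) in a 400x200 image.
voc = coco_to_voc([100, 50, 200, 100])
yolo = voc_to_yolo(voc, img_w=400, img_h=200)
tfr = voc_to_tfrecord(voc, img_w=400, img_h=200)
print(voc, yolo, tfr, voc_to_labelme(voc))
```

Routing every conversion through one corner-based form means each additional format needs only a pair of functions rather than one per pair of formats.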