Understanding annotation export formats is essential for integrating labeled training data into your machine learning pipeline. Here's what you need to know about each format.
Common Data Annotation Formats
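- COCO JSON - Industry standard for object detection, instance segmentation, and keypoint annotation. Typically one JSON file per dataset split holds the images, categories, and annotations, with bounding boxes stored as [x, y, width, height] in pixels. Used by Detectron2, MMDetection, and most modern frameworks.
- Pascal VOC XML - XML-based format widely used for object detection, with one XML file per image and boxes stored as xmin, ymin, xmax, ymax in pixels. Supported by the TensorFlow Object Detection API.
- YOLO TXT - Simple text format with one file per image. Each line holds a class index followed by x_center, y_center, width, and height, all normalized to the 0-1 range by image size. Ideal for Ultralytics YOLOv5/v8 training.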
- TFRecord - TensorFlow's binary format for efficient data loading during model training.
- CreateML JSON - Apple's format for training models with Create ML, which produces Core ML models for iOS/macOS.
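The three text-based formats above mostly differ in how they write a bounding box: COCO uses [x_min, y_min, width, height] in pixels, Pascal VOC uses corner coordinates in pixels, and YOLO uses a center point and size normalized by the image dimensions. The sketch below shows one way to convert between them. The function names are illustrative and not part of any library, and it assumes 0-based pixel coordinates throughout (some VOC tooling uses a 1-based offset, which this ignores).

```python
def coco_to_voc(box):
    """COCO [x_min, y_min, width, height] -> VOC (xmin, ymin, xmax, ymax), in pixels."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def coco_to_yolo(box, img_w, img_h):
    """COCO pixel box -> YOLO (x_center, y_center, width, height), normalized to 0-1."""
    x, y, w, h = box
    return (
        (x + w / 2) / img_w,  # box center, as a fraction of image width
        (y + h / 2) / img_h,  # box center, as a fraction of image height
        w / img_w,
        h / img_h,
    )


def yolo_to_coco(box, img_w, img_h):
    """YOLO normalized box -> COCO [x_min, y_min, width, height], in pixels."""
    xc, yc, w, h = box
    box_w, box_h = w * img_w, h * img_h
    return [xc * img_w - box_w / 2, yc * img_h - box_h / 2, box_w, box_h]


# Example: a 200x100 box whose top-left corner is at (100, 50) in an 800x400 image.
print(coco_to_voc([100, 50, 200, 100]))             # (100, 50, 300, 150)
print(coco_to_yolo([100, 50, 200, 100], 800, 400))  # (0.25, 0.25, 0.25, 0.25)
```

Note that the YOLO conversion needs the image width and height, so any converter has to read them from the image file or from the source annotation.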
Choosing the Right Export Format
Match your annotation format to your ML framework and computer vision task:
- PyTorch + Detectron2 → COCO JSON format
- TensorFlow Object Detection → TFRecord (Pascal VOC annotations are typically converted to TFRecord before training)
- Ultralytics YOLO → YOLO TXT format
- Custom training scripts → JSON or CSV export
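For the custom-script case, a flat CSV with one row per bounding box is often easier to consume than nested JSON. The following is a minimal sketch that flattens a COCO-style dictionary into such rows. It assumes the standard COCO keys (images, annotations, categories). The column names and the helper functions coco_to_rows and rows_to_csv are hypothetical choices for illustration, not a TigerLabel export schema.

```python
import csv
import io

FIELDS = ["file_name", "label", "xmin", "ymin", "xmax", "ymax"]


def coco_to_rows(coco):
    """Flatten a COCO-style dict into one row per bounding box."""
    images = {img["id"]: img for img in coco["images"]}
    labels = {cat["id"]: cat["name"] for cat in coco["categories"]}
    rows = []
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO boxes are [x_min, y_min, width, height]
        rows.append({
            "file_name": images[ann["image_id"]]["file_name"],
            "label": labels[ann["category_id"]],
            "xmin": x, "ymin": y, "xmax": x + w, "ymax": y + h,
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Tiny in-memory example in place of a real exported file.
sample = {
    "images": [{"id": 1, "file_name": "cat.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 3, "name": "cat"}],
    "annotations": [{"id": 7, "image_id": 1, "category_id": 3, "bbox": [10, 20, 30, 40]}],
}
print(rows_to_csv(coco_to_rows(sample)))
```

In a real pipeline the dictionary would come from json.load on the exported annotation file, and the CSV text would be written to disk rather than printed.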
Pro Tip: TigerLabel supports export to all major formats. You can also create custom export templates for proprietary ML pipelines.