Data Annotation Export Formats Explained

Understanding annotation export formats is essential for integrating labeled training data into your machine learning pipeline. Here's what you need to know about each format.

Common Data Annotation Formats

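  • COCO JSON - De facto standard for object detection, instance segmentation, and keypoint annotation. A single JSON file holds the images, categories, and annotations, with boxes stored as [x, y, width, height] in pixels. Used by Detectron2, MMDetection, and many other frameworks.
  • Pascal VOC XML - One XML file per image, with boxes stored as xmin, ymin, xmax, ymax in pixels. Widely used for object detection; the TensorFlow Object Detection API includes a script for converting it to TFRecord.
  • YOLO TXT - One text file per image and one line per object: a class index followed by the box center, width, and height, all normalized to the 0-1 range. Ideal for Ultralytics YOLOv5/v8 training.
  • TFRecord - TensorFlow's binary format, which packs images and labels into serialized records for efficient data loading during model training.
  • CreateML JSON - Apple's Create ML format, used to train Core ML models for iOS/macOS apps.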
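
The practical difference between COCO and YOLO boxes is easiest to see in a conversion. The following is a minimal sketch, not a full exporter: the function names are illustrative, and it assumes you have already mapped COCO category IDs to zero-based YOLO class indices (COCO IDs are not guaranteed to be contiguous).

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels
    to a YOLO box (x_center, y_center, width, height) normalized to 0-1."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return (x_center, y_center, w / img_w, h / img_h)


def yolo_line(class_index, bbox, img_w, img_h):
    """Format one object as a line of a YOLO TXT label file.
    class_index is assumed to be a zero-based index you assigned yourself."""
    xc, yc, w, h = coco_bbox_to_yolo(bbox, img_w, img_h)
    return f"{class_index} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 200x100 px box at (100, 50) in a 400x200 px image
    print(yolo_line(0, [100, 50, 200, 100], img_w=400, img_h=200))
```

Going the other way (YOLO to COCO) requires the image dimensions, because YOLO label files do not store them.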

Choosing the Right Export Format

Match your annotation format to your ML framework and computer vision task:

  • PyTorch + Detectron2 → COCO JSON format
  • TensorFlow Object Detection API → TFRecord, or Pascal VOC converted to TFRecord
  • Ultralytics YOLO → YOLO TXT format
  • Custom training scripts → JSON or CSV export

Pro Tip: TigerLabel supports export to all major formats. You can also create custom export templates for proprietary ML pipelines.
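
For the custom training script case, a flat CSV is often the simplest thing to load. Below is a rough sketch of flattening a COCO-style dictionary into CSV rows using only the Python standard library. The function name and the column layout (file_name, label, and corner coordinates) are assumptions chosen for illustration, not a format defined by any tool mentioned here.

```python
import csv
import io


def coco_to_csv(coco):
    """Flatten a COCO-style dict into CSV text, one row per bounding box.
    The column layout is a hypothetical choice for this example."""
    images = {img["id"]: img for img in coco["images"]}
    labels = {cat["id"]: cat["name"] for cat in coco["categories"]}

    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["file_name", "label", "x_min", "y_min", "x_max", "y_max"])

    for ann in coco["annotations"]:
        # COCO stores boxes as [x_min, y_min, width, height] in pixels
        x, y, w, h = ann["bbox"]
        file_name = images[ann["image_id"]]["file_name"]
        writer.writerow([file_name, labels[ann["category_id"]], x, y, x + w, y + h])

    return buf.getvalue()


if __name__ == "__main__":
    sample = {
        "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
        "categories": [{"id": 7, "name": "cat"}],
        "annotations": [{"id": 1, "image_id": 1, "category_id": 7, "bbox": [10, 20, 30, 40]}],
    }
    print(coco_to_csv(sample))
```

Storing corner coordinates rather than width and height is a design choice; pick whichever your training script expects and state it in the header row.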