Training Custom Pre-Labeling Models


Train project-specific machine learning models on your own labeled data to produce more accurate pre-labels than generic foundation models can.

When to Train Custom Pre-Labeling Models

  • Foundation models aren't accurate enough for your specific domain
  • You have sufficient labeled training data (typically 1,000+ examples)
  • You'll annotate significantly more data of the same type
  • Your label ontology is specialized or proprietary

Custom Model Training Workflow

  1. Export existing labeled data from TigerLabel in a training format
  2. Train a model using your preferred ML framework (for example, PyTorch or TensorFlow)
  3. Evaluate the model on a held-out validation set
  4. Upload the model to the TigerLabel Model Registry
  5. Configure it as the pre-labeling model for your annotation project
  6. Continuously improve the model with newly labeled data (active learning loop)
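
Steps 3 and 6 of the workflow above can be sketched in plain Python. This is a minimal illustration and does not use any TigerLabel API: the function names (split_holdout, train_threshold_model, accuracy, select_uncertain), the toy one-dimensional data, and the "defect"/"ok" labels are all assumptions made for the example. In a real project the stand-in model would be replaced by one trained in your ML framework.

```python
import math
import random


def split_holdout(examples, val_fraction=0.2, seed=0):
    """Shuffle labeled examples and split them into (train, validation)."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_val = max(1, int(len(items) * val_fraction))
    return items[n_val:], items[:n_val]


def train_threshold_model(train):
    """Toy stand-in for a real model: a threshold halfway between class means.

    Returns (predict, predict_proba) callables. Hypothetical helper.
    """
    pos = [x for x, y in train if y == "defect"]
    neg = [x for x, y in train if y == "ok"]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

    def predict_proba(x):
        p = 1 / (1 + math.exp(-(x - threshold)))
        return {"defect": p, "ok": 1 - p}

    def predict(x):
        probs = predict_proba(x)
        return max(probs, key=probs.get)

    return predict, predict_proba


def accuracy(predict, validation):
    """Fraction of held-out examples the model labels correctly."""
    correct = sum(1 for x, y in validation if predict(x) == y)
    return correct / len(validation)


def select_uncertain(predict_proba, unlabeled, k):
    """Active learning step: pick the k items with the lowest top-class probability."""
    scored = [(max(predict_proba(x).values()), i, x) for i, x in enumerate(unlabeled)]
    scored.sort(key=lambda t: (t[0], t[1]))
    return [x for _, _, x in scored[:k]]


# Toy labeled data: values above 5 are "defect", the rest are "ok".
labeled = [(float(x), "defect" if x > 5 else "ok") for x in (0, 1, 2, 3, 4, 6, 7, 8, 9, 10)]

train, validation = split_holdout(labeled)
predict, predict_proba = train_threshold_model(train)
val_accuracy = accuracy(predict, validation)

# Items the model is least sure about are the best candidates for human annotation.
unlabeled = [0.5, 4.9, 5.2, 9.5]
to_review = select_uncertain(predict_proba, unlabeled, k=2)

print("validation accuracy:", val_accuracy)
print("send to annotators first:", to_review)
```

Selecting the least confident predictions for annotation is one common active learning strategy (uncertainty sampling). Other strategies, such as sampling for diversity, may suit some datasets better.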