Models

Overview

AutoML models are custom fine-tuned machine learning models trained using your datasets. These models are trained on the AutoML platform using a dataset Snapshot and can be deployed to Hive Models projects, where they can be accessed just like all Hive pre-trained models.

See below for the full details of supported model types and currently offered base models.

Model Types

The model type is chosen during model creation and determines which snapshots are compatible and which base models are offered. AutoML currently supports four model types.

Model Type	Description
Text Classification	Categorize text into one or multiple labels and classes.
Image Classification	Categorize images into one or multiple labels and classes.
Large Language Model (LLM)	Generate text from prompts, build custom chat bots, or complete complex classification tasks.
Object Detection	Detect the location and type of custom objects within images.

Base Models

Base models are the starting point upon which AutoML builds custom models. Starting from a strong base model makes training more efficient and tends to improve custom model performance. Some base models are better suited to certain tasks (like moderation or sentiment analysis), so we recommend experimenting with several base models and training methods.

AutoML supports various base models for each of the model types on the platform. For the full list of base models per model type, view their respective detail pages.

Training Methods

AutoML offers three model training methods: last layer fine tuning, low rank adaptation (LoRA), and full fine tuning. Each method has tradeoffs. For example, full fine tuning produces the most highly customized models but takes the longest to train. The supported training methods are listed below.

Training Method	Task Complexity	Training Speed	Supported Models
Last Layer	Moderate	Fastest	All classification models
LoRA	High	Very Fast	Text Classification v2, DeBERTa v3, LLM Instruct 8B v3, LLM Instruct 70B v3
Full Fine Tune	Very High	Fast	Text Classification v2, DeBERTa v3, Image Classification v2
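
To give a rough sense of why the three methods differ in training speed, the sketch below counts trainable weights for a toy classifier under each method. It is an illustration only, not a description of how AutoML implements training: the model shape (`hidden`, `layers`, `num_classes`), the LoRA `rank`, the choice to apply one adapter per layer, and the choice to train the classification head alongside the adapters are all assumptions, and bias terms are ignored.

```python
def trainable_weights(hidden, layers, num_classes, rank):
    """Count trainable weights for a toy model under each training method.

    Hypothetical model: `layers` square weight matrices of size
    hidden x hidden, followed by a classification head of size
    hidden x num_classes. Biases are ignored for simplicity.
    """
    head = hidden * num_classes
    body = layers * hidden * hidden

    # Last layer: only the classification head is updated.
    last_layer = head

    # LoRA: each frozen hidden x hidden matrix gets two small trainable
    # factors of shape (hidden x rank) and (rank x hidden).
    # Assumption: the head is trained as well.
    lora = layers * rank * (hidden + hidden) + head

    # Full fine tune: every weight is updated.
    full = body + head

    return {"last_layer": last_layer, "lora": lora, "full": full}


counts = trainable_weights(hidden=768, layers=12, num_classes=3, rank=8)
for method, n in counts.items():
    print(f"{method:>10}: {n:,} trainable weights")
```

With these assumed sizes, last layer training updates 2,304 weights, LoRA 149,760, and full fine tuning 7,080,192, which matches the ordering of training speeds in the table above.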

Training Options

AutoML provides default training options that work well for most objectives. You can also customize each option to better suit your use case or just to experiment with different training configurations. For the full list of supported options and their definitions, view the Training Options page.

Model Evaluation

During and after model training, several metrics are available to track progress and measure the final performance of your custom model. The performance metrics currently supported on AutoML are listed below.

Metric	Description
Balanced Accuracy	A percentage representing the average of recall and specificity. Higher values indicate better alignment for both positive and negative results. Balanced accuracy is calculated with the formula `1 / 2 * (Recall + Specificity)`
F1 Score	A percentage representing the harmonic mean of the precision and recall. Higher values indicate better alignment of positive results. F1 score is calculated with the formula `2 * (Precision * Recall) / (Precision + Recall)`
Precision	A percentage representing the quality of positive results with respect to predicted classes at a specific confidence threshold. Higher values indicate fewer false positives. Precision is calculated with the formula `(True Positives) / (True Positives + False Positives)`
Recall	A percentage representing the quality of positive results with respect to actual classes at a specific confidence threshold. Higher values indicate fewer false negatives. Recall is calculated with the formula `(True Positives) / (True Positives + False Negatives)`
Specificity	A percentage representing the quality of negative results with respect to actual classes. Higher values indicate fewer false positives. Specificity is calculated with the formula `(True Negatives) / (True Negatives + False Positives)`
Loss	A positive float value that represents how well the predicted results match the expected results. Lower values indicate better alignment.
Precision Recall (PR) Curve	A graph representing how the precision changes with different recall values. Typically the precision decreases with higher recall and vice versa.
ROC Curve	The ROC Curve summarizes the tradeoff between recall (true positive rate) and false positive rate at various confidence thresholds. This curve helps you find a confidence threshold that meets the recall and false positive rate expectations for your use case.
Confusion Matrix	The confusion matrix is a table comparing actual labels against model predictions for each instance. It helps to visualize which classes the model predicts for each actual label.
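
The formulas in the table can be checked by hand on a small example. The sketch below is a minimal illustration for a single binary class, not AutoML's own implementation: the function names, the sample labels and scores, and the convention of returning 0.0 when a denominator is zero are all assumptions. Values are returned as fractions between 0 and 1 rather than percentages.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN for binary labels at a confidence threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    # Assumption: report 0.0 when a metric is undefined (zero denominator).
    return numerator / denominator if denominator else 0.0


def metrics_at(labels, scores, threshold):
    """Compute the threshold-dependent metrics from the table above."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    f1 = safe_div(2 * precision * recall, precision + recall)
    balanced_accuracy = 0.5 * (recall + specificity)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical ground-truth labels and model confidence scores.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

for threshold in (0.5, 0.75):
    print(threshold, metrics_at(labels, scores, threshold))
```

On this sample data, a threshold of 0.5 gives 3 true positives, 1 false positive, 3 true negatives, and 1 false negative, so all five metrics equal 0.75. Raising the threshold to 0.75 removes the false positive (precision and specificity rise to 1.0) but drops recall to 0.5, which is the tradeoff the PR and ROC curves visualize across thresholds.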