Models

A guide to creating a model with our AutoML platform

Overview

An AutoML model is a custom machine learning algorithm trained on your dataset. Models are created using Snapshots. After a model has finished training, you can deploy it and instantly begin using it by submitting API requests, just as you would with any other Hive model.
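
Once deployed, a model is queried over HTTP like any other Hive model. The sketch below shows roughly what such a request might look like; the endpoint URL, header names, and payload fields here are illustrative assumptions, not the documented Hive API — use the endpoint and token shown on your model's deployment page.

```python
import json
import urllib.request

# Hypothetical endpoint for illustration only -- substitute the URL shown
# on your deployed model's page.
API_URL = "https://api.example.com/v2/task/sync"

def build_request(api_key: str, text: str) -> urllib.request.Request:
    """Build a synchronous classification request for a deployed model."""
    payload = json.dumps({"text_data": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("my-secret-key", "some text to classify")
# urllib.request.urlopen(req) would submit the task; the JSON response
# contains your model's predicted classes and confidence scores.
```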

The AutoML platform supports the following model types: Text Classification, Image Classification, and Large Language Models.

Create a Model

To create a model, go to the Models dashboard page and click the Create New Model button.

The `Create New Model` button sits at the top right of the `Models` page.

📘

No dataset yet? No problem!

If you don't have a dataset yet but would like to explore the model training process, you can use an example dataset instead. To do this, select Example when you're prompted to select a snapshot for training.

Model Types

On the Create Model form, you will first need to select the Model Type, which indicates how the model will be used. There are three types to choose from:

| Model Type | Description |
| --- | --- |
| Text Classification | Categorize text into multiple labels and classes. |
| Image Classification | Categorize image content into multiple labels and classes. |
| Large Language Model | Build your own LLMs for text generation, chat, and more. |

The Model Type determines which Snapshots and Base Models can be used for training. The Base Model is the Hive model that serves as the starting point for your training. The Base Model you select can have a significant impact on accuracy, so we recommend testing several options with the same snapshot to compare performance. The supported Base Models are listed below:

| Base Model | Description |
| --- | --- |
| Text Classification | Use custom labels and classes to classify text information. |
| Text Moderation | Customize the Hive Text Moderation model with additional data for existing categories, or add your own brand-new categories to be used alongside our built-in ones. |
| Image Classification | Use custom labels and classes to classify image content. |
| Vision Moderation | Customize the Hive Visual Moderation model with additional data for existing categories, or add your own brand-new categories to be used alongside our built-in ones. |
| Large Language Model | Generate freeform text based on your provided prompts and completions. |

For more information about each Base Model, please see the associated page linked in the table above. To learn more or request the addition of a particular type of model, please contact [email protected].

Model Training Options

Before you create your model, you'll have various options for how to structure your training. All of these settings come with reasonable defaults, but we recommend that you tweak them based on your use case for optimal model performance.

| Option | Description | Input |
| --- | --- | --- |
| Max Epochs | The maximum number of times training will cycle through all of the training data before terminating. Larger values typically result in higher accuracy. | A positive integer; the allowed range varies based on the base model selected. |
| Model Selection Rule | The criterion used to determine the optimal epoch. | One of Best Balanced Accuracy, Best F1 Score, or Best Loss (described below). |
| Model Selection Label (Classification Models Only) | The label used to calculate the metric you select for your Model Selection Rule. | One of the heads (classification categories) from the snapshot used for training. |
| Early Stopping | The number of epochs after which training terminates if the selected evaluation metric has not improved, even if training has not yet reached Max Epochs. | An integer between 1 and the value chosen for Max Epochs. |
| Max Invalid Row Percent | The maximum percentage of rows permitted to be invalid before the model training fails. | An integer from 1 to 100. |
| Max Tokens (LLMs Only) | The maximum token length for the combined prompt and completion input examples. Examples that exceed this value are ignored. | An integer from 0 to 4096. |

The three Model Selection Rule options are:

- Best Balanced Accuracy: Select the epoch with the best balanced accuracy. Balanced accuracy is a classification metric ranging from 0 to 1, representing the average recall of each class. Higher values indicate better alignment.
- Best F1 Score: Select the epoch with the best F1 score. The F1 score is a classification metric ranging from 0 to 1, representing the harmonic mean of precision and recall. Higher values indicate better alignment.
- Best Loss: Select the epoch with the best loss. Loss is a positive float metric that represents how well the predicted results match the expected results. Lower values indicate better alignment. This metric is applicable to LLMs only.

Evaluate Performance

After the model is trained, we provide several metrics to help you evaluate its performance. For example, the following image shows the performance analysis of a Text Classification model.

Evaluation metrics for an AutoML project as shown after training has completed.

The page for a completed text classification model displays several metrics you can use to evaluate it.

Each of the evaluation metrics is defined below.

| Evaluation Metric | Definition | Model Types |
| --- | --- | --- |
| Precision | A percentage representing the quality of positive results with respect to predicted classes at a specific confidence threshold. Higher values indicate fewer false positives. Formula: true_positives / (true_positives + false_positives) | Classification |
| Recall | A percentage representing the quality of positive results with respect to actual classes at a specific confidence threshold. Higher values indicate fewer false negatives. Formula: true_positives / (true_positives + false_negatives) | Classification |
| Specificity | A percentage representing the quality of negative results with respect to actual classes. Higher values indicate fewer false positives. Formula: true_negatives / (true_negatives + false_positives) | Classification |
| Balanced Accuracy | A percentage representing the average of recall and specificity. Higher values indicate better alignment for both positive and negative results. Formula: (recall + specificity) / 2 | Classification |
| F1 Score | A percentage representing the harmonic mean of precision and recall. Higher values indicate better alignment of positive results. Formula: 2 * (precision * recall) / (precision + recall) | Classification |
| False Positive Rate | A percentage representing the probability of false positives with respect to actual classes. Higher values indicate more false positives. Formula: false_positives / (false_positives + true_negatives) | Classification |
| Loss | A positive float metric that represents how well the predicted results match the expected results. Lower values indicate better alignment. | Large Language Models |
| Precision / Recall (PR) Curve | A graph representing how precision changes with different recall values. Typically, precision decreases as recall increases, and vice versa. | Classification |
| ROC Curve | Summarizes the tradeoff between recall (true positive rate) and false positive rate at various confidence thresholds. This curve helps you find a confidence threshold that meets the recall and false positive rate expectations for your use case. | Classification |
| Confusion Matrix | A table comparing actual labels against model predictions for each instance. It helps to visualize which classes the model predicts for each actual label. | Classification |
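
The threshold-based metrics above can all be derived from the four cells of a binary confusion matrix. As a small sketch of those formulas (illustrative only, not platform code):

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute the evaluation metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
        "f1": 2 * (precision * recall) / (precision + recall),
        "false_positive_rate": fp / (fp + tn),
    }

# e.g. 8 true positives, 2 false positives, 8 true negatives, 2 false negatives
m = classification_metrics(tp=8, fp=2, tn=8, fn=2)
```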

If you would like to retrain your model, you can click the blue Update Model button in the top right corner of the screen and start the training process again. This will create a new version of your model.