Text Classification - AutoML Training
A guide to training custom text classification models with AutoML
Overview
Text classification models allow you to predict a category or class for a piece of text content. Popular uses for text classification models include content moderation, topic labelling, sentiment analysis, and more.
Which Base Model Should I Use?
Base Model | Max Tokens | Use Case | Hive Classes | Dataset Requirements |
---|---|---|---|---|
Text Classification v2 | 512 | Well-suited for most text classification tasks | -- | Minimum 10 examples per class |
Text Moderation v2 | 512 | Best for moderation tasks—leverages pre-trained Hive moderation labels | Included | Minimum 50 examples per class |
DeBERTa v3 | 512 | Best for sentiment analysis or very complex/nuanced classification | -- | Minimum 10 examples per class |
Longformer v1 | 1024 | Best for text inputs that exceed the 512-token limit of the other models | -- | Minimum 10 examples per class |
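
Before starting a training run, it can help to confirm that every class in your dataset meets the minimum for the base model you picked. The snippet below is a minimal local sketch of that check, not part of any Hive SDK or API: the function name `classes_below_minimum` and the dictionary `MIN_EXAMPLES_PER_CLASS` are hypothetical, and the thresholds are copied from the table above. It assumes your labels are available as a plain list with one entry per training example.

```python
from collections import Counter

# Minimum examples per class, copied from the base model table above.
# The dictionary itself is a hypothetical helper, not a Hive API object.
MIN_EXAMPLES_PER_CLASS = {
    "Text Classification v2": 10,
    "Text Moderation v2": 50,
    "DeBERTa v3": 10,
    "Longformer v1": 10,
}


def classes_below_minimum(labels, base_model):
    """Return {class_name: count} for classes with too few examples.

    `labels` is a list with one class label per training example.
    `base_model` must be one of the keys in MIN_EXAMPLES_PER_CLASS.
    """
    minimum = MIN_EXAMPLES_PER_CLASS[base_model]
    counts = Counter(labels)
    return {name: n for name, n in counts.items() if n < minimum}


if __name__ == "__main__":
    # Example dataset: 12 "spam" examples and 7 "not_spam" examples.
    labels = ["spam"] * 12 + ["not_spam"] * 7
    print(classes_below_minimum(labels, "Text Classification v2"))
    # → {'not_spam': 7}
```

An empty result means every class meets the minimum for that base model. The same dataset can pass for one base model and fail for another, since Text Moderation v2 requires 50 examples per class rather than 10.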
Hive Text Moderation Labels and Classes
If you select the Text Moderation v2 base model, the model output will always include every Hive Text Moderation category. The full list of moderation labels is in the Text Moderation documentation.