Text Classification

A guide to AutoML text classification models

Overview

Text classification models allow you to predict a category or class for a piece of text content. Common uses include content moderation, topic labelling, and sentiment analysis.

Which Base Model Should I Use?

| Base Model | Latency | Max Tokens | Contains Hive Classes | Use Case | Dataset Validation Requirement |
| --- | --- | --- | --- | --- | --- |
| Hive Text Classification | Low | 512 per request | No | If you want to train a text classification tool from scratch, you should use this base model. | 1. Each class requires a minimum of 10 examples. 2. Each row of text data has a maximum length restriction of 2048 characters. |
| Hive Text Moderation | Low | 512 per request | Yes | If you're interested in moderation use cases, you should select this base model. This option will fine-tune our existing Hive Text Moderation model with the additional data you provide, allowing you to customize our pre-made heads or add in your own new ones. The resulting model will include all Hive Text Moderation heads as well as any you choose to create. | 1. Each class requires a minimum of 50 examples. 2. Each row of text data has a maximum length restriction of 2048 characters. |
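
The dataset validation requirements above can be checked locally before uploading. The following is a minimal sketch, not part of the Hive product: the function name `validate_dataset`, the base-model keys, and the `(text, label)` row layout are all hypothetical, and it assumes that "characters" can be approximated by Python's `len()` on a string.

```python
from collections import Counter

# Limits taken from the table above. The dictionary keys are hypothetical
# identifiers chosen for this sketch, not official Hive names.
MIN_EXAMPLES_PER_CLASS = {
    "hive_text_classification": 10,
    "hive_text_moderation": 50,
}
MAX_CHARS_PER_ROW = 2048


def validate_dataset(rows, base_model):
    """Return a list of problems; an empty list means both checks passed.

    rows: iterable of (text, class_label) pairs (an assumed layout).
    """
    min_examples = MIN_EXAMPLES_PER_CLASS[base_model]
    rows = list(rows)
    problems = []

    # Requirement 2: each row of text is at most 2048 characters.
    # Assumption: len() is a close enough measure of "characters".
    for index, (text, _label) in enumerate(rows):
        if len(text) > MAX_CHARS_PER_ROW:
            problems.append(
                f"row {index}: {len(text)} characters exceeds {MAX_CHARS_PER_ROW}"
            )

    # Requirement 1: each class needs a minimum number of examples.
    counts = Counter(label for _text, label in rows)
    for label, count in sorted(counts.items(), key=lambda item: str(item[0])):
        if count < min_examples:
            problems.append(
                f"class {label!r}: {count} examples, needs at least {min_examples}"
            )

    return problems
```

Calling `validate_dataset(rows, "hive_text_moderation")` applies the stricter 50-example minimum. The platform's own validation remains the authority on whether a dataset is accepted.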

Hive Text Moderation Labels and Classes

If you select the Hive Text Moderation base model, the model output will always contain all Hive Text Moderation categories. A full list of these pre-made labels is shown below:

| Labels | Classes |
| --- | --- |
| sexual_v2 | 0, 1, 2, 3 |
| hate_v3 | 0, 1, 2, 3 |
| bullying_v2 | 0, 1, 2, 3 |
| threat_v2 | 0, 1, 2, 3 |
| child_exploitation | 0, 1 |
| promotion | 0, 1 |
| gibberish | 0, 1 |
| phone_number | 0, 1 |
| drugs | 0, 1, 2, 3 |
| child_safety | 0, 1 |
| self_harm | 0, 1 |
| weapons | 0, 1, 2, 3 |
| redirection | 0, 1 |

You can find a more detailed list of these classes and their definitions here.
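
When post-processing results from a model built on the Hive Text Moderation base model, it can help to keep the pre-made labels in a lookup. The sketch below copies the labels and classes from the table above. The helper names `missing_labels` and `is_valid_class` are hypothetical, and the sketch makes no assumption about the actual response format beyond a collection of label names.

```python
# Labels and classes copied from the table above.
HIVE_TEXT_MODERATION_CLASSES = {
    "sexual_v2": (0, 1, 2, 3),
    "hate_v3": (0, 1, 2, 3),
    "bullying_v2": (0, 1, 2, 3),
    "threat_v2": (0, 1, 2, 3),
    "child_exploitation": (0, 1),
    "promotion": (0, 1),
    "gibberish": (0, 1),
    "phone_number": (0, 1),
    "drugs": (0, 1, 2, 3),
    "child_safety": (0, 1),
    "self_harm": (0, 1),
    "weapons": (0, 1, 2, 3),
    "redirection": (0, 1),
}


def missing_labels(output_labels):
    """Return the pre-made labels absent from output_labels, sorted by name."""
    return sorted(set(HIVE_TEXT_MODERATION_CLASSES) - set(output_labels))


def is_valid_class(label, value):
    """Check that value is a class the table lists for a pre-made label.

    Labels not in the table (for example, custom heads you added) return
    False, because this sketch only knows the pre-made ones.
    """
    return value in HIVE_TEXT_MODERATION_CLASSES.get(label, ())
```

Because the model output always contains every pre-made label, `missing_labels` should return an empty list for a complete response. Any custom heads you create would need their own entries.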