Text Moderation Explanations
A guide to our Text Moderation Explanations API
Overview
Text Moderation Explanations is a new feature that explains why a given text string was assigned a certain score by our Text Moderation model. The API takes three inputs: a text string, its class label, and the severity score it was assigned. The output is a text string that explains why the original input text was given that score relative to its class.
Supported Languages
We currently support the following languages for this feature:
- English
- Hindi
- Spanish
- Portuguese
- French
- Arabic
- Italian
- German
If you are unsure whether your language is supported, or would like to request an additional language, please reach out to our sales team ([email protected]).
Request Format
Below are the input fields for a Text Moderation Explanations request.
class
: The class that Text Moderation assigned to the original input text. The possible classes are: “sexual”, “bullying”, “hate”, “violence”.
severity
: The original score that Text Moderation assigned to the input text. It is an integer value ranging from 0 (benign) to 3 (most severe), inclusive.
text
: The input text string whose severity (relative to its class) you would like explained. The maximum length is 1024 characters.
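The limits described above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the API itself: the function name `validate_explanation_input` and the constants are hypothetical, and it assumes that the 1024-character limit is counted the same way Python's `len()` counts characters, which the documentation does not specify.

```python
# Hypothetical client-side checks mirroring the documented field limits.
ALLOWED_CLASSES = {"sexual", "bullying", "hate", "violence"}
MAX_TEXT_LENGTH = 1024  # characters, as counted by len() (an assumption)


def validate_explanation_input(text: str, class_name: str, severity: int) -> None:
    """Raise ValueError if any field falls outside the documented limits."""
    if class_name not in ALLOWED_CLASSES:
        raise ValueError(f"unsupported class: {class_name!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(severity, bool) or not isinstance(severity, int):
        raise ValueError("severity must be an integer")
    if not 0 <= severity <= 3:
        raise ValueError("severity must be between 0 and 3, inclusive")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
```

Calling `validate_explanation_input("You are the worst person I have ever met", "bullying", 2)` returns without raising, while an unknown class, a severity of 4, or a 1025-character string raises `ValueError`.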
Here is an example cURL request using this format:
curl -X POST "https://api.thehive.ai/api/v2/task/sync" \
-H "Authorization: Token <TOKEN>" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d 'text_data=You are the worst person I have ever met' \
-d 'options={"class": "bullying", "severity": "2"}'
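For readers who prefer Python, the same request can be sketched with the standard library alone. This is an illustrative translation of the cURL example, not an official client: the function name `build_explanation_request` is hypothetical, and the severity is serialized as a string only because the cURL example does so.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.thehive.ai/api/v2/task/sync"


def build_explanation_request(token, text, class_name, severity):
    """Build (but do not send) the form-encoded POST from the cURL example."""
    # The cURL example passes severity as a string inside the options JSON,
    # so this sketch mirrors that; whether a bare integer is also accepted
    # is not stated in this guide.
    options = json.dumps({"class": class_name, "severity": str(severity)})
    body = urllib.parse.urlencode({"text_data": text, "options": options})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Sending the request (requires a valid token and network access):
# req = build_explanation_request("<TOKEN>", "You are the worst person I have ever met", "bullying", 2)
# with urllib.request.urlopen(req) as resp:
#     payload = json.load(resp)
```

Building the request separately from sending it makes the encoded body easy to inspect before any network call is made.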
Response
Below are the output fields for a response.
text
: The output text string: an explanation of why the input text string received its designated severity.
For an annotated sample response, please refer to our API Reference page.