Hive’s Text Moderation response format is an instantiation of the general classification response with additional fields to support the pattern-matching algorithms and the optional splitting of larger text inputs into sentence chunks.

Pattern-matching algorithm response:
Pattern-matching results for profanity are returned in the text_filters array, and pattern-matching results for PII are returned in the pii_entities array. Each pattern match is returned as an object containing the matched substring in the value field, the start and end positions of the match in the start_index and end_index fields, respectively, and the kind of match (profanity, email, phone number, etc.) in the type field.
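As a minimal sketch, the pattern matches described above could be gathered from a parsed response as follows. The helper name collect_pattern_matches is hypothetical and not part of any Hive client; the status[0].response path and the text_filters and pii_entities keys are taken from the sample response on this page.

```python
from typing import Any


def collect_pattern_matches(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Gather profanity and PII matches from every entry in `status`.

    Hypothetical helper: key names follow the sample response on this page.
    """
    matches = []
    for entry in result.get("status", []):
        response = entry.get("response", {})
        for key in ("text_filters", "pii_entities"):
            for match in response.get(key, []):
                matches.append({
                    "source": key,  # which array the match came from
                    "type": match.get("type"),
                    "value": match.get("value"),
                    "start_index": match.get("start_index"),
                    "end_index": match.get("end_index"),
                })
    return matches


# Trimmed-down response built from values in the sample response.
sample = {
    "status": [{
        "response": {
            "text_filters": [
                {"value": "ASSHOLE", "start_index": 17,
                 "end_index": 24, "type": "profanity"},
            ],
            "pii_entities": [
                {"value": "617-768-2274", "start_index": 153,
                 "end_index": 165, "type": "U.S. Phone Number"},
            ],
        }
    }]
}

for m in collect_pattern_matches(sample):
    print(m["source"], m["type"], m["start_index"], m["end_index"])
```

Missing arrays are treated as empty, so the same function works when a response contains no matches.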

Deep Learning model classifications:
The classified language is returned in the language field. The moderated_classes field lists the classes that were moderated for this input, which depends on the classified language and on the model classes currently supported for that language. If the classified language is "UNSUPPORTED", the moderated_classes array will be empty. The output array contains the deep learning model results for each supported class.
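One way to consume these fields is sketched below. The function name max_scores_by_class is hypothetical, and restricting the result to names listed in moderated_classes is an assumption made for this example rather than documented behavior. It takes the highest score per class across all entries in output, since larger inputs may be split into several chunks.

```python
def max_scores_by_class(result):
    """Return {class_name: highest score across all output chunks}.

    Hypothetical helper: returns an empty dict when moderated_classes is
    empty, which is what an "UNSUPPORTED" language produces.
    """
    response = result["status"][0]["response"]
    moderated = response.get("moderated_classes", [])
    if not moderated:
        return {}
    scores = {}
    for chunk in response.get("output", []):
        for item in chunk.get("classes", []):
            name = item["class"]
            # Assumption: ignore classes that were not moderated.
            if name in moderated:
                scores[name] = max(scores.get(name, 0), item["score"])
    return scores


# Trimmed-down response built from values in the sample response.
sample_scores = {
    "status": [{
        "response": {
            "language": "EN",
            "moderated_classes": ["sexual", "hate", "violence",
                                  "bullying", "spam"],
            "output": [{
                "start_char_index": 0,
                "end_char_index": 258,
                "classes": [
                    {"class": "spam", "score": 3},
                    {"class": "sexual", "score": 0},
                    {"class": "hate", "score": 0},
                    {"class": "violence", "score": 0},
                    {"class": "bullying", "score": 3},
                ],
            }],
        }
    }]
}

print(max_scores_by_class(sample_scores))
```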

Note: We are aware of an issue where start_index and end_index may be offset or misaligned relative to the text input in some cases. This can occur if the text input is significantly distorted with non-alphabetic characters, if pattern matching occurs on a subword, or if many characters are repeated on both ends of the text input. We are working to optimize our solution to this issue.
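Until the indices are reliable in every case, a caller may want to verify them before using them, for example before redacting text. The sketch below is one defensive approach and not an official workaround. The function name locate_match is hypothetical, and it assumes that end_index is exclusive and that value may be upper-cased relative to the input, both inferred from the sample response on this page.

```python
def locate_match(text, match):
    """Return (start, end) for a pattern match, or None if it cannot be found.

    Hypothetical helper. Trusts the reported indices only when the slice
    they describe equals `value`, ignoring case. Otherwise falls back to
    the first case-insensitive occurrence of `value` in the text.
    Assumes lower-casing does not change string length (true for ASCII).
    """
    value = match["value"]
    start, end = match["start_index"], match["end_index"]
    if text[start:end].lower() == value.lower():
        return start, end
    found = text.lower().find(value.lower())
    if found == -1:
        # The input may be distorted so that `value` never appears literally.
        return None
    return found, found + len(value)


text = "Call 617-768-2274 now"
aligned = {"value": "617-768-2274", "start_index": 5, "end_index": 17}
shifted = {"value": "617-768-2274", "start_index": 7, "end_index": 19}

print(locate_match(text, aligned))
print(locate_match(text, shifted))
```

The fallback returns only the first occurrence, so a text that contains the same value more than once needs additional handling.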

{
    "id": "a192fe10-dadf-11ec-996c-4db104663d75",
    "code": 200,
    "project_id": 12345,
    "user_id": 12345,
    "created_on": "2022-05-23T21:30:56.698Z",
    "status": [
        {
            "status": {
                "code": "0",
                "message": "SUCCESS"
            },
            "response": {
                "input": {
                    "hash": "...",
                    "inference_client_version": "...",
                    "model": "...",
                    "model_type": "TEXT_CLASSIFICATION",
                    "model_version": 1,
                    "text": "...",
                    "id": "a192fe10-dadf-11ec-996c-4db104663d75",
                    "created_on": "2022-05-23T21:30:56.497Z",
                    "user_id": 12345,
                    "project_id": 12345,
                    "charge": 0.003
                },
                "custom_classes": [],
                "text_filters": [
                    {
                        "value": "ASSHOLE",
                        "start_index": 17,
                        "end_index": 24,
                        "type": "profanity"
                    }
                ],
                "pii_entities": [
                    {
                        "value": "[email protected]",
                        "start_index": 39,
                        "end_index": 63,
                        "type": "Email Address"
                    },
                    {
                        "value": "123 YERBA BUENA LN, SAN FRANCISCO, CA 94103",
                        "start_index": 82,
                        "end_index": 125,
                        "type": "U.S. Mailing Address"
                    },
                    {
                        "value": "617-768-2274",
                        "start_index": 153,
                        "end_index": 165,
                        "type": "U.S. Phone Number"
                    },
                    {
                        "value": "+91-92342-43234",
                        "start_index": 191,
                        "end_index": 206,
                        "type": "International Phone Number"
                    }
                ],
                "urls": [
                    {
                        "value": "thehive.ai/projects/99999/settings",
                        "base_domain": "thehive.ai",
                        "start_index": 216,
                        "end_index": 258
                    }
                ],
                "language": "EN",
                "moderated_classes": [
                    "sexual",
                    "hate",
                    "violence",
                    "bullying",
                    "spam"
                ],
                "output": [
                    {
                        "time": 0,
                        "start_char_index": 0,
                        "end_char_index": 258,
                        "classes": [
                            {
                                "class": "spam",
                                "score": 3
                            },
                            {
                                "class": "sexual",
                                "score": 0
                            },
                            {
                                "class": "hate",
                                "score": 0
                            },
                            {
                                "class": "violence",
                                "score": 0
                            },
                            {
                                "class": "bullying",
                                "score": 3
                            }
                        ]
                    }
                ]
            }
        }
    ],
    "from_cache": false
}

| Name | Description |
| --- | --- |
| classes | List of dictionaries of all output classes. Each dictionary contains the class name and the score. The scores range from 0 to 3 with 3 being the most severe. |
| class | Name of predicted class. |
| score | Score of predicted class. |
| start_char_index | First character processed. |
| end_char_index | Last character processed. |