Hive’s Text Moderation response format is an instance of the general classification response, extended with additional fields that support the pattern-matching algorithms and the optional splitting of larger text inputs into sentence chunks.

Pattern-matching algorithm response:
The pattern-matching algorithm results for profanity are returned in the text_filters array. Similarly, pattern-matching algorithm results for PII are returned in the pii_entities array. Each pattern match is returned as an object containing the matched substring in the value field, the start and end indices of the match in the start_index and end_index fields, respectively, and the kind of match (profanity, email, phone number, etc.) in the type field.
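As an illustration of how these match objects might be consumed, the following sketch merges the `text_filters` and `pii_entities` arrays (named as they appear in the sample response) and masks each matched span in the original text. The helper names `collect_pattern_matches` and `redact` are hypothetical, and the code assumes `end_index` is exclusive, which is consistent with the sample response (for example, a 7-character match spanning 16 to 23) but is not stated explicitly.

```python
from typing import Any, Dict, List


def collect_pattern_matches(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Merge profanity and PII matches into one list sorted by start_index."""
    matches = []
    for key in ("text_filters", "pii_entities"):
        for match in response.get(key, []):
            matches.append({
                "source": key,
                "type": match["type"],
                "value": match["value"],
                "start_index": match["start_index"],
                "end_index": match["end_index"],
            })
    return sorted(matches, key=lambda m: m["start_index"])


def redact(text: str, matches: List[Dict[str, Any]], mask: str = "*") -> str:
    """Replace every matched span with mask characters.

    Assumption: end_index is exclusive, as the sample response suggests.
    """
    chars = list(text)
    for m in matches:
        for i in range(m["start_index"], min(m["end_index"], len(chars))):
            chars[i] = mask
    return "".join(chars)


# Hypothetical input text and a trimmed-down response for illustration.
text = "Hello JERK, call 617-768-2274 now"
response = {
    "text_filters": [
        {"value": "JERK", "start_index": 6, "end_index": 10, "type": "profanity"},
    ],
    "pii_entities": [
        {"value": "617-768-2274", "start_index": 17, "end_index": 29,
         "type": "U.S. Phone Number"},
    ],
}

matches = collect_pattern_matches(response)
redacted = redact(text, matches)
```

Masking character by character keeps the redacted text the same length as the input, so the indices of any remaining matches stay valid.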

Deep Learning model classifications:
The classified language is returned in the language field. Based on the classified language, and depending on the currently supported text moderation model classes, the moderated_classes field indicates which classes have been moderated. If the classified language is "UNSUPPORTED", the moderated_classes array will be empty. The output array contains the deep learning model results for each supported class.
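One possible way to read these fields together is sketched below: it returns no scores when the language is "UNSUPPORTED" and otherwise keeps only the scores for classes listed in `moderated_classes`. The function name `moderated_scores` is hypothetical, and filtering `output` by `moderated_classes` is an assumption about how the two fields relate, not documented behavior.

```python
from typing import Any, Dict, List


def moderated_scores(response: Dict[str, Any]) -> List[Dict[str, int]]:
    """Return one {class: score} dict per output chunk.

    Assumption: only classes listed in moderated_classes carry a
    meaningful score for the classified language.
    """
    moderated = set(response.get("moderated_classes", []))
    if response.get("language") == "UNSUPPORTED" or not moderated:
        return []
    results = []
    for chunk in response.get("output", []):
        scores = {
            c["class"]: c["score"]
            for c in chunk.get("classes", [])
            if c["class"] in moderated
        }
        results.append(scores)
    return results


# Trimmed-down response for illustration.
example = {
    "language": "EN",
    "moderated_classes": ["spam", "sexual"],
    "output": [
        {"classes": [
            {"class": "spam", "score": 3},
            {"class": "sexual", "score": 2},
            {"class": "promotions", "score": 0},
        ]},
    ],
}

scores = moderated_scores(example)
unsupported = moderated_scores(
    {"language": "UNSUPPORTED", "moderated_classes": [], "output": []}
)
```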

Note: We are aware of an issue where start_index and end_index may be offset or misaligned relative to the text input in some cases. This can occur if the text input is significantly distorted with non-alphabetic characters, if pattern matching occurs on a subword, or if many characters are repeated on both ends of the text input. We are working to optimize our solution to this issue.
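Until the offsets are reliable, a client may want to check each match against the input text before acting on it. The sketch below is one hedged approach, not an official workaround: it first tests the reported span, then searches a small window around `start_index` for the matched value. The name `locate_match` and the `window` parameter are hypothetical. The comparison is case-insensitive because the sample response returns values in upper case. If the input was distorted with extra characters, the value may not appear verbatim and the function returns `None`.

```python
from typing import Any, Dict, Optional, Tuple


def locate_match(
    text: str, match: Dict[str, Any], window: int = 20
) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span where match['value'] actually occurs.

    The reported span is tried first; otherwise positions within `window`
    characters of start_index are tried, nearest first. Returns None when
    the value cannot be found verbatim.
    """
    value = match["value"]
    target = value.lower()
    start, end = match["start_index"], match["end_index"]

    if text[start:end].lower() == target:
        return start, end

    first = max(0, start - window)
    last = min(len(text) - len(value), start + window)
    # Try candidate positions closest to the reported start first.
    for i in sorted(range(first, last + 1), key=lambda pos: abs(pos - start)):
        if text[i:i + len(value)].lower() == target:
            return i, i + len(value)
    return None


# Hypothetical example: the reported indices are shifted by two characters.
sample_text = "xx jerk yy"
shifted = {"value": "JERK", "start_index": 5, "end_index": 9}
corrected = locate_match(sample_text, shifted)
```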

{
    "task_id": "f4c65ce0-e849-11ec-acb5-f57b28c51f09",
    "created_on": "2022-06-09T23:14:47.854Z",
    "moderated_on": "2022-06-09T23:14:51.899Z",
    "moderated_by": "classifier",
    "task_units": 1,
    "charge": 0.003,
    "state": "finished",
    "history": [
        {
            "state": "created",
            "event_on": "2022-06-09T23:14:47.854Z"
        },
        {
            "state": "downloaded",
            "event_on": "2022-06-09T23:14:48.311Z"
        },
        {
            "state": "finished",
            "event_on": "2022-06-09T23:14:51.899Z"
        }
    ],
    "status": [
        {
            "status": {
                "code": "0",
                "message": "SUCCESS"
            },
            "response": {
                "input": {
                    "hash": "5359e0232d6353cb2959441af394caf3",
                    "inference_client_version": "5.3.7",
                    "model": "...",
                    "model_type": "TEXT_CLASSIFICATION",
                    "model_version": 1,
                    "text": "...",
                    "id": "f4c65ce0-e849-11ec-acb5-f57b28c51f09",
                    "created_on": "2022-06-09T23:14:47.854Z",
                    "user_id": 123,
                    "project_id": 123,
                    "charge": 0.003
                },
                "custom_classes": [
                    {
                        "value": "XXX",
                        "start_index": 6,
                        "end_index": 9,
                        "class": "test_custom_classes"
                    }
                ],
                "text_filters": [
                    {
                        "value": "ASSHOLE",
                        "start_index": 16,
                        "end_index": 23,
                        "type": "profanity"
                    },
                    {
                        "value": "PORN",
                        "start_index": 276,
                        "end_index": 280,
                        "type": "profanity"
                    },
                    {
                        "value": "NIGGER",
                        "start_index": 301,
                        "end_index": 307,
                        "type": "profanity"
                    }
                ],
                "pii_entities": [
                    {
                        "value": "[email protected]",
                        "start_index": 38,
                        "end_index": 62,
                        "type": "Email Address"
                    },
                    {
                        "value": "123 YERBA BUENA LN, SAN FRANCISCO, CA 94103",
                        "start_index": 81,
                        "end_index": 124,
                        "type": "U.S. Mailing Address"
                    },
                    {
                        "value": "617-768-2274",
                        "start_index": 152,
                        "end_index": 164,
                        "type": "U.S. Phone Number"
                    },
                    {
                        "value": "+91-92342-43234",
                        "start_index": 190,
                        "end_index": 205,
                        "type": "International Phone Number"
                    }
                ],
                "urls": [
                    {
                        "value": "thehive.ai/projects/99999/settings",
                        "base_domain": "thehive.ai",
                        "start_index": 215,
                        "end_index": 257
                    }
                ],
                "language": "EN",
                "moderated_classes": [
                    "sexual",
                    "hate",
                    "violence",
                    "bullying",
                    "spam",
                    "gibberish",
                    "child_exploitation",
                    "phone_number"
                ],
                "output": [
                    {
                        "time": 0,
                        "start_char_index": 0,
                        "end_char_index": 307,
                        "classes": [
                            {
                                "class": "spam",
                                "score": 3
                            },
                            {
                                "class": "sexual",
                                "score": 2
                            },
                            {
                                "class": "hate",
                                "score": 1
                            },
                            {
                                "class": "violence",
                                "score": 0
                            },
                            {
                                "class": "bullying",
                                "score": 3
                            },
                            {
                                "class": "promotions",
                                "score": 0
                            },
                            {
                                "class": "gibberish",
                                "score": 0
                            },
                            {
                                "class": "child_exploitation",
                                "score": 3
                            },
                            {
                                "class": "phone_number",
                                "score": 3
                            }
                        ]
                    }
                ]
            }
        }
    ],
    "hash": "c853276aa2face551aee21e203063bf882312ec3f3cb18ad40d5fd88d29f83ed",
    "callback_metadata": null,
    "internal_metadata": null,
    "callback_url": null,
    "callback_status_code": null,
    "callback_status_message": null,
    "callback_retry_count": null,
    "original_filename": null,
    "original_url": null,
    "label_data": null,
    "task_description_translations": null,
    "text_data": "...",
    "task_type": "real",
    "focus": null,
    "static_objects": null,
    "status_to_edit": null,
    "media_objects": [],
    "extra_media_objects": null,
    "media_url": null,
    "media_type": null,
    "media_deleted": false,
    "result_deleted": false,
    "invalidated_previous_task_id": null,
    "invalidated_next_task_id": null,
    "task_tags": null,
    "project_override": null,
    "subtask_data": null,
    "task_options": null,
    "error": null,
    "error_code": null,
    "pipeline_data": {
        "pipeline_task": null,
        "component_tasks": null
    },
    "confirmation_data": {
        "pa_task": null,
        "confirmation_tasks": null
    }
}
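The moderation results in the sample above are nested under `status[i].response`. A minimal sketch for pulling them out of a parsed task object follows. The function name `extract_responses` is hypothetical, and treating code "0" as the only success value is an assumption based on the sample, where it is paired with the message "SUCCESS".

```python
from typing import Any, Dict, List


def extract_responses(task: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the response object of every successful status entry.

    Assumption: a status code of "0" (a string in the sample) means success.
    """
    responses = []
    for entry in task.get("status", []):
        code = entry.get("status", {}).get("code")
        response = entry.get("response")
        if code == "0" and response is not None:
            responses.append(response)
    return responses


# Trimmed-down task object; the failing entry and its code are hypothetical.
task = {
    "state": "finished",
    "status": [
        {"status": {"code": "0", "message": "SUCCESS"},
         "response": {"language": "EN", "output": []}},
        {"status": {"code": "1", "message": "hypothetical failure"}},
    ],
}

responses = extract_responses(task)
```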

| Name | Description |
| --- | --- |
| classes | List of dictionaries of all output classes. Each dictionary contains the class name and the score. The scores range from 0 to 3, with 3 being the most severe. |
| class | Name of predicted class. |
| score | Score of predicted class. |
| start_char_index | First character processed. |
| end_char_index | Last character processed. |
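As a sketch of how the fields above could drive a moderation decision, the following code reports each chunk that has at least one class scoring at or above a severity threshold. The name `flag_chunks` and the default threshold of 2 are hypothetical choices; an appropriate threshold depends on the application. The character indices are passed through unchanged, since whether `end_char_index` is inclusive is not stated here.

```python
from typing import Any, Dict, List


def flag_chunks(
    output: List[Dict[str, Any]], threshold: int = 2
) -> List[Dict[str, Any]]:
    """Return chunks with at least one class score >= threshold.

    Scores range from 0 to 3, with 3 being the most severe.
    """
    flagged = []
    for chunk in output:
        hits = {
            c["class"]: c["score"]
            for c in chunk["classes"]
            if c["score"] >= threshold
        }
        if hits:
            flagged.append({
                "start_char_index": chunk["start_char_index"],
                "end_char_index": chunk["end_char_index"],
                "classes": hits,
            })
    return flagged


# Output array trimmed from the sample response.
sample_output = [
    {
        "time": 0,
        "start_char_index": 0,
        "end_char_index": 307,
        "classes": [
            {"class": "spam", "score": 3},
            {"class": "sexual", "score": 2},
            {"class": "hate", "score": 1},
            {"class": "violence", "score": 0},
        ],
    },
]

flagged = flag_chunks(sample_output)
most_severe_only = flag_chunks(sample_output, threshold=3)
```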