Hive’s speech moderation model outputs a transcript, followed by a set of classifications, timestamps, and character indexes for each sentence in the transcript.

{
    "id": "5ce602f0-5b07-11ed-80f8-138a369a2201",
    "code": 200,
    "project_id": 41565,
    "user_id": 3121654,
    "created_on": "2022-11-02T23:37:53.725Z",
    "status": [
        {
            "status": {
                "code": "0",
                "message": "SUCCESS"
            },
            "response": {
                "input": {
                    "id": "5ce602f0-5b07-11ed-80f8-138a369a2201",
                    "created_on": "2022-11-02T23:37:49.983Z",
                    "user_id": 3121654,
                    "project_id": 41565,
                    "charge": 0.10200000000000001,
                    "model": "multilingual_v1",
                    "model_version": 2,
                    "model_type": "TRANSCRIPTION",
                    "hash": "0f2701494d8d099e32076b97602af706",
                    "media": {
                        "url": null,
                        "filename": "Building.m4a",
                        "type": "AUDIO",
                        "mime_type": "m4a",
                        "mimetype": "audio/m4a",
                        "duration": 33.144479
                    }
                },
                "custom_classes": [],
                "text_filters": [],
                "pii_entities": [],
                "language": "EN",
                "moderated_classes": [
                    "sexual",
                    "violence",
                    "hate",
                    "bullying"
                ],
                "output": [
                    {
                        "transcript": "And so the Woy Thing bot is that, like, I think, I think the Alpha logic looks like you have, like, each time sap ple have a series of for it. Okay. And this is if you have a major. Right, right. And you speak the into a into a language model athen. No, do put that into a actually. Okay, I say, Yeah, because I was wondering why, like, you know, Tet times, for example, like to be kind of the biggest thing, you know, B er, like.",
                        "classifications": [
                            {
                                "classes": [
                                    {
                                        "class": "sexual",
                                        "score": 0
                                    },
                                    {
                                        "class": "hate",
                                        "score": 0
                                    },
                                    {
                                        "class": "violence",
                                        "score": 0
                                    },
                                    {
                                        "class": "bullying",
                                        "score": 0
                                    }
                                ],
                                "text": "And so the Woy Thing bot is that, like, I think, I think the Alpha logic looks like you have, like, each time sap ple have a series of for it.",
                                "custom_classes": [],
                                "text_filters": [],
                                "pii_entities": [],
                                "start_timestamp": 1.52,
                                "end_timestamp": 8.42,
                                "start_char_index": 0,
                                "end_char_index": 142
                            },
                            {
                                "classes": [
                                    {
                                        "class": "sexual",
                                        "score": 0
                                    },
                                    {
                                        "class": "hate",
                                        "score": 0
                                    },
                                    {
                                        "class": "violence",
                                        "score": 0
                                    },
                                    {
                                        "class": "bullying",
                                        "score": 0
                                    }
                                ],
                                "text": "Okay, I say, Yeah, because I was wondering why, like, you know, Tet times, for example, like to be kind of the biggest thing, you know, B er, like.",
                                "custom_classes": [],
                                "text_filters": [],
                                "pii_entities": [],
                                "start_timestamp": 18.02,
                                "end_timestamp": 27.819999999999997,
                                "start_char_index": 283,
                                "end_char_index": 430
                            }
                        ],
                        "words": [
                            {
                                "time": 1.52,
                                "alternatives": [
                                    {
                                        "text": "And",
                                        "score": 0.23571264642017406
                                    }
                                ],
                                "type": "pronunciation",
                                "meta": {}
                            },
                            {
                                "time": 27.82,
                                "alternatives": [
                                    {
                                        "text": ".",
                                        "score": 0.7616554849091626
                                    }
                                ],
                                "type": "punctuation",
                                "meta": {}
                            }
                        ]
                    }
                ]
            }
        }
    ],
    "from_cache": false
}
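As a minimal sketch of how a client might walk the response above, the following collects every sentence for which at least one moderated class reaches a chosen score. The function name `flagged_sentences`, the `threshold` default, and the trimmed `example_response` (including its non-zero score) are illustrative assumptions, not part of Hive's API; only the key names are taken from the sample.

```python
from typing import Any


def flagged_sentences(response: dict[str, Any], threshold: int = 1) -> list[dict[str, Any]]:
    """Collect sentences where any class score is >= threshold.

    `threshold` is a hypothetical cutoff chosen by the caller.
    """
    flagged = []
    # The top-level "status" key holds a list of entries, each with a "response".
    for entry in response.get("status", []):
        for output in entry.get("response", {}).get("output", []):
            for sentence in output.get("classifications", []):
                hits = {
                    c["class"]: c["score"]
                    for c in sentence.get("classes", [])
                    if c["score"] >= threshold
                }
                if hits:
                    flagged.append({
                        "text": sentence["text"],
                        "start": sentence["start_timestamp"],
                        "end": sentence["end_timestamp"],
                        "classes": hits,
                    })
    return flagged


# Hypothetical, trimmed response in the same shape as the sample (made-up values).
example_response = {
    "status": [{
        "status": {"code": "0", "message": "SUCCESS"},
        "response": {"output": [{
            "transcript": "First sentence. Second sentence.",
            "classifications": [
                {
                    "classes": [{"class": "hate", "score": 0}, {"class": "bullying", "score": 0}],
                    "text": "First sentence.",
                    "start_timestamp": 0.5,
                    "end_timestamp": 1.5,
                },
                {
                    "classes": [{"class": "hate", "score": 0}, {"class": "bullying", "score": 2}],
                    "text": "Second sentence.",
                    "start_timestamp": 2.0,
                    "end_timestamp": 3.25,
                },
            ],
        }]},
    }]
}

result = flagged_sentences(example_response)
```

Here `result` holds only the second sentence, with its timestamps and the classes that met the threshold. Using `.get` with empty defaults keeps the walk from raising if an entry carries no `response` or `output`.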
| Name | Description |
| --- | --- |
| transcript | Transcript of the entire video or audio clip. |
| words[j].time | Timestamp, in seconds, of each predicted word or punctuation mark in the transcript. |
| words[j].type | pronunciation: if the predicted character string is a word. punctuation: if the predicted character string is a punctuation mark. |
| words[j].alternatives[k].text | Predicted character string at that timestamp. |
| words[j].alternatives[k].score | Confidence score for the predicted character string. |
| words[j].alternatives | List of alternative word predictions at each timestamp. |
| classifications[i].classes | List of dictionaries, one per output class. Each dictionary contains the class name and its score. Scores range from 0 to 3, with 3 being the most severe. |
| classifications[i].classes[k].class | Name of the predicted class. |
| classifications[i].classes[k].score | Score of the predicted class. |
| classifications[i].start_char_index | Index in the transcript of the first character of the sentence. |
| classifications[i].end_char_index | Index in the transcript where the sentence ends. In the sample above, this is one past the sentence's last character (0 to 142 for a 142-character sentence). |
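The character indexes and word-level scores described above can be used together, as in the sketch below. It slices the transcript with each sentence's indexes and lists spoken words whose best alternative has a low confidence score. It treats `end_char_index` as exclusive, which matches the sample response but is an assumption about the API in general; the function names, the `min_score` cutoff, and the shortened `example_output` are hypothetical.

```python
from typing import Any


def sentence_slices(output: dict[str, Any]) -> list[str]:
    """Slice the transcript using each sentence's character indexes.

    Assumes end_char_index is exclusive, as it appears to be in the sample.
    """
    transcript = output["transcript"]
    return [
        transcript[s["start_char_index"]:s["end_char_index"]]
        for s in output.get("classifications", [])
    ]


def low_confidence_words(
    output: dict[str, Any], min_score: float = 0.5
) -> list[tuple[float, str, float]]:
    """Return (time, text, score) for words whose best alternative scores below min_score."""
    results = []
    for word in output.get("words", []):
        # Skip punctuation entries and entries without any alternative.
        if word.get("type") != "pronunciation" or not word.get("alternatives"):
            continue
        best = max(word["alternatives"], key=lambda alt: alt["score"])
        if best["score"] < min_score:
            results.append((word["time"], best["text"], best["score"]))
    return results


# Hypothetical, shortened output object in the same shape as one "output" element.
example_output = {
    "transcript": "Hello there. How are you?",
    "classifications": [
        {"text": "Hello there.", "start_char_index": 0, "end_char_index": 12},
        {"text": "How are you?", "start_char_index": 13, "end_char_index": 25},
    ],
    "words": [
        {"time": 0.4, "alternatives": [{"text": "Hello", "score": 0.92}],
         "type": "pronunciation", "meta": {}},
        {"time": 0.9, "alternatives": [{"text": "there", "score": 0.31}],
         "type": "pronunciation", "meta": {}},
        {"time": 1.1, "alternatives": [{"text": ".", "score": 0.76}],
         "type": "punctuation", "meta": {}},
    ],
}

slices = sentence_slices(example_output)
uncertain = low_confidence_words(example_output)
```

Comparing each slice against the sentence's own `text` field is a quick way to confirm the index convention before relying on it for highlighting or redaction.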