Speech Generation


Our Speech Generation API creates audio clips based on text prompts. Each text prompt can be at most 4096 characters long, and the model produces one clip per prompt. The clip consists of a human-like voice reading the text in the prompt.
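Because prompts are capped at 4096 characters, it can be convenient to check the length on the client before submitting a task. The following is a minimal sketch, not part of the API: the helper name prompt_within_limit and the MAX_PROMPT_CHARS variable are hypothetical, and how the API responds to an over-length prompt is not covered here.

```shell
# Hypothetical client-side helper (not part of the API): checks a prompt
# against the documented 4096-character limit before a request is sent.
MAX_PROMPT_CHARS=4096

prompt_within_limit() {
  # ${#1} is the length of the first argument; in a multibyte locale the
  # shell counts characters, while in the C locale it counts bytes.
  [ "${#1}" -le "$MAX_PROMPT_CHARS" ]
}
```

A caller might use it as `prompt_within_limit "$text" || echo "prompt too long" >&2` and skip the request when the check fails.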

Request Format

The input fields for this API are as follows:

text_data: The text that will be turned into speech.
callback_url: When the task is completed, our servers send a callback to this URL.

A complete request to our Speech Generation API has the following format:

curl --request POST \
  --url https://api.thehive.ai/api/v1/task/async \
  --header 'Content-Type: application/json' \
  --header 'authorization: Token <API_KEY>' \
  --data '{"text_data": "this is a test", "callback_url": "<YOUR_URL>"}'
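When the prompt text comes from a variable rather than a literal, the JSON body has to be assembled with care so that quotes in the text do not break it. The sketch below shows one way to do that around the request above. The function names json_escape, build_payload, and submit_speech_task and the HIVE_API_KEY environment variable are assumptions made for illustration, not names defined by the API. The escaping covers only backslashes and double quotes; text containing newlines or other control characters would need a real JSON encoder.

```shell
# Hypothetical helpers (not part of the API) for assembling the request body.

# Escape backslashes and double quotes so the value can sit inside a JSON string.
# This does NOT handle newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the JSON body from the prompt text ($1) and the callback URL ($2).
build_payload() {
  printf '{"text_data": "%s", "callback_url": "%s"}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Submit a task using the same endpoint and headers as the request above.
# HIVE_API_KEY is an assumed environment variable holding your API key.
submit_speech_task() {
  curl --request POST \
    --url https://api.thehive.ai/api/v1/task/async \
    --header 'Content-Type: application/json' \
    --header "authorization: Token ${HIVE_API_KEY}" \
    --data "$(build_payload "$1" "$2")"
}
```

With HIVE_API_KEY exported, a call such as `submit_speech_task 'this is a test' '<YOUR_URL>'` would send the same request as the curl command above.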


The output of our Speech Generation API consists of a link to the generated audio clip, which is in WAV format. To see an annotated example of an API response object for this model, you can visit our API Reference.