Hive’s Text Generation API, offered through our AutoML Large Language Model, produces text in response to a given text prompt. The response to each prompt can be up to 4096 tokens (about 16k characters). The model produces writing across many genres and formats for a wide variety of use cases, including answering questions, writing stories, holding conversations, and programming in multiple programming languages.

A sample JSON request is shown below:

{
  "text_data": "Cossacks came from what backgrounds?", // Current prompt
  "options": {
    "system_prompt": "Please provide formal responses", //Optional
    "roles": { //Optional
      "user": "Bob",
      "model": "HPT"
    },
    "prompt_history": 
      {
        "content": "this is the first question", 
        "role": "HPT" 
      },
      {
        "content": "this is the first answer",
        "role": "Bob" // the role here must be one of the strings specified in "roles" obj
      },
      // ...
      {
        "content": "this is the newest question",
        "role": "HPT"
      },
      {
        "content": "this is the newest answer",
        "role": "Bob"
      }
    ],
    "max_tokens": 125, // optional input 
    "temperature": 0.8, // optional input 
    "top_p": 0.95 // optional input 
  }
}
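
As a rough sketch of how this payload might be submitted, the Python snippet below posts it with the requests library. The endpoint URL and authorization header here are placeholders, not confirmed values; see the API Reference for the actual URL, authentication scheme, and your project's API key.

import requests

# Placeholder endpoint and key for illustration only; consult the API
# Reference for the real URL and authentication header.
API_URL = "https://api.example.com/text-generation"  # hypothetical URL
API_KEY = "YOUR_API_KEY"

payload = {
    "text_data": "Cossacks came from what backgrounds?",
    "options": {
        "system_prompt": "Please provide formal responses",
        "roles": {"user": "Bob", "model": "HPT"},
        "prompt_history": [
            {"content": "this is the first question", "role": "Bob"},
            {"content": "this is the first answer", "role": "HPT"},
        ],
        "max_tokens": 125,
        "temperature": 0.8,
        "top_p": 0.95,
    },
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Token {API_KEY}"},  # assumed header format
    json=payload,
)
response.raise_for_status()
print(response.json())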

Option input definitions:

  • system_prompt: This is a string value. It provides context for how the model should respond to all requests, for example "You are a helpful assistant to...".
  • roles:
    • user: This is a string value. It is the name you want the user to act as in the conversation.
    • model: This is a string value. It is the name you want the model to act as in the conversation.
  • prompt_history: This is the conversation history that you can optionally pass in for a given inference, ordered from oldest to newest. Each turn's "role" must be one of the strings specified in the "roles" object; a small helper for maintaining this array is sketched after this list.
  • max_tokens: This is an integer value between 0 and 2048. It is the maximum number of tokens you want in the model's completion output. We will default to 125 tokens.
  • temperature: This is a float value between 0 and 1. At 0, the same request should yield the same response; at 1, the same request should yield different responses. We will default to 0.8.
  • top_p: This is a float value between 0 and 1. Lower values restrict sampling to a smaller set of high-probability tokens, so responses are less diverse than at higher values. We will default to 0.95.
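
Because prompt_history must stay ordered oldest to newest and each turn's role must match the roles object, it can help to wrap that bookkeeping in a small helper. The sketch below is illustrative only; the Conversation class and its method names are our own, not part of the API. Only the payload fields mirror the request format documented above.

# Sketch of a conversation wrapper that accumulates prompt_history.
class Conversation:
    def __init__(self, user_name="Bob", model_name="HPT",
                 system_prompt="Please provide formal responses"):
        self.roles = {"user": user_name, "model": model_name}
        self.system_prompt = system_prompt
        self.history = []  # oldest-to-newest list of turns

    def add_turn(self, content, role):
        # Each turn's role must be one of the strings defined in "roles".
        assert role in self.roles.values(), "unknown role"
        self.history.append({"content": content, "role": role})

    def build_request(self, prompt, max_tokens=125, temperature=0.8, top_p=0.95):
        # Assemble a request payload in the format documented above.
        return {
            "text_data": prompt,
            "options": {
                "system_prompt": self.system_prompt,
                "roles": self.roles,
                "prompt_history": list(self.history),
                "max_tokens": max_tokens,
                "temperature": temperature,
                "top_p": top_p,
            },
        }

# Usage: record each exchange, then build the next request.
conv = Conversation()
conv.add_turn("this is the first question", "Bob")
conv.add_turn("this is the first answer", "HPT")
request_body = conv.build_request("Cossacks came from what backgrounds?")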

For instructions on how to submit a task via API, either synchronously or asynchronously, see our API Reference documentation.