Image Generation Models
A guide to our image generation models
Note: The information on this page applies to customers on our deprecated v2 APIs for image generation. Customers on our v3 APIs should refer to the updated v3 docs for image generation.
Overview
Our image generation models create images from text prompts. Each prompt can generate up to six images. The response to an API request contains a link to each generated image along with that image's dimensions.
Before generating any images, we run the prompt through our text moderation model. If the prompt is flagged, no images are generated and an error message is returned. If the prompt is not flagged, we run the generated images through our visual moderation model to prevent violent, sexually explicit, or otherwise harmful results. Images are not returned if they are flagged for any of the following moderation categories: NSFW, nudity, or blood. If any generated image for a given prompt is flagged, the API call fails and an error message is returned in place of a normal response.
Models
We have four image generation models available, with additional models to be served in the near future. The current lineup is SDXL (Stable Diffusion XL), SDXL Enhanced, Flux Schnell, and Flux Schnell Enhanced. SDXL Enhanced and Flux Schnell Enhanced are Hive’s enhanced versions of the corresponding base models, served exclusively to our customers.
Here are the differences between our current model offerings:
Model | Description |
---|---|
SDXL (Stable Diffusion XL) | Latent diffusion text-to-image generation model produced by Stability AI. Trained on a larger dataset than the base Stable Diffusion model, with a larger UNet that enables better generation. |
SDXL Enhanced | Hive’s enhanced version of SDXL, served exclusively to our customers. Tailored toward a photorealistic and refined art style with extreme detail. |
Flux Schnell | The fastest model in the Flux suite of text-to-image models, capable of generating images in four or fewer steps. Best suited for local development and personal use. |
Flux Schnell Enhanced | Hive’s enhanced version of Flux Schnell, trained on our proprietary data and served exclusively to our customers. Retains the base model’s speed and efficiency. Generates images across a wide range of artistic styles with a specialization in photorealism, which led to high customer satisfaction in our past user studies. |
Request Format
Below are the input fields for an image generation request. An asterisk (*) next to a field indicates that it is required.
text_data*: The text prompt the model uses for image generation.
neg_text: Text prompt where the user can detail aspects that should not be included in the generated image.
num_images: The number of images to generate per request. The default value is 2, with a range of 1 to 6, inclusive.
callback_url: When the task is completed, we will send a callback from our servers to this callback URL.
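The field rules above can be checked on the client before a request is sent. The following is a minimal sketch, not an official SDK: the helper name `build_payload` is hypothetical, the nesting of `neg_text` and `num_images` under `options` mirrors the example request on this page, and leaving `num_images` out of the body so that the server applies its default of 2 is an assumption.

```python
from typing import Any, Dict, Optional


def build_payload(
    text_data: str,
    neg_text: Optional[str] = None,
    num_images: Optional[int] = None,
    callback_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body for the v2 image generation endpoint.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    # text_data is the only required field.
    if not text_data or not text_data.strip():
        raise ValueError("text_data is required")
    # num_images must fall in the documented range of 1 to 6, inclusive.
    if num_images is not None and not (1 <= num_images <= 6):
        raise ValueError("num_images must be between 1 and 6, inclusive")

    # Optional generation settings are grouped under "options",
    # following the layout of the example request on this page.
    options: Dict[str, Any] = {}
    if neg_text is not None:
        options["neg_text"] = neg_text
    if num_images is not None:
        options["num_images"] = num_images

    payload: Dict[str, Any] = {"text_data": text_data}
    if options:
        payload["options"] = options
    if callback_url is not None:
        payload["callback_url"] = callback_url
    return payload
```

Calling `build_payload("modern architecture house", neg_text="grass, pool", num_images=3)` produces a body with the same `text_data` and `options` values as the example request, while `num_images=7` raises a `ValueError` before any network call is made.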
Here is an example of a cURL request using this format:
curl --location --request POST 'https://api.thehive.ai/api/v2/task/async' \
--header 'authorization: Token <YOUR_TOKEN>' \
--header 'Content-Type: application/json' \
--data-raw '{
"options": {
"neg_text": "grass, pool",
"num_images": 3
},
"text_data": "modern architecture house",
"callback_url": "example_url"
}'
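For readers who prefer Python to cURL, the same request could be prepared with the standard library as sketched below. The endpoint, headers, and body are copied from the cURL example; the function name `make_request` is hypothetical, and the actual send is left commented out so the sketch has no side effects.

```python
import json
import urllib.request

# Endpoint taken from the cURL example on this page.
API_URL = "https://api.thehive.ai/api/v2/task/async"


def make_request(token: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request from the cURL example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "options": {"neg_text": "grass, pool", "num_images": 3},
    "text_data": "modern architecture house",
    "callback_url": "example_url",
}
request = make_request("<YOUR_TOKEN>", payload)

# To actually submit the task, uncomment the lines below:
# with urllib.request.urlopen(request, timeout=30) as resp:
#     print(json.load(resp))
```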
Response
After you submit an image generation request, you will receive a response containing links to the generated images. For an example API response for this model, see our API reference page.
Content Moderation
Content moderation is enabled by default on all of our image generation models. Each request passes through two moderation filters. First, the prompt is run through our text moderation model. If the prompt is flagged, image generation does not occur. If the prompt is not flagged, the generated images are run through our visual moderation model.
If the prompt or any generated image does not pass text or visual moderation, the task is not charged and the following message is returned:
{ "return_code": 451, "message": "Images did not pass moderation filters." }
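Client code may want to tell a moderation failure apart from other errors. The sketch below assumes the error arrives as a JSON body with the `return_code` and `message` keys shown above; the helper name `is_moderation_failure` is hypothetical, and this page does not state whether the error is delivered in the direct response or through the callback.

```python
import json

# Return code taken from the moderation error message on this page.
MODERATION_RETURN_CODE = 451


def is_moderation_failure(body: dict) -> bool:
    """Return True if a parsed response body is the moderation error."""
    return body.get("return_code") == MODERATION_RETURN_CODE


# Sample body copied from the error message above.
body = json.loads(
    '{"return_code": 451, "message": "Images did not pass moderation filters."}'
)
if is_moderation_failure(body):
    # The task is not charged in this case, so a client could safely
    # prompt the user to revise the text and try again.
    print("Blocked by moderation:", body["message"])
```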