Image and Video Detection

Hive's AI-Generated Image and Video Detection comprises two different APIs: one for detecting images generated by AI engines such as Midjourney, DALL-E, or Firefly, and one for detecting deepfakes, or images in which AI has been used to map one person's face onto another's. Both models are documented on this page, though they detect different kinds of content: one identifies the presence of AI-generated imagery more broadly, while the other flags faces that have been swapped or otherwise altered.

AI-Generated Image and Video Detection

Hive's AI-Generated Image and Video Detection API takes an input image and determines whether or not the input is entirely AI-generated. The model was trained on a large dataset comprising millions of artificially generated images and human-created images such as photographs, digital and traditional art, and memes sourced from across the web. The response reports not only whether an image was classified as AI-generated, but also which image synthesis model most likely created it. Each classification comes with a confidence score for easy interpretation of results. This API helps customers protect themselves from the potential misuse of AI-generated and synthetic content. For example, it can be used to flag AI-generated content for removal on social platforms, or to prevent fraud in the insurance claims process by identifying evidence with AI-generated augmentations.

Response

The AI-Generated Image and Video Detection API has two heads:

Generation classification: ai_generated, not_ai_generated
Source classification: dalle, midjourney, stablediffusion, gan, bingimagecreator, adobefirefly, kandinsky, stablediffusionxl, stablediffusioninpaint, sdxlinpaint, lcm, pixart, glide, imagen, amused, stablecascade, deepfloyd, vqdiffusion, wuerstchen, titan, sora, pika, harper, ideogram, kling, luma, hedra, flux, hailuo, mochi, other_image_generators (image generator other than those that have been listed), inconclusive, inconclusive_video (no video source identified), or none (media is not AI-generated)

The confidence scores for each model head sum to 1.

The first head gives a binary classification for all images, identifying whether or not they were AI-generated, along with the accompanying confidence score. The second head provides further details as to the image's source, with support for the most popular AI art generators currently in use. If the model cannot identify a source, it will return none under the source head.

To see an annotated example of an API response object for this model, you can visit our API reference page.
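As a rough illustration of reading the two heads, the sketch below assumes the scores have already been extracted from the response into a flat list of {"class", "score"} entries. That shape, the helper names split_heads and top_class, and the sample scores are assumptions for illustration, not the documented response schema; the API reference remains the authority on the real object. The sketch separates the entries by head using the generation class names listed above, then picks the highest-scoring class in each head.

```python
from typing import Dict, List, Tuple

# Class names of the generation head, as listed above.
GENERATION_CLASSES = {"ai_generated", "not_ai_generated"}


def split_heads(classes: List[Dict]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Separate a flat list of {"class", "score"} entries into the two heads.

    The flat-list input shape is an assumption made for this sketch.
    """
    generation: Dict[str, float] = {}
    source: Dict[str, float] = {}
    for entry in classes:
        target = generation if entry["class"] in GENERATION_CLASSES else source
        target[entry["class"]] = entry["score"]
    return generation, source


def top_class(head: Dict[str, float]) -> Tuple[str, float]:
    """Return the (class, score) pair with the highest confidence in a head."""
    name = max(head, key=head.get)
    return name, head[name]


# Hypothetical scores, for illustration only (most source classes omitted).
sample = [
    {"class": "ai_generated", "score": 0.97},
    {"class": "not_ai_generated", "score": 0.03},
    {"class": "midjourney", "score": 0.90},
    {"class": "stablediffusion", "score": 0.06},
    {"class": "none", "score": 0.04},
]

generation, source = split_heads(sample)
print(top_class(generation))  # ('ai_generated', 0.97)
print(top_class(source))      # ('midjourney', 0.9)
```

Because the scores within each head sum to 1, the top class of the generation head always has a score of at least 0.5; an application may still want a stricter threshold of its own before acting on a result.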

👍

More media sources coming soon!

As AI-generated artwork continues its rapid growth in popularity, we will update our model to incorporate emerging artwork generation models. To learn more or to request that additional sources be added, please contact [email protected].

Supported File Types

Image Formats:
jpg/jpeg
webp
png

Video Formats:
mp4
webm
avi
mkv
wmv
mov

Deepfake Detection

Hive’s Deepfake Detection API identifies whether or not an image or video query is a deepfake. This model uses the same underlying technology as our Demographic API to locate faces within queries. It then performs a classification step on each face to determine whether or not that face is a deepfake. The API response provides a confidence score for each classification.

Deepfakes, or videos in which deep learning is used to map one person’s appearance onto another’s, first gained media attention in 2017. Since then, they have grown in popularity, which in turn has inspired new ways of making them that are both more convincing and more accessible to those without experience in machine learning. This kind of realistic synthetic video content has enabled the creation of fake digital identities, political misinformation, and, most commonly, nonconsensual pornography. Identifying and removing deepfakes across online platforms is crucial to limiting not only the significant harm they can cause to those who appear in them but also the misinformation, fraud, and digital sexual assault that they enable.

Our Model

Deepfake Detection utilizes a visual detection model to locate and classify faces in any given query. Visual detection models localize an object of interest in an image by returning a box that bounds that object, as well as the type, or class, of that object. For each detection, a detector outputs a classification and confidence score that are independent of any other detections.

After an image or video is submitted to our Deepfake Detection API, Hive’s backend splits any video content into frames and runs the model on each frame (an image input is treated as a video with a single frame). Once the visual detection model has located the faces in a frame, each face is passed through an additional classification step to identify whether or not it is a deepfake.

A separate classification is made for each detected face. This kind of approach can differentiate real people and synthetic ones by detecting and classifying each face separately, giving more information as to which part of a given input is manipulated.
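To make the per-face approach concrete, here is a minimal sketch of how per-face results might be rolled up by frame on the client side. The FaceResult record, its deepfake_score field, the 0.5 threshold, and the flagged_frames helper are all assumptions made for illustration; they are not Hive's implementation or its response schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FaceResult:
    """Hypothetical per-face record: the frame it came from and its score."""
    frame_index: int
    deepfake_score: float  # assumed confidence that this face is a deepfake


def flagged_frames(faces: List[FaceResult], threshold: float = 0.5) -> Dict[int, int]:
    """Map each frame index to the number of faces scoring at or above threshold."""
    counts: Dict[int, int] = {}
    for face in faces:
        if face.deepfake_score >= threshold:
            counts[face.frame_index] = counts.get(face.frame_index, 0) + 1
    return counts


# An image would be a single frame (index 0); this example has two frames.
faces = [
    FaceResult(frame_index=0, deepfake_score=0.02),  # likely a real face
    FaceResult(frame_index=0, deepfake_score=0.91),  # likely manipulated
    FaceResult(frame_index=1, deepfake_score=0.88),
]
print(flagged_frames(faces))  # {0: 1, 1: 1}
```

Keeping the results per face, rather than collapsing them into one score per input, preserves the information about which face in which frame appears manipulated.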

Response

The output object of the Deepfake Detection API lists each detected face, including:

  • The geometric description of the detected bounding box.
  • The predicted class for the detection.
  • The confidence score for the detection.

To see a full example of an API response object for this model, you can visit the API reference page.
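The three pieces of information listed above could be read into a small record like the one sketched below. The key names ("bounding_box", "class", "score"), the box coordinate keys, and the class labels in the sample are placeholders chosen for this sketch; the actual field names and labels should be taken from the API reference.

```python
from typing import Dict, List, NamedTuple, Tuple


class FaceDetection(NamedTuple):
    """One detected face: its box, predicted class, and confidence score."""
    box: Dict[str, float]  # assumed keys: left, top, right, bottom (pixels)
    predicted_class: str
    score: float


def parse_faces(output: List[Dict]) -> List[FaceDetection]:
    """Convert a list of raw detection dicts (assumed shape) into records."""
    return [
        FaceDetection(
            box=item["bounding_box"],
            predicted_class=item["class"],
            score=item["score"],
        )
        for item in output
    ]


def box_size(box: Dict[str, float]) -> Tuple[float, float]:
    """Return the (width, height) of a bounding box."""
    return box["right"] - box["left"], box["bottom"] - box["top"]


# Hypothetical output with placeholder labels, for illustration only.
sample_output = [
    {
        "bounding_box": {"left": 40, "top": 30, "right": 140, "bottom": 160},
        "class": "deepfake",
        "score": 0.93,
    },
    {
        "bounding_box": {"left": 300, "top": 50, "right": 380, "bottom": 150},
        "class": "real",
        "score": 0.98,
    },
]

for face in parse_faces(sample_output):
    width, height = box_size(face.box)
    print(f"{face.predicted_class} ({face.score:.2f}), face region {width}x{height} px")
```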

Supported File Types

Image Formats:
gif
jpg
png
webp

Video Formats:
mp4
webm
avi
mkv
wmv
mov
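
A client might check a file's extension against the format lists on this page before submitting it. The sketch below follows the two lists literally; the model labels used as dictionary keys and the models_accepting helper are hypothetical names for this example, and an extension check says nothing about whether the file's contents are valid.

```python
from pathlib import Path
from typing import List

# Video formats are the same for both models on this page.
VIDEO_FORMATS = {"mp4", "webm", "avi", "mkv", "wmv", "mov"}

# Keys are hypothetical labels for this sketch, not API identifiers.
# Image formats follow the lists above literally: "jpeg" and "gif" each
# appear in only one of the two lists.
SUPPORTED = {
    "ai_generated_detection": {"jpg", "jpeg", "webp", "png"} | VIDEO_FORMATS,
    "deepfake_detection": {"gif", "jpg", "png", "webp"} | VIDEO_FORMATS,
}


def models_accepting(filename: str) -> List[str]:
    """Return the labels of the models whose format list includes this extension."""
    ext = Path(filename).suffix.lower().lstrip(".")
    return sorted(name for name, formats in SUPPORTED.items() if ext in formats)


print(models_accepting("clip.MP4"))   # ['ai_generated_detection', 'deepfake_detection']
print(models_accepting("anim.gif"))   # ['deepfake_detection']
print(models_accepting("scan.tiff"))  # []
```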