Visual Moderation

Visual Classification Overview

Visual classification models classify an entire image into different categories by assigning a confidence score for each class.

Classification models can be multi-headed: each group of mutually exclusive model classes belongs to a single model head. For example, when an image is run through Hive's visual moderation model, one head might classify sexual not-safe-for-work (NSFW) content while another head classifies the presence of guns.

This concept is illustrated below. This imaginary model has two heads:

NSFW classification: general_nsfw, general_suggestive, general_not_nsfw_not_suggestive

Gun classification: gun_in_hand, animated_gun, gun_not_in_hand, no_gun

The confidence scores for each model head sum to 1.
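As a minimal sketch of the two-head example above, the snippet below groups class scores by head, checks that each head's scores sum to 1, and picks the highest-scoring class per head. The `HEADS` mapping, the flat `scores` dictionary, and the function name are illustrative assumptions, not the actual shape of Hive's API response.

```python
import math

# Hypothetical head layout, copied from the imaginary two-head model above.
HEADS = {
    "nsfw": ["general_nsfw", "general_suggestive", "general_not_nsfw_not_suggestive"],
    "gun": ["gun_in_hand", "animated_gun", "gun_not_in_hand", "no_gun"],
}


def top_class_per_head(scores, heads=HEADS, tol=1e-3):
    """Return {head: (best_class, best_score)} for a flat {class: score} dict."""
    result = {}
    for head, classes in heads.items():
        head_scores = {cls: scores[cls] for cls in classes}
        total = sum(head_scores.values())
        # Classes within a head are mutually exclusive, so scores should sum to 1.
        if not math.isclose(total, 1.0, abs_tol=tol):
            raise ValueError(f"scores for head {head!r} sum to {total}, expected 1")
        best = max(head_scores, key=head_scores.get)
        result[head] = (best, head_scores[best])
    return result


# Made-up scores for a single image.
example_scores = {
    "general_nsfw": 0.02,
    "general_suggestive": 0.08,
    "general_not_nsfw_not_suggestive": 0.90,
    "gun_in_hand": 0.85,
    "animated_gun": 0.05,
    "gun_not_in_hand": 0.04,
    "no_gun": 0.06,
}

print(top_class_per_head(example_scores))
```

Because each head is scored independently, the same image can be clean on the NSFW head and still score highly on the gun head.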

When submitting a video to be processed, Hive’s backend splits the video into frames, runs the model on each frame, then recombines the results into an aggregated response for the entire video. The video output for a classifier is similar to a list of classification output objects, but with multiple timestamps.
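One way to work with a per-frame video result is to keep, for each class, the highest score seen across all frames together with the timestamp where it occurred. The sketch below assumes a simplified list of frames with `time` and `scores` keys; that structure and the function name are hypothetical, so consult the customer guide for the real response format.

```python
def peak_scores(frames):
    """Return {class: (max_score, time_of_max)} across a list of frame results.

    Each frame is assumed (for illustration only) to look like:
        {"time": 1.0, "scores": {"gun_in_hand": 0.1, "no_gun": 0.9}}
    """
    peaks = {}
    for frame in frames:
        for cls, score in frame["scores"].items():
            # Keep the first frame on which the highest score appears.
            if cls not in peaks or score > peaks[cls][0]:
                peaks[cls] = (score, frame["time"])
    return peaks


# Made-up results for a three-frame video.
example_frames = [
    {"time": 0.0, "scores": {"gun_in_hand": 0.05, "no_gun": 0.95}},
    {"time": 1.0, "scores": {"gun_in_hand": 0.97, "no_gun": 0.03}},
    {"time": 2.0, "scores": {"gun_in_hand": 0.40, "no_gun": 0.60}},
]

print(peak_scores(example_frames))
```

Taking the maximum is only one possible aggregation; a stricter policy might require a class to exceed the threshold on several consecutive frames before acting.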

A more detailed walkthrough on how to submit visual classification tasks via the API and how to interpret the visual model response can be found in our customer guide.

Visual Content Moderation

Hive's visual classification models support a wide variety of classes that are relevant to content moderation. Broadly, visual moderation classes can be separated into five main categories: sexual content, violent imagery, drugs, hate imagery, and image attributes. When deciding how to process our API response in order to implement your content policy, you should consult the following class descriptions to decide which classes to moderate.

Note: Older versions of the API might not perfectly match the outline below. Please reach out to [email protected] if you would like to access the latest content moderation classes.

Sexual

NSFW Head:

  • general_nsfw - genitalia, sexual activity, nudity, buttocks, sex toys, animal genitalia
  • general_suggestive - shirtless men, underwear / swimwear, sexually suggestive poses without genitalia, occluded or blurred sexual activity
  • general_not_nsfw_not_suggestive - none of the above, clean

Sexual Activity Head:

  • yes_sexual_activity - a sex act or stimulation of genitals is present in the scene
  • no_sexual_activity - no sex act is present in the scene

Realistic NSFW Head:

  • yes_realistic_nsfw - live nudity, sex acts, or photo-realistic representations of nudity or sex acts
  • no_realistic_nsfw - non-photorealistic representations of nudity or sex acts (statues, crude drawings, paintings etc.); lack of any NSFW content

Female Underwear Head:

  • yes_female_underwear - lingerie, bras, panties
  • no_female_underwear

Male Underwear Head:

  • yes_male_underwear - fruit-of-the-loom, boxers
  • no_male_underwear

Sex Toy Head:

  • yes_sex_toy - dildos, certain lingerie
  • no_sex_toy

Female Nudity Head:

  • yes_female_nudity - breasts or female genitalia
  • no_female_nudity

Male Nudity Head:

  • yes_male_nudity - male genitalia
  • no_male_nudity

Female Swimwear Head:

  • yes_female_swimwear - bikinis, one-pieces, not underwear
  • no_female_swimwear

Shirtless Male Head:

  • yes_male_shirtless - shirtless below mid-chest
  • no_male_shirtless

Sexual Intent Head: (beta)

  • yes_sexual_intent - occluded, blurred, or hidden sexual activity
  • no_sexual_intent

Animal Genitalia Head: (beta)

  • animal_genitalia_and_human - sexual activity including both animals and humans
  • animal_genitalia_only - animals mating and pictures of animal genitalia
  • animated_animal_genitalia - drawings of sexual activity involving animals
  • no_animal_genitalia - none of the above, clean

Violence

Gun Head:

  • gun_in_hand - person holding rifle, handgun
  • gun_not_in_hand - rifle, handgun, not in hand
  • animated_gun - gun in games, cartoons, etc.; can be in hand or not
  • no_gun

Knife Head:

  • knife_in_hand - person holding knife, sword, machete, razor blade
  • knife_not_in_hand - knife, sword, machete, razor blade, not in hand
  • culinary_knife_in_hand - knife being used for preparing food
  • no_knife

Blood Head:

  • very_bloody - gore, visible bleeding, self-cutting
  • a_little_bloody - fresh cuts / scrapes, light bleeding
  • no_blood - minor scabs, scars, acne, etc. are not considered ‘blood’ by the model
  • other_blood - animated blood, fake blood, animal blood such as game dressing

Hanging Head:

  • hanging - the presence of a human hanging by noose (dead or alive)
  • noose - a noose is present in the image with no human hanging from it
  • no_hanging_no_noose - no person hanging and no noose present

Corpses Head: (beta)

  • human_corpse - human dead body present in image
  • animated_corpse - animated dead body present in image
  • no_corpse

Emaciated Bodies Head:

  • yes_emaciated_body - emaciated human or animal body present in image
  • no_emaciated_body

Self Harm Head: (beta)

  • yes_self_harm - self-cutting, burning, instances of suicide, or other self-harm methods present in image
  • no_self_harm

Drugs

Pill Head:

  • yes_pills - pills and / or drug powders
  • no_pills - no pills or drug powders

Injectable Head:

  • illicit_injectables - heroin and other illegal injectables
  • medical_injectables - injectables for medical use
  • no_injectables - no injectable drug paraphernalia

Smoking Head:

  • yes_smoking - cigarettes, cigars, marijuana, vapes, or other smoking paraphernalia
  • no_smoking - no cigarettes, cigars, marijuana, vapes, or other smoking paraphernalia

Hate

Nazi Head:

  • yes_nazi - Nazi symbols
  • no_nazi - absence of the above

Terrorist Head:

  • yes_terrorist - ISIS flag
  • no_terrorist - absence of the above

White Supremacy Head:

  • yes_kkk - KKK symbols
  • no_kkk - absence of the above

Middle Finger Head:

  • yes_middle_finger - middle finger
  • no_middle_finger - absence of the above

Other Attributes

Text Head:

  • text - any form of text or writing is present somewhere on the image
  • no_text - no text present in the image

Overlay Text Head:

  • yes_overlay_text - digitally overlaid text is present on an image (think meme text)
  • no_overlay_text - lack of digitally overlaid text in the image

Child Presence:

  • yes_child_present - a baby or toddler is present in the image
  • no_child_present

Drawings: (beta)

  • yes_drawing - a drawing, painting, or sketch is the central part of the image
  • no_drawing

Image Type Head:

  • animated - the image is animated
  • hybrid - the image is partially animated
  • natural - the image has no animation

Note: If you need more information when deciding which classes to use, a comprehensive list of subject matter covered by each visual class is available in this visual taxonomy document (warning: somewhat NSFW).

Brand Safety & Suitability - GARM taxonomy

Hive's Brand Safety and Brand Suitability APIs are powered by Hive's visual moderation model and are additionally mapped to the GARM (Global Alliance for Responsible Media) Brand Safety & Suitability Framework, which was established as an industry standard for categorizing harmful content. Click here for more information.

Choosing Thresholds for Visual Moderation

For each of the classes mentioned above, you will need to set a threshold that determines when to take action on our model results. For best results, we recommend a proper threshold analysis on a natural distribution of your data (for more on this, please contact Hive at the email below). As a general starting point, though, flag an image for a class of interest when the model's confidence score for that class is above 0.90.
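The thresholding step described above can be sketched as follows. The function name, the flat `scores` dictionary, and the per-class override table are assumptions made for illustration; only the 0.90 starting point comes from the guidance above.

```python
DEFAULT_THRESHOLD = 0.90  # suggested starting point; tune on your own data


def flag_classes(scores, classes_of_interest, thresholds=None, default=DEFAULT_THRESHOLD):
    """Return the sorted classes of interest whose score exceeds their threshold.

    `thresholds` is an optional {class: threshold} dict of per-class overrides;
    classes without an override use `default`.
    """
    thresholds = thresholds or {}
    return sorted(
        cls
        for cls in classes_of_interest
        # A class missing from the response is treated as a score of 0.
        if scores.get(cls, 0.0) > thresholds.get(cls, default)
    )


# Made-up scores for one image and a hypothetical content policy.
scores = {"general_nsfw": 0.95, "yes_nazi": 0.50, "gun_in_hand": 0.91}
policy = ["general_nsfw", "yes_nazi", "gun_in_hand"]

print(flag_classes(scores, policy))
print(flag_classes(scores, policy, thresholds={"gun_in_hand": 0.95}))
```

Per-class overrides let you apply a stricter or looser cutoff to individual classes once a threshold analysis suggests one, without changing the default for everything else.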

For questions on best practices, please message your point of contact at Hive or send a message to [email protected] to contact our API team directly.

Supported File Types

Image Formats:
gif
jpg
png
webp

Video Formats:
mp4
webm
avi
flv
mkv
mpg
wmv
mov
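Before submitting a file, a client could check its extension against the lists above. This is a rough sketch with a hypothetical helper name; it looks only at the file name, not the file contents, and accepts only the extensions exactly as listed.

```python
from pathlib import Path

# Extensions copied from the supported formats listed above.
IMAGE_FORMATS = {"gif", "jpg", "png", "webp"}
VIDEO_FORMATS = {"mp4", "webm", "avi", "flv", "mkv", "mpg", "wmv", "mov"}


def media_kind(filename):
    """Return "image", "video", or None based on the file extension alone."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in IMAGE_FORMATS:
        return "image"
    if ext in VIDEO_FORMATS:
        return "video"
    return None


print(media_kind("photo.png"))
print(media_kind("clip.MP4"))
print(media_kind("notes.pdf"))
```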