Visual Detection Overview
Visual detection models localize an object of interest in an image by returning a box that bounds that object, as well as the type of that object, also referred to as the class. A detector can detect multiple objects of different classes per image. For each detection, a detector outputs a confidence score that is independent of any other detections.
The output object in Hive detection APIs lists each detected object, including:
- The geometric description of the detected bounding box.
- The predicted class for the detection.
- For some models, the confidence score for the detection.
When submitting a video to be processed, Hive’s backend splits the video into frames, runs the model on each frame, then recombines the results into a single response for the entire video. The video output for a detector is similar to a list of detection output objects, but with multiple timestamps.
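As a rough illustration of the output described above, the sketch below models one detection (box, class, optional confidence score) and groups per-frame detections by timestamp. The `Detection` type, its field names, the box layout, and `group_by_timestamp` are all hypothetical and chosen only for this example; the actual response fields are defined in the API reference.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Detection:
    # Hypothetical shape for illustration; not the actual Hive response schema.
    box: Tuple[float, float, float, float]  # assumed (x_min, y_min, x_max, y_max)
    label: str                              # predicted class for the detection
    score: Optional[float] = None           # only some models return a score


def group_by_timestamp(
    frames: Iterable[Tuple[float, List[Detection]]]
) -> Dict[float, List[Detection]]:
    """Map each frame timestamp to the detections found in that frame."""
    grouped: Dict[float, List[Detection]] = {}
    for timestamp, detections in frames:
        grouped.setdefault(timestamp, []).extend(detections)
    return grouped


# Two made-up frames: one face at 0.0 s, two faces at 1.0 s.
frames = [
    (0.0, [Detection((10, 20, 110, 220), "face", 0.98)]),
    (1.0, [Detection((12, 22, 112, 222), "face", 0.97),
           Detection((300, 40, 380, 150), "face")]),
]
by_time = group_by_timestamp(frames)
```

Keeping the score optional mirrors the fact that only some models return one.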
Hive’s face detection model achieves state-of-the-art accuracy. Face detections are passed through an additional classification step to predict attributes like gender, age, or "liveness".
The gender classification model runs on top of the detections provided by the face detection model, and classifies faces as:
- other_gender: non-human face / indistinguishable gender
Hive detects perceived gender based on the physical appearance of a face in the given context. These detections are not predictions of gender identity and this model is not designed to be used as such.
The age classification model runs on top of the detections provided by the face detection model, and classifies faces as:
- senior: 65 yrs old and above
- middle_aged: 45-64 yrs old
- adult: 18-44 yrs old
- teenager: 13-17 yrs old
- pre_teen: 5-12 yrs old, any kid older than a toddler but younger than a teen
- toddler: 2-4 yrs old, starting to be able to walk / crawl
- baby: 0-1 yrs old, recently born, unlikely to know how to walk / crawl
- other_age: non-human face / indistinguishable age
Note: All of the above ranges are inclusive. In the range 13-17, for example, all ages 13.0-17.99 are included.
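To make the inclusive ranges concrete, here is a minimal sketch that maps a numeric age onto the class names listed above. It is a reader-side illustration, not how the model itself works: `AGE_BUCKETS` and `age_class` are hypothetical names, and flooring to whole years is an assumption drawn from the note that 13.0-17.99 all fall in the 13-17 range. The `other_age` class is not covered, since it does not correspond to a numeric age.

```python
import math

# (lowest whole year, highest whole year, class name), taken from the ranges above.
AGE_BUCKETS = [
    (0, 1, "baby"),
    (2, 4, "toddler"),
    (5, 12, "pre_teen"),
    (13, 17, "teenager"),
    (18, 44, "adult"),
    (45, 64, "middle_aged"),
]


def age_class(age: float) -> str:
    """Return the age class whose inclusive range contains the given age."""
    if age < 0:
        raise ValueError("age must be non-negative")
    years = math.floor(age)  # assumption: 17.99 counts as 17, per the note above
    for low, high, name in AGE_BUCKETS:
        if low <= years <= high:
            return name
    return "senior"  # 65 yrs old and above
```

For example, `age_class(17.99)` falls in the teenager range, while `age_class(18.0)` falls in the adult range.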
Age Regression (Beta)
This class returns a number representing the predicted exact age of the subject.
Liveness Classification (Beta)
The "liveness" classification model runs on top of the face detection model and classifies faces as:
- primary: this face exists in the primary image
- secondary: this face exists in an image, screen, or painting inside of the primary image (e.g., a face in a picture frame hung on the wall)
Setting Thresholds (Demographics)
For the Age and Gender classifiers in the demographics response, take the predicted class to be the class with the highest score among the classes that classifier returns. For Liveness, treat a face as "primary" when its "primary" score is above 0.95, and as "secondary" when that score is below 0.95.
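The thresholding rules above can be sketched as follows. The function names and the shape of the score dictionaries are assumptions made for this example rather than the actual response format. The text does not say how a score of exactly 0.95 should be handled; this sketch treats it as "secondary", which is also an assumption.

```python
from typing import Dict


def predicted_class(scores: Dict[str, float]) -> str:
    """Pick the class with the highest score within one classifier."""
    if not scores:
        raise ValueError("scores must not be empty")
    return max(scores, key=scores.get)


def liveness_label(primary_score: float, threshold: float = 0.95) -> str:
    """Label a face as primary only when its primary score exceeds the threshold."""
    # Assumption: a score exactly equal to the threshold is treated as secondary.
    return "primary" if primary_score > threshold else "secondary"


# Made-up scores for a single detected face.
age_scores = {"adult": 0.70, "teenager": 0.20, "senior": 0.10}
age = predicted_class(age_scores)
liveness = liveness_label(0.97)
```

Age and Gender are resolved independently, so `predicted_class` would be called once per classifier.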
Separately from the "Demographics" Face Detection endpoint, Hive also offers a face similarity model. With this model, one submits a reference image and a target image in a single API call. Hive then returns a "similarity score" that reflects how similar the reference face is to any of the faces in the target image.
Supported File Types
See the API reference for more details on the API interface and response format.