Common Object Detection
Visual Detection Overview
Visual detection models localize objects of interest in an image by returning, for each object, a box that bounds it along with its type, also referred to as its class. A detector can detect multiple objects of different classes per image. For each detection, a detector outputs a confidence score that is independent of any other detections.
The output object in Hive detection APIs lists each detected object, including:
- The geometric description of the detected bounding box.
- The predicted class for the detection.
- For some models, the confidence score for the detection.
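As a rough illustration of working with these three pieces of information, the sketch below filters a list of detections by class and confidence. The field names `box`, `label`, and `score` are hypothetical placeholders chosen for this example, not the actual Hive response schema; consult the API reference for the real field names.

```python
def filter_detections(detections, wanted_classes, min_score=0.5):
    """Keep detections of a wanted class, dropping those scored below min_score.

    Each detection is assumed (hypothetically) to be a dict with:
      "box":   (x_min, y_min, x_max, y_max) bounding-box coordinates
      "label": the predicted class name, e.g. "dog"
      "score": a confidence value, or None if the model reports none
    """
    kept = []
    for det in detections:
        if det["label"] not in wanted_classes:
            continue
        score = det.get("score")
        # A detection without a score cannot be thresholded, so it is kept.
        if score is not None and score < min_score:
            continue
        kept.append(det)
    return kept
```

Because each confidence score is independent of the other detections, a per-detection threshold like this can be applied without renormalizing across the list.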
When you submit a video for processing, Hive’s backend splits the video into frames, runs the model on each frame, then merges the per-frame results into a single response for the entire video. The video output for a detector is similar to a list of detection output objects, but spans multiple timestamps.
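One way to consume a per-frame video result is to index it by class. The sketch below assumes a hypothetical structure in which each frame entry carries a `timestamp` and a list of `detections` with a `label` field; these names are illustrative only and may not match the actual response format.

```python
from collections import defaultdict


def timestamps_by_class(frames):
    """Map each detected class to the sorted timestamps at which it appears.

    `frames` is assumed (hypothetically) to be a list of dicts shaped like
    {"timestamp": 1.5, "detections": [{"label": "dog"}, ...]}.
    """
    seen = defaultdict(list)
    for frame in frames:
        # Use a set so a class detected twice in one frame is counted once.
        labels = {det["label"] for det in frame["detections"]}
        for label in labels:
            seen[label].append(frame["timestamp"])
    return {label: sorted(times) for label, times in seen.items()}
```

A caller could then look up, for example, every timestamp at which a `person` was detected.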
Classes
- wine glass
- bottle
- baseball glove
- baseball bat
- banana
- backpack
- apple
- train
- vase
- umbrella
- tv
- truck
- traffic light
- toothbrush
- toilet
- tie
- tennis racket
- teddy bear
- surfboard
- suitcase
- stop sign
- spoon
- skis
- skateboard
- sink
- remote
- sheep
- scissors
- refrigerator
- potted plant
- pizza
- person
- parking meter
- oven
- mouse
- motorcycle
- microwave
- laptop
- knife
- kite
- keyboard
- hot dog
- horse
- handbag
- hair drier
- frisbee
- fork
- fire hydrant
- donut
- elephant
- dog
- dining table
- cup
- cow
- couch
- clock
- chair
- cell phone
- cat
- carrot
- car
- cake
- bus
- broccoli
- bowl
- book
- boat
- bird
- bicycle
- bench
- bed
- airplane
- bear
- giraffe
- orange
- sandwich
- snowboard
- toaster
- zebra
- sports ball
Supported File Types
Image Formats:
- gif
- jpg
- png
- webp
Video Formats:
- mp4
- webm
- avi
- mkv
- wmv
- mov
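A client may want to check a file against these lists before submitting it. The following is a minimal sketch that inspects only the filename extension; the function name `media_kind` is made up for this example, and the check does not verify the actual file contents.

```python
from pathlib import Path

# Extensions taken from the supported-format lists above.
IMAGE_EXTENSIONS = {"gif", "jpg", "png", "webp"}
VIDEO_EXTENSIONS = {"mp4", "webm", "avi", "mkv", "wmv", "mov"}


def media_kind(filename):
    """Return "image", "video", or None based on the file extension alone."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in IMAGE_EXTENSIONS:
        return "image"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None
```

Files for which `media_kind` returns `None` can be rejected or converted locally before any request is made.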
See the API reference for more details on the API interface and response format.