People Counting


Hive's People Counting API determines the number of different people visible in an image or video. The model assigns each input to one of six categories according to the number of unique people present: 0, 1, 2, 3, 4, or 5+. Any input containing five or more people is placed in the "5+" category.

This API is powered purely by a classification model. It does not rely on face detection, an approach that can miss people whose faces are not visible. If the same person appears multiple times, such as in a collage of images or an image containing someone's reflection, the model categorizes them as one unique person and counts them only once, regardless of how many times they appear.

This API accepts both images and videos. When you submit a video, Hive's backend splits it into frames and runs the model on each frame. In the coming months, we plan to add the ability to combine the per-frame results into an aggregated response for the entire video.
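
Until an aggregated response is available, per-frame results can be summarized on the client side. The following is a minimal sketch, not part of the Hive API: it assumes you have already extracted one class label per frame from the response into a plain list, and `summarize_frames` is a hypothetical helper name.

```python
from collections import Counter

# Class labels documented for this model, in ascending order.
LABELS = ["0", "1", "2", "3", "4", "5+"]


def summarize_frames(frame_labels):
    """Summarize per-frame labels for one video (hypothetical helper).

    frame_labels is assumed to be a list of label strings, one per frame,
    already pulled out of the API response by the caller.
    """
    if not frame_labels:
        raise ValueError("no frames to summarize")
    unknown = set(frame_labels) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")

    counts = Counter(frame_labels)
    # Most frequent label across frames; ties go to the smaller label.
    most_common = min(LABELS, key=lambda lbl: (-counts[lbl], LABELS.index(lbl)))
    # Largest label seen in any single frame.
    peak = max(frame_labels, key=LABELS.index)
    return {"most_common": most_common, "peak": peak, "frames": len(frame_labels)}


summary = summarize_frames(["1", "2", "2", "5+", "0"])
```

Note that neither value is a count of unique people across the whole video: `peak` only reports the busiest single frame, and different frames may show different people.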


The JSON response includes one classification per frame (an image input is treated as a video with one frame). The model has six classes, or possible classifications:

  • 0 (no people are present in the frame)
  • 1 (one person is present in the frame)
  • 2 (two people are present in the frame)
  • 3 (three people are present in the frame)
  • 4 (four people are present in the frame)
  • 5+ (five or more people are present in the frame)
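
Because "5+" is an open-ended bucket rather than an exact count, client code may want to treat it differently from the other classes. The sketch below is illustrative only: `parse_label` and `has_at_least` are hypothetical helper names, and the code assumes the class label is available as one of the six strings listed above.

```python
# Class labels documented for this model, in ascending order.
LABELS = ["0", "1", "2", "3", "4", "5+"]


def parse_label(label):
    """Return (minimum_people, is_exact) for a class label.

    "0" through "4" are exact counts; "5+" only gives a lower bound of 5.
    """
    if label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    if label == "5+":
        return 5, False
    return int(label), True


def has_at_least(label, n):
    """True if the label implies at least n people are present.

    Only meaningful for n between 0 and 5, since the model cannot
    distinguish counts above five.
    """
    if not 0 <= n <= 5:
        raise ValueError("n must be between 0 and 5")
    minimum, _ = parse_label(label)
    return minimum >= n


exact_three = parse_label("3")
crowd = parse_label("5+")
```

Here `exact_three` is `(3, True)` and `crowd` is `(5, False)`, so a caller can tell that a "5+" frame contains five or more people but not how many.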

To see an annotated example of an API response object for this model, you can visit our API reference page.

Supported File Types

Image Formats:

Video Formats: