Contextual Search (Text Query)


Contextual Search (Text Query) is a highly customizable tool to search image libraries with text queries, built on a powerful multimodal model that maps between natural language and visual features. The Contextual Search API allows you to build and update a large private image database that is unique to your project, and then search this custom set via text inputs. In response, the API returns a list of any reference images that are well-described by the text.

Rather than training on specific phrases and image classes, our image-text similarity model applies a generalized understanding of semantic concepts to relate those concepts to visual subject matter in an image. Combined with the flexibility of a custom search index, the API enables straightforward searches for specific images or content types across unstructured public or private image sets, image classification and tagging, and more.

Contextual Search API

The Contextual Search API uses three different endpoints to manage the search index associated with your project and return model results:

  • "Add" endpoint to add images to your search index
  • "Query" endpoint to search your index for visual matches to a text query
  • "Remove" endpoint to remove images you no longer need to search against from your index

Adding an image to your search index

Submitting a request to the "Add" endpoint indexes the specified image so that future searches are checked against it. Note: requests to this endpoint must specify either a public or signed URL for the image or a file path for direct upload.

Optionally, you can also include metadata to be stored alongside the image. This could be an ID for the image, a user ID associated with the image, descriptions and tags, or anything else that would be useful to receive back in the API response along with the image itself.
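
As a rough illustration of the add step, the sketch below builds (without sending) a request that indexes an image by URL and attaches metadata. The endpoint URL, the form field names "url" and "metadata", the JSON encoding of the metadata, and the "Token" authorization scheme are all assumptions made for this example, not values from the API reference; substitute the real ones for your project.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint URL; replace with the one from your API reference.
ADD_ENDPOINT = "https://api.example.com/contextual-search/add"


def build_add_request(image_url, metadata=None, api_key="YOUR_API_KEY"):
    """Build (but do not send) a POST request that indexes one image by URL."""
    # Assumption: the image URL is passed in a form field named "url".
    fields = {"url": image_url}
    if metadata is not None:
        # Assumption: metadata travels as a JSON-encoded string in a form field.
        fields["metadata"] = json.dumps(metadata)
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        ADD_ENDPOINT,
        data=body,
        method="POST",
        headers={"Authorization": f"Token {api_key}"},  # assumed auth scheme
    )


request = build_add_request(
    "https://example.com/images/red-bicycle.jpg",
    metadata={"image_id": "img-001", "tags": ["bicycle", "street"]},
)
# To actually send the request: urllib.request.urlopen(request)
```

Keeping your own image ID in the metadata makes it easy to map a search result back to a record in your system.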

Querying your search index

To access model-based search results, submit a request to the "Query" endpoint with your search terms as a plain text string in form data. Our model encodes the text and compares it to the features of each image in your search index to identify semantic matches.
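
A minimal sketch of a query request follows, showing the search terms sent as a plain text string in form data. The endpoint URL, the form field name "text", and the authorization header are hypothetical placeholders chosen for this example.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint URL; replace with the one from your API reference.
QUERY_ENDPOINT = "https://api.example.com/contextual-search/query"


def build_query_request(text, api_key="YOUR_API_KEY"):
    """Build (but do not send) a POST request carrying the search text as form data."""
    if not text.strip():
        raise ValueError("query text must not be empty")
    # Assumption: the search terms go in a form field named "text".
    body = urllib.parse.urlencode({"text": text}).encode("utf-8")
    return urllib.request.Request(
        QUERY_ENDPOINT,
        data=body,
        method="POST",
        headers={"Authorization": f"Token {api_key}"},  # assumed auth scheme
    )


request = build_query_request("a red bicycle leaning against a brick wall")
```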


The Contextual Search API then returns a JSON object listing any images that match the query text. Each match will be described with the following response fields:

  • id : unique identifier for the matching image; this is the ID that was created and returned when the image was added to the search index
  • similarity_score : a value between 0 and 1 that quantifies the predicted correlation between natural language concepts in the text query and visual content in the matching image. Higher values indicate a closer match.
  • metadata : any metadata provided with the matching image when it was added to the search index
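
One way to consume these fields is to keep only matches above a similarity threshold and rank them by score, as in the sketch below. The sample payload is invented for illustration, and the top-level "matches" key is an assumption about the response layout; only the per-match fields id, similarity_score, and metadata come from the list above. The 0.5 threshold is arbitrary and should be tuned for your data.

```python
import json

# Invented sample payload. The "matches" wrapper key is an assumption;
# the per-match fields are the ones documented above.
SAMPLE_RESPONSE = """
{
  "matches": [
    {"id": "a1", "similarity_score": 0.42, "metadata": {"image_id": "img-007"}},
    {"id": "b2", "similarity_score": 0.91, "metadata": {"image_id": "img-001"}},
    {"id": "c3", "similarity_score": 0.77, "metadata": null}
  ]
}
"""


def best_matches(response_text, min_score=0.5):
    """Return matches scoring at least min_score, highest score first."""
    payload = json.loads(response_text)
    matches = payload.get("matches", [])
    kept = [m for m in matches if m["similarity_score"] >= min_score]
    return sorted(kept, key=lambda m: m["similarity_score"], reverse=True)


for match in best_matches(SAMPLE_RESPONSE):
    print(match["id"], match["similarity_score"], match["metadata"])
```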

Removing an image from your search index

If you no longer need to search against an image (e.g., the image has been removed from your platform), you can remove it from your search index by calling the "Remove" endpoint. As with the "Add" endpoint, you'll need to provide a URL for the image you want to remove or a file path to the image itself.

Note: For remove requests, our backend searches your index for the specified image using SHA-256 hash values. The file you submit must therefore be byte-for-byte identical to the image file you want to remove; a resized or re-encoded copy produces a different hash and will not match.
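
Before submitting a remove request, you can check locally whether the file you hold is identical to the one you originally added by comparing SHA-256 digests. The sketch below uses Python's standard hashlib module; the helper names are illustrative, and it assumes the hash is computed over the raw file bytes.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True only when the two files are byte-for-byte identical."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

If same_file_contents returns False for your local copy and the original upload, the copy has changed (for example, through recompression), and a remove request made with it would not be expected to find the indexed image.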