Concepts

Brief descriptions of the main components of our AutoML platform

Datasets

Datasets represent the information that you would like to use to train an AutoML model or create an embedding. They consist of a set of files that are uploaded to the Datasets section of our AutoML platform. Once created, a dataset can be edited or deleted at any time. For more information about datasets, please see our Datasets page.

Snapshots

A snapshot is a point-in-time export of a dataset which can be used to train models or create embeddings. Creating a snapshot automatically validates your data to ensure it is suitable for model training. For more information about snapshots, please see our Snapshots page.
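
To make the "point-in-time export" idea concrete, here is a minimal Python sketch. It is not the platform's implementation: the `Dataset` and `Snapshot` classes, the `create_snapshot` function, and the validation rule (every row needs a non-empty `input` and `label`) are all hypothetical and chosen only for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical editable dataset: rows can change at any time."""
    rows: list = field(default_factory=list)


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical frozen copy of a dataset's rows at one moment."""
    rows: tuple


def validate(rows):
    """Return a list of problems (assumed rule: 'input' and 'label' must be non-empty)."""
    errors = []
    for i, row in enumerate(rows):
        for key in ("input", "label"):
            if not row.get(key):
                errors.append(f"row {i}: missing {key}")
    return errors


def create_snapshot(dataset):
    """Validate the dataset, then copy its rows so later edits do not leak in."""
    errors = validate(dataset.rows)
    if errors:
        raise ValueError("; ".join(errors))
    return Snapshot(rows=tuple(copy.deepcopy(dataset.rows)))


dataset = Dataset(rows=[{"input": "cat.jpg", "label": "cat"}])
snapshot = create_snapshot(dataset)

# Editing the dataset afterwards does not change the snapshot.
dataset.rows.append({"input": "dog.jpg", "label": "dog"})
```

In this sketch the snapshot still holds one row after the dataset grows to two, which is the property that makes a snapshot a stable input for training.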

Models

A model is a machine learning model that has been fine-tuned and evaluated using the training and test datasets you provided. Models are trained to perform a specific task, such as categorizing an image input into one of several classes. Our AutoML platform currently supports models for image classification, text classification, and text generation. For more information about training models and which model type best suits your needs, please see our Models page.

Deployments

A deployment is a model that has been loaded to support inference requests and is accessible via an API endpoint. When a model is deployed, we create a Hive Data project for it so that it can be used in the same way as all Hive pre-trained models. For more information about deployments, please see our Deployments page.

Embeddings

An embedding is a vectorized representation of your dataset that can be queried for information similar to a provided input. Embeddings can be queried directly or used to augment large language models with additional information in order to make their responses more factually accurate. For more information about embeddings, please see our Embeddings page.
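
As a rough illustration of what querying an embedding for similar content involves, the Python sketch below ranks stored vectors by cosine similarity to a query vector. The similarity metric, the `query` function, and the tiny three-dimensional vectors are assumptions made for the example; the platform may compute and search embeddings differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def query(index, query_vector, top_k=2):
    """Return the top_k texts whose vectors are most similar to query_vector."""
    scored = [(cosine_similarity(vec, query_vector), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Hypothetical index: (text, vector) pairs standing in for an embedded dataset.
index = [
    ("refund policy", [0.9, 0.1, 0.0]),
    ("shipping times", [0.1, 0.9, 0.1]),
    ("return window", [0.8, 0.2, 0.1]),
]

results = query(index, [1.0, 0.0, 0.0], top_k=2)
print(results)
```

When an embedding is used to augment a large language model, the texts returned by a query like this one are typically added to the model's prompt as supporting context.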

