AI Dictionary
Training Data
Definition
The dataset used to teach a machine learning model to recognize patterns or perform a specific task.
Deep Dive
Training data is the dataset, often large, from which a machine learning model learns to recognize patterns, make predictions, or perform a specific task. In supervised learning it consists of input examples paired with their correct outputs or labels; in unsupervised learning it consists of raw inputs alone, and the model must discover structure in them. During training, the model processes this data repeatedly and adjusts its internal parameters (weights) to reduce prediction error or, more generally, to optimize a defined objective function.
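The iterative adjustment described above can be sketched in a few lines. This is a minimal illustration, not a production method: it assumes a one-variable linear model trained by plain gradient descent on mean squared error, and the data points, the learning rate, and the function names `mean_squared_error` and `train` are all made up for the example.

```python
# Toy supervised training set: each input x is paired with its correct output y.
# The values are invented for illustration and follow y = 2x + 1 exactly.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def mean_squared_error(w, b, data):
    """Objective function: average squared gap between prediction and label."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(data, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by repeatedly nudging w and b against the error gradient."""
    w, b = 0.0, 0.0  # parameters start uninformed
    n = len(data)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to each parameter.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # Step each parameter in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(training_data)
print(f"learned w={w:.3f}, b={b:.3f}")
print(f"error before training: {mean_squared_error(0.0, 0.0, training_data):.3f}")
print(f"error after training:  {mean_squared_error(w, b, training_data):.6f}")
```

Each pass over the training data moves the parameters slightly toward values that fit the labeled examples better, which is the sense in which the model "learns" from the data.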
Examples & Use Cases
- A dataset of thousands of labeled images of cats and dogs used to teach an image classifier.
- Historical sales data including product features and purchase outcomes used to train a recommendation engine.
- A collection of customer support tickets paired with their correct classification categories (e.g., "billing," "technical support") to train a text classifier.
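To make the support-ticket example concrete, here is a rough sketch of what such training data might look like in code and how a very simple classifier could learn from it. The tickets are invented, and the word-counting approach in the hypothetical `train` and `predict` functions is a deliberately naive stand-in for a real text classifier.

```python
from collections import Counter, defaultdict

# Labeled training data: each ticket text is paired with its correct category.
# These tickets are invented for illustration only.
training_data = [
    ("I was charged twice on my invoice", "billing"),
    ("Please refund the payment on my card", "billing"),
    ("The app crashes when I log in", "technical support"),
    ("I get an error message after the update", "technical support"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(data):
    """Count how often each word appears in tickets of each category."""
    counts = defaultdict(Counter)
    for text, label in data:
        counts[label].update(tokenize(text))
    return counts


def predict(counts, text):
    """Pick the category whose training tickets share the most words with the text."""
    words = tokenize(text)
    scores = {
        label: sum(word_counts[word] for word in words)
        for label, word_counts in counts.items()
    }
    return max(scores, key=scores.get)


model = train(training_data)
print(predict(model, "refund my invoice"))
print(predict(model, "the app shows an error"))
```

With only four tickets, the model can recognize little beyond the exact words it has seen, which illustrates why training sets for real classifiers tend to be much larger.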
Related Terms
- Test Set
- Validation Set
- Supervised Learning