What Is the Difference Between Training and Predicting
Training: Teaching the Model
Training is the learning phase. You give the model a dataset where the outcomes are already known, for example a spreadsheet of past customers with columns for their behavior metrics and a final column showing whether they cancelled or stayed. The algorithm examines this data, finds the relationships between the input columns and the outcome, and builds an internal mathematical model of those relationships.
Think of training like studying for an exam with an answer key. The model sees both the questions (customer features) and the correct answers (churned or not). It figures out which features matter most and how they combine to predict the outcome.
What Happens During Training
- The algorithm reads your entire dataset row by row
- It identifies which columns (features) have the strongest relationship to the outcome
- It builds a mathematical model that captures those relationships
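- It tests its own accuracy by predicting outcomes for rows where the answer is already known, typically rows held back from learning so the score reflects how it handles data it has not seen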
- The final trained model is saved and ready to use
Training time depends on your data size and the algorithm you choose. A decision tree on 1,000 rows might train in seconds. A gradient boosting model on 100,000 rows might take a few minutes. On this platform, you can monitor training progress and the system reports accuracy metrics when training finishes.
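The training steps listed above can be sketched in a few lines of Python. This is a toy illustration, not how this platform trains models: the dataset is made up, the single feature (support tickets) is hypothetical, and the "algorithm" is the simplest possible one, a single threshold rule.

```python
def predict_churn(tickets, threshold):
    """Apply the learned rule: predict churn (1) at or above the threshold."""
    return 1 if tickets >= threshold else 0


def accuracy(rows, threshold):
    """Fraction of rows where the rule's prediction matches the known outcome."""
    correct = sum(predict_churn(tickets, threshold) == churned
                  for tickets, churned in rows)
    return correct / len(rows)


def train(rows):
    """Try each observed ticket count as a threshold and keep the most accurate."""
    candidates = sorted({tickets for tickets, _ in rows})
    return max(candidates, key=lambda t: accuracy(rows, t))


# Made-up dataset of (support_tickets, churned) rows. Customers with 4 or more
# tickets churn, except every 10th row is flipped to imitate noisy real data.
dataset = [(i % 9, int((i % 9 >= 4) != (i % 10 == 0))) for i in range(200)]

# Learn from 160 rows and hold back 40 rows to check the result.
training_rows, holdout_rows = dataset[:160], dataset[160:]

threshold = train(training_rows)
holdout_accuracy = accuracy(holdout_rows, threshold)

print(f"learned threshold: {threshold} tickets")
print(f"accuracy on held-back rows: {holdout_accuracy:.0%}")
```

On this made-up data the rule learns a threshold of 4 tickets and scores 90% on the held-back rows. Real algorithms such as decision trees or gradient boosting follow the same shape (learn from known outcomes, then check accuracy) with far richer models.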
Predicting: Using the Model
Predicting is the application phase. You send the trained model a new data point, one that was not in the training data, and it returns a prediction based on what it learned. The model applies the same patterns it found during training to this new input.
Continuing the exam analogy, predicting is taking the real exam. The model has studied, and now it gets a new question (a new customer record) and gives an answer based on what it learned, such as "this customer has a 73% chance of churning."
What Happens During Prediction
- You send one or more new data records to the model
- The model checks the same features it learned from during training
- It applies its internal mathematical model to calculate the result
- It returns a prediction: a category label, a probability, a number, or a cluster assignment
- The prediction typically takes milliseconds
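As a rough sketch of the prediction steps above, assume training produced a small logistic model. The intercept, weights, and feature names below are invented for illustration and do not come from any real model on this platform.

```python
import math

# Hypothetical output of a finished training run: one intercept plus
# one weight per feature. These numbers are made up for the example.
TRAINED_MODEL = {
    "intercept": -1.9,
    "weights": {"support_tickets": 0.6, "months_as_customer": -0.05},
}


def predict_churn_probability(model, record):
    """Combine the record's features with the stored weights, then squash
    the score into a probability between 0 and 1 (logistic function)."""
    score = model["intercept"] + sum(
        weight * record[name] for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-score))


# A new customer the model has never seen.
new_customer = {"support_tickets": 5, "months_as_customer": 2}

probability = predict_churn_probability(TRAINED_MODEL, new_customer)
label = "churn" if probability >= 0.5 else "stay"

print(f"churn probability: {probability:.0%}, predicted label: {label}")
```

For this record the score works out to roughly 1.0, which the logistic function maps to a churn probability of about 73%. Notice that prediction never touches the training data: it only reads the stored weights.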
Why the Cost Difference Matters
Training is computationally intensive. The algorithm has to process your entire dataset, sometimes making multiple passes over it. This is why training costs credits on this platform, proportional to data size and algorithm complexity.
Predicting is fast and cheap. The trained model is just a set of rules and weights stored in memory. Running a new data point through those rules takes almost no computation. This is why predictions on this platform cost zero credits per request after training.
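A back-of-the-envelope calculation shows the size of the gap. The counting rule here is an assumption chosen for simplicity (one unit of work per feature per row per pass, as in a simple linear model), and it is not this platform's credit formula.

```python
def training_operations(rows, features, passes):
    """Rough work estimate: every pass touches every feature of every row."""
    return rows * features * passes


def prediction_operations(features):
    """Rough work estimate: one record, each feature used once."""
    return features


# Illustrative numbers only.
train_ops = training_operations(rows=100_000, features=20, passes=50)
predict_ops = prediction_operations(features=20)
ratio = train_ops // predict_ops

print(f"training:   {train_ops:,} operations")
print(f"prediction: {predict_ops:,} operations")
print(f"training is about {ratio:,} times more work than one prediction")
```

Under these assumptions, training takes 100,000,000 operations against 20 for a single prediction, a ratio of 5,000,000 to 1. Real algorithms differ in the details, but the imbalance is the same in spirit.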
When to Retrain
A trained model reflects the patterns in the data it learned from. If your business changes, your customer behavior shifts, or the world around you evolves, the model's predictions may become less accurate over time. This is called model drift.
Common reasons to retrain:
- You have significantly more data than when you first trained
- Your business has changed (new products, new pricing, new customer segments)
- Prediction accuracy has dropped based on your monitoring
- External conditions have shifted (market changes, seasonal patterns)
Retraining is just running the training process again with updated data. The old model is replaced with a new one that reflects current patterns. See How to Retrain Models With New Data for a detailed walkthrough. For models that can update without full retraining, see What Is Incremental Training.
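If you track how recent predictions turned out, the "accuracy has dropped" check can be automated. The sketch below is one simple way to do it; the function names, the baseline of 90%, and the 5-point tolerance are all hypothetical choices, not platform settings.

```python
def accuracy(predictions, actual_outcomes):
    """Fraction of predictions that matched what really happened."""
    if not predictions or len(predictions) != len(actual_outcomes):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(p == a for p, a in zip(predictions, actual_outcomes))
    return matches / len(predictions)


def should_retrain(baseline_accuracy, recent_accuracy, max_drop=0.05):
    """Flag a retrain when accuracy has fallen more than max_drop below
    the accuracy reported when the model was first trained."""
    return baseline_accuracy - recent_accuracy > max_drop


# Accuracy reported at training time (made-up value).
baseline = 0.90

# What the model predicted recently versus what customers actually did.
recent_predictions = ["stay", "churn", "stay", "stay", "churn",
                      "stay", "stay", "churn", "stay", "stay"]
recent_outcomes = ["stay", "churn", "churn", "stay", "churn",
                   "stay", "stay", "stay", "stay", "stay"]

recent = accuracy(recent_predictions, recent_outcomes)
retrain = should_retrain(baseline, recent)

print(f"recent accuracy: {recent:.0%}, retrain: {retrain}")
```

Here 8 of the 10 recent predictions were right, so accuracy fell from 90% to 80% and the check recommends retraining. In practice you would use far more than 10 records before drawing a conclusion.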
A Simple Example
Suppose you run an online store and want to predict which customers will make a repeat purchase within 30 days.
Training: You export a spreadsheet of 5,000 past customers with columns like first order value, number of items, product category, days since signup, and whether they bought again within 30 days. You upload this to the Data Aggregator, pick a random forest classifier, and train. The model learns that customers who spent over $75, bought from certain categories, and signed up within the last 90 days are most likely to return.
Predicting: Every time a new customer makes their first purchase, you send their data to the trained model. It returns a probability like "82% likely to buy again." Your workflow can automatically send a thank-you email to high-probability customers or a discount code to low-probability ones. Each prediction costs nothing because the model is already trained.
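The follow-up step in this example could look something like the sketch below. The 0.7 cut-off and the action names are assumptions made for illustration; you would pick values that suit your own store.

```python
def choose_follow_up(repeat_purchase_probability, high_threshold=0.7):
    """Pick a follow-up action from the model's predicted probability."""
    if not 0.0 <= repeat_purchase_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if repeat_purchase_probability >= high_threshold:
        return "thank_you_email"   # likely to return anyway
    return "discount_code"         # needs a nudge to come back


# Probabilities as they might come back from the trained model.
likely_customer = choose_follow_up(0.82)
unlikely_customer = choose_follow_up(0.30)

print(likely_customer)
print(unlikely_customer)
```

A customer scored at 82% gets the thank-you email, while one scored at 30% gets the discount code.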