How to Segment Customers Using Clustering
Why ML Segmentation Beats Manual Segments
Manual segmentation divides customers by obvious categories: geographic region, plan tier, or signup date. These segments are easy to create but often do not reflect how customers actually behave. Two customers in the same region on the same plan might have completely different purchasing patterns and needs.
Clustering discovers segments based on behavioral patterns across all your data dimensions at once. It might find that your real customer segments are "high-value loyalists who buy monthly," "bargain hunters who only buy during sales," "new customers who are still exploring," and "declining customers who used to be active." These behavior-based segments are far more actionable than demographic ones.
Choosing Your Features
The features you include determine what kind of segments the algorithm finds. Choose features that reflect the behaviors you want to differentiate:
RFM Features (Classic Starting Point)
- Recency: Days since last purchase or interaction
- Frequency: Number of purchases or interactions in a time period
- Monetary: Total spending or average order value
RFM clustering is a common starting point because these three metrics summarize how recently, how often, and how much each customer buys. Even this simple combination often reveals 4-6 meaningful segments.
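As a rough illustration of how the three RFM values can be derived, the sketch below builds them from a transaction log using only the Python standard library. The `transactions` data, the `rfm_table` helper, and the June 1 reference date are all hypothetical; your own export will have different columns and dates.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2024, 5, 28), 120.0),
    ("c1", date(2024, 4, 30), 80.0),
    ("c2", date(2024, 1, 15), 35.0),
    ("c3", date(2024, 5, 1), 400.0),
    ("c3", date(2024, 5, 20), 250.0),
    ("c3", date(2024, 3, 2), 150.0),
]

def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary_total)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchased_on, amount in transactions:
        frequency[customer_id] += 1
        monetary[customer_id] += amount
        # Track the most recent purchase date seen for each customer.
        if customer_id not in last_purchase or purchased_on > last_purchase[customer_id]:
            last_purchase[customer_id] = purchased_on
    return {
        cid: ((as_of - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }

rfm = rfm_table(transactions, as_of=date(2024, 6, 1))
for cid, (recency, freq, total) in sorted(rfm.items()):
    print(cid, recency, freq, total)
```

Monetary is computed here as total spending. Dividing it by frequency would give average order value instead, the other option listed above.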
Extended Behavioral Features
- Product categories purchased (breadth of catalog engagement)
- Channel preference (web, mobile, email, SMS response rates)
- Support ticket frequency and type
- Feature usage patterns (for SaaS)
- Return rate and complaint frequency
- Campaign response rate
- Time-of-day purchasing patterns
More features give the algorithm more dimensions to find patterns in, but too many features on too few customers can produce unstable segments. Aim for 5-10 well-chosen features for most segmentation projects. See How to Prepare Your Data.
Running the Segmentation
Pull one row per customer with your chosen features, then normalize the data so that all features are on comparable scales. K-means groups customers by distance, so a spending column in dollars (0-10,000) would dominate a frequency column (0-50) if left unscaled. Some clustering tools scale features automatically, but many do not, so confirm what yours does and check your results to make sure one feature is not overwhelming the others.
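One common way to put features on comparable scales is z-score standardization: subtract each column's mean and divide by its standard deviation. The sketch below is a minimal standard-library version. The `standardize` helper and the sample rows are hypothetical, and other scaling methods (such as min-max) are equally reasonable choices.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column: subtract the column mean, divide by its std dev."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Guard against constant columns, which have zero spread.
    stds = [pstdev(col) or 1.0 for col in columns]
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds

# Hypothetical rows: [total spend in dollars, purchases in period].
rows = [[200.0, 2], [35.0, 1], [800.0, 3], [9500.0, 40]]
scaled, means, stds = standardize(rows)

for original, z in zip(rows, scaled):
    print(original, [round(v, 2) for v in z])
```

The function returns the means and standard deviations along with the scaled rows so the same transformation can be reapplied later to customers who were not in the original data.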
Upload your CSV to the Data Aggregator app. Select k-means as the algorithm. There is no target column for clustering because the algorithm discovers the groups itself.
Start with k=4 or k=5. Too few clusters (2-3) oversimplify and miss meaningful distinctions. Too many clusters (10+) create groups too small to act on differently. Run the algorithm with k=3, 4, 5, 6, and 7, then compare which number produces the most useful and interpretable groups.
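To make the comparison across values of k concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python, run for k=3 through 7. It is not the implementation any particular app uses. The `points` are hypothetical, already-scaled values, and the function names are illustrative. In practice a library such as scikit-learn would normally do this work.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))

def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # empty cluster keeps its centroid
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    # Inertia: total squared distance from each point to its centroid.
    inertia = sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))
    return centroids, labels, inertia

def best_kmeans(points, k, restarts=10):
    """Try several random starts and keep the run with the lowest inertia."""
    return min((kmeans(points, k, seed=s) for s in range(restarts)), key=lambda run: run[2])

# Hypothetical scaled features: [recency_z, monetary_z].
points = [
    [-1.2, 1.1], [-1.0, 1.3], [-1.1, 0.9], [-0.9, 1.2],
    [1.4, -1.0], [1.6, -1.2], [1.5, -0.9], [1.3, -1.1],
    [0.1, 0.0], [0.2, -0.2], [-0.1, 0.1], [0.0, 0.2],
]

results = {k: best_kmeans(points, k) for k in (3, 4, 5, 6, 7)}
for k, (_, labels, inertia) in results.items():
    sizes = [labels.count(c) for c in range(k)]
    print(f"k={k} inertia={inertia:.2f} sizes={sizes}")
```

Inertia generally shrinks as k grows, so it cannot pick k on its own. Look for the point where adding a cluster stops reducing it much, and weigh that against cluster sizes and whether each group is interpretable.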
For each cluster, examine the average feature values and ask what makes this group different from the others. Name each segment after its defining characteristic, for example "Power Buyers" (high frequency, high monetary), "Window Shoppers" (frequent recent visits, low monetary), or "Lapsed VIPs" (historically high spend but low recent activity). Clear names make segments actionable for your team.
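A per-cluster profile of this kind can be sketched as a simple group-and-average step. The customer rows, the `labels` list, and the `cluster_profiles` helper below are hypothetical stand-ins for whatever your clustering run produces.

```python
from collections import defaultdict
from statistics import mean

feature_names = ["recency_days", "frequency", "monetary"]

# Hypothetical unscaled customer rows and the cluster label assigned to each.
customers = [
    [4, 12, 1800.0],
    [7, 10, 1500.0],
    [150, 1, 40.0],
    [200, 2, 60.0],
    [90, 9, 2100.0],
]
labels = [0, 0, 1, 1, 2]

def cluster_profiles(rows, labels, names):
    """Return {label: {"size": n, feature_name: average, ...}}."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    profiles = {}
    for label, members in sorted(groups.items()):
        averages = {name: mean(col) for name, col in zip(names, zip(*members))}
        profiles[label] = {"size": len(members), **averages}
    return profiles

profiles = cluster_profiles(customers, labels, feature_names)
for label, profile in profiles.items():
    print(label, profile)
```

The averages are taken over the original, unscaled values so each profile reads in real units (days, purchases, dollars), which makes naming the segments easier.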
Common Segment Patterns
While every business is different, clustering frequently reveals some combination of these archetypes:
- Champions: Recent, frequent, high-spending customers. Your best segment. Reward them, ask for referrals, and give them early access to new products.
- Loyal regulars: Consistent purchasers with moderate spending. Reliable revenue. Upsell to increase their average order value.
- New and promising: Recent first-time buyers who resemble champions in their early behavior. Invest in onboarding and first-purchase follow-up to develop them into champions.
- At risk: Previously active customers whose time since last purchase is growing (they are buying less often). Trigger a re-engagement campaign before they lapse completely.
- Price sensitive: Customers who buy primarily during sales and promotions. Profitable only if your margins on discounted items justify the volume. Target with clearance and promotion campaigns.
- Dormant: No recent activity, low engagement. Test a win-back offer on a sample before investing in the full segment.
Keeping Segments Current
Customer behavior changes over time. A champion from last year might be at risk today. Re-run your clustering periodically (monthly or quarterly) to update segment assignments. You can also use the trained clustering model to assign new customers to segments as soon as you have enough behavioral data on them.
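With k-means, assigning a new customer to an existing segment amounts to scaling their features the same way as the training data and picking the nearest cluster center. The sketch below assumes you saved the scaling statistics and centroids from the last run. All numbers, the segment names, and the `assign_segment` helper are hypothetical.

```python
def assign_segment(raw_features, means, stds, centroids):
    """Scale a new customer with the stored statistics, then pick the nearest centroid."""
    scaled = [(x - m) / s for x, m, s in zip(raw_features, means, stds)]
    distances = [
        sum((a - b) ** 2 for a, b in zip(scaled, centroid))
        for centroid in centroids
    ]
    return distances.index(min(distances))

# Hypothetical values saved from the last clustering run.
# Feature order: recency_days, frequency, monetary.
means = [60.0, 6.0, 900.0]
stds = [50.0, 4.0, 700.0]
centroids = [
    [-1.0, 1.2, 1.1],    # recent, frequent, high spend
    [1.8, -1.1, -1.2],   # long inactive, low spend
    [0.5, 0.6, 1.5],     # high spend, less recent
]
segment_names = ["Champions", "Dormant", "Lapsed VIPs"]

new_customer = [5, 11, 1700.0]
segment = assign_segment(new_customer, means, stds, centroids)
print(segment_names[segment])
```

Reusing the stored means and standard deviations matters: rescaling a new customer with fresh statistics would place them in a different coordinate system from the centroids.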
Discover your real customer segments with machine learning clustering. No coding or manual rule-writing required.