How to Detect Fraud With Anomaly Detection
Why Anomaly Detection Works Better Than Rules
Traditional fraud prevention uses hand-written rules: flag transactions over $1,000, block purchases from certain countries, require verification for new accounts. These rules catch known fraud patterns but miss anything they were not written for. Fraudsters adapt quickly, and every new rule you write is a reaction to a technique they have already moved past.
Anomaly detection takes a different approach. Instead of defining what fraud looks like, you define what normal looks like. The model learns the range of legitimate transaction patterns from your historical data. Anything that falls outside that learned range gets flagged, regardless of what kind of fraud it is. This lets the model catch unfamiliar fraud techniques as well as known ones, provided the fraudulent activity looks different from normal behavior.
What Data to Train On
Train your anomaly detection model on legitimate transactions only. The model needs to learn what normal looks like, so feed it a clean dataset of transactions you know are good. Include features that describe the transaction in multiple dimensions:
- Transaction details: Amount, currency, payment method, product category
- Timing: Hour of day, day of week, time since last transaction from this account
- Velocity: Number of transactions in the last hour, last 24 hours, last 7 days from this account
- Geographic: Country, distance from usual location, number of distinct locations in last 24 hours
- Device and session: Device type, browser, IP address reputation score, session duration
- Account: Account age, lifetime transaction count, average order value, typical purchase time
The more dimensions you include, the more types of anomalies the model can detect. A transaction might look normal on any single metric but be anomalous when you consider all metrics together. See How to Prepare Your Data for data preparation guidance.
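Velocity and timing features usually have to be derived from raw timestamps before they can go into a training file. The sketch below shows one way to compute them for a single account with the Python standard library. The function name `velocity_features` and the feature names are illustrative assumptions, not part of any particular tool.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

def velocity_features(history, now):
    """Count an account's earlier transactions in trailing time windows.

    `history` is a sorted list of datetimes for the account's previous
    transactions; `now` is the timestamp of the transaction being scored.
    """
    windows = {
        "tx_last_1h": timedelta(hours=1),
        "tx_last_24h": timedelta(hours=24),
        "tx_last_7d": timedelta(days=7),
    }
    end = bisect_left(history, now)  # only transactions strictly before `now`
    features = {}
    for name, span in windows.items():
        start = bisect_left(history, now - span)
        features[name] = end - start
    # Seconds since the previous transaction, or None for a first purchase.
    features["secs_since_last"] = (
        (now - history[end - 1]).total_seconds() if end > 0 else None
    )
    return features

# Made-up example data for one account.
now = datetime(2024, 5, 1, 12, 0)
history = [
    datetime(2024, 4, 20, 9, 0),
    datetime(2024, 4, 30, 18, 0),
    datetime(2024, 5, 1, 11, 30),
    datetime(2024, 5, 1, 11, 50),
]
print(velocity_features(history, now))
```

Each returned value becomes one column in the training data, alongside the transaction, geographic, device, and account features.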
Building the Fraud Detection Model
Pull at least 1,000 legitimate transactions (the more the better) with the features listed above. Remove any known fraudulent transactions from this dataset. You want the model to learn only what normal looks like.
Upload the CSV to the Data Aggregator app. Choose isolation forest as the algorithm. It is a strong general-purpose anomaly detector and handles high-dimensional transaction data well. There is no target column for anomaly detection because the model is unsupervised: it learns the shape of normal data rather than predicting a label.
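For intuition about why an isolation forest works without a target column, here is a minimal from-scratch sketch of the general technique. It is not the Data Aggregator app's implementation, and the data, function names, and parameter defaults are assumptions chosen for illustration. The idea is that random splits isolate unusual points in fewer steps than typical ones, so a short average path through the trees means a high anomaly score.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random cut point."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # nothing to split on for this feature
        return {"size": len(rows)}
    cut = rng.uniform(lo, hi)
    return {
        "feature": feature,
        "cut": cut,
        "left": build_tree([r for r in rows if r[feature] < cut], depth + 1, max_depth, rng),
        "right": build_tree([r for r in rows if r[feature] >= cut], depth + 1, max_depth, rng),
    }

def path_length(tree, row, depth=0):
    """Depth at which `row` lands in a leaf, adjusted for the leaf's size."""
    if "feature" not in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if row[tree["feature"]] < tree["cut"] else "right"
    return path_length(tree[branch], row, depth + 1)

def fit_forest(rows, n_trees=100, sample_size=64, seed=0):
    """Build trees on random subsamples of the (legitimate) training rows."""
    if len(rows) < 2:
        raise ValueError("need at least two training rows")
    rng = random.Random(seed)
    size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(rows, size), 0, max_depth, rng) for _ in range(n_trees)]
    return trees, size

def anomaly_score(forest, row):
    """Score between 0 and 1; higher means easier to isolate, so more anomalous."""
    trees, size = forest
    mean_path = sum(path_length(t, row) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(size))

# Synthetic "normal" transactions as (amount, hour of day). Illustrative only.
data_rng = random.Random(42)
normal = [(data_rng.gauss(50, 10), data_rng.gauss(14, 2)) for _ in range(500)]
forest = fit_forest(normal)
typical = anomaly_score(forest, (50, 14))
unusual = anomaly_score(forest, (900, 3))
print(round(typical, 3), round(unusual, 3))
```

The large 3 a.m. purchase should score noticeably higher than the typical afternoon purchase, even though the training data contained no fraud labels.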
The contamination parameter tells the model what fraction of your data to treat as anomalous. In practice, it sets how strict the flagging threshold is. For fraud detection, start between 1% and 5% depending on your expected fraud rate. A lower contamination means fewer false positives but might miss some fraud. A higher contamination catches more fraud but generates more alerts to investigate.
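One common way to read the contamination setting is as a quantile on the training scores: the threshold is placed so that roughly that share of the training data would be flagged. The sketch below illustrates that interpretation with made-up scores. The helper `threshold_for_contamination` is hypothetical, and the app may compute its threshold differently.

```python
def threshold_for_contamination(train_scores, contamination):
    """Return the cut-off that flags about `contamination` of the training scores.

    Scores greater than or equal to the returned value count as anomalies.
    """
    if not 0 < contamination < 0.5:
        raise ValueError("contamination should be between 0 and 0.5")
    ranked = sorted(train_scores, reverse=True)
    n_flagged = max(1, round(contamination * len(ranked)))
    return ranked[n_flagged - 1]

# Made-up anomaly scores for 100 training transactions: 0.00, 0.01, ..., 0.99.
scores = [i / 100 for i in range(100)]

strict = threshold_for_contamination(scores, 0.01)  # flags the top 1%
loose = threshold_for_contamination(scores, 0.05)   # flags the top 5%
print(strict, sum(s >= strict for s in scores))
print(loose, sum(s >= loose for s in scores))
```

Raising contamination lowers the threshold, which is why it produces both more caught fraud and more alerts.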
Train the model and then test it against a held-out set of transactions that includes both legitimate and known fraudulent ones. Check how many of the known fraudulent transactions the model correctly flags as anomalies (recall) and how many legitimate transactions it incorrectly flags (false positive rate).
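Both metrics can be computed directly from the held-out results. This is a minimal sketch that assumes you have two parallel lists: whether the model flagged each transaction, and whether it was actually fraud. The function name `evaluate` and the sample data are illustrative.

```python
def evaluate(flags, labels):
    """Compute recall and false positive rate.

    `flags[i]` is True if the model flagged transaction i;
    `labels[i]` is True if transaction i is known fraud.
    """
    pairs = list(zip(flags, labels))
    tp = sum(1 for f, y in pairs if f and y)          # fraud, flagged
    fn = sum(1 for f, y in pairs if not f and y)      # fraud, missed
    fp = sum(1 for f, y in pairs if f and not y)      # legitimate, flagged
    tn = sum(1 for f, y in pairs if not f and not y)  # legitimate, passed
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"recall": recall, "false_positive_rate": fpr}

# Made-up held-out set: 4 known fraud cases followed by 16 legitimate ones.
labels = [True] * 4 + [False] * 16
flags = [True, True, True, False] + [True, True] + [False] * 14
metrics = evaluate(flags, labels)
print(metrics)
```

In this made-up example the model catches 3 of 4 fraud cases (recall 0.75) and wrongly flags 2 of 16 legitimate transactions (false positive rate 0.125).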
Send each new transaction through the model before processing it. The model returns an anomaly score. Transactions above your threshold get flagged for manual review or automatic blocking. Because scoring costs zero credits, there is no cost tradeoff to checking every transaction.
Handling Flagged Transactions
Not every flagged transaction is actual fraud. The model flags anything unusual, which could include legitimate customers making unusually large purchases, buying gifts at unexpected times, or shopping from a new location. Design your response based on risk level:
- Low anomaly score: Process normally but log for batch review
- Medium anomaly score: Process but send to a review queue for human verification within 24 hours
- High anomaly score: Hold the transaction and require additional verification (email confirmation, phone call, or additional authentication)
- Extreme anomaly score: Block automatically and alert your fraud team
Integrate this scoring into your workflow automation to route flagged transactions automatically. A chain command can score the transaction, check the anomaly level, and trigger the appropriate response without manual intervention.
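The tiered response above can be sketched as a small routing function. The threshold values and action names here are hypothetical placeholders. Calibrate the real cut-offs against your own score distribution and review capacity.

```python
def route_transaction(score, thresholds=(0.55, 0.65, 0.80)):
    """Map an anomaly score to a response tier.

    `thresholds` holds the hypothetical lower bounds for the
    medium, high, and extreme tiers, in ascending order.
    """
    medium, high, extreme = thresholds
    if score >= extreme:
        return "block_and_alert"           # block automatically, alert fraud team
    if score >= high:
        return "hold_for_verification"     # require email, phone, or extra auth
    if score >= medium:
        return "process_and_queue_review"  # human review within 24 hours
    return "process_and_log"               # process normally, log for batch review

for score in (0.30, 0.60, 0.70, 0.95):
    print(score, route_transaction(score))
```

Keeping the thresholds as a parameter makes it easy to tighten or relax a tier without changing the routing logic.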
Beyond Payment Fraud
The same anomaly detection approach works for other types of fraud and abuse:
- Account takeover: Detect when an account's login behavior suddenly changes (new device, new location, unusual time)
- Fake signups: Identify bot-generated accounts by their registration patterns (speed, email format, missing profile fields)
- Promo abuse: Flag customers creating multiple accounts to exploit promotional offers
- Return fraud: Detect unusual return patterns that differ from legitimate customer behavior
Catch fraud before it costs you money. Train an anomaly detection model on your transaction data and score every transaction for free.