How to Detect Fraud With Anomaly Detection

Anomaly detection catches fraudulent transactions by learning what legitimate activity looks like and flagging anything that deviates significantly from that pattern. Unlike rule-based fraud checks, which only catch known patterns, anomaly detection can catch novel fraud techniques because it flags anything unusual, including fraud types it has never seen before. Because predictions cost zero credits after training, you can check every single transaction in real time.

Why Anomaly Detection Works Better Than Rules

Traditional fraud prevention uses hand-written rules: flag transactions over $1,000, block purchases from certain countries, require verification for new accounts. These rules catch known fraud patterns but miss anything creative. Fraudsters adapt quickly, and every new rule you write is a reaction to a technique they have already moved past.

Anomaly detection takes a fundamentally different approach. Instead of defining what fraud looks like, you define what normal looks like. The model learns the full range of legitimate transaction patterns from your historical data. Anything that falls outside that learned normal range gets flagged, regardless of what kind of fraud it is. This catches both known and previously unseen fraud techniques, provided the fraudulent activity looks different from normal behavior.

What Data to Train On

Train your anomaly detection model on legitimate transactions only. The model needs to learn what normal looks like, so feed it a clean dataset of transactions you know are good. Include features that describe each transaction along multiple dimensions, such as amount, time of purchase, location, and account age.

The more dimensions you include, the more types of anomalies the model can detect. A transaction might look normal on any single metric but be anomalous when you consider all metrics together. See How to Prepare Your Data for data preparation guidance.
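The point about combined metrics can be illustrated with a small sketch. It does not use isolation forest or the platform's model; it swaps in a much simpler nearest-neighbor distance on standardized features, and the transaction history, feature choice, and function names are all made up for illustration. The suspicious transaction sits inside the normal range for amount and for hour when each is checked alone, yet it is far from every legitimate transaction once both are considered together.

```python
import math
import statistics

# Hypothetical legitimate history as (amount_usd, hour_of_day) pairs:
# small purchases at night, large purchases in the afternoon.
LEGIT = [(12, 2), (15, 3), (9, 1), (20, 4),
         (450, 14), (500, 15), (480, 13), (520, 16)]

COLUMNS = list(zip(*LEGIT))
MEANS = [statistics.mean(col) for col in COLUMNS]
STDEVS = [statistics.pstdev(col) for col in COLUMNS]


def scale(row):
    """Standardize each feature so amount and hour are comparable."""
    return [(v - m) / s for v, m, s in zip(row, MEANS, STDEVS)]


SCALED_LEGIT = [scale(row) for row in LEGIT]


def in_range_per_feature(row):
    """True if every value lies inside the range seen for that feature alone."""
    return all(min(col) <= v <= max(col) for v, col in zip(row, COLUMNS))


def nearest_distance(row):
    """Distance to the closest legitimate transaction; larger means more unusual."""
    z = scale(row)
    return min(math.dist(z, other) for other in SCALED_LEGIT)


typical = (490, 14)    # large purchase in the afternoon
suspicious = (500, 3)  # large purchase at 3 a.m.

for name, row in [("typical", typical), ("suspicious", suspicious)]:
    print(name, in_range_per_feature(row), round(nearest_distance(row), 2))
```

Both rows pass the per-feature range check, but the suspicious one is several times farther from its nearest legitimate neighbor, which is the kind of gap a multi-dimensional model can pick up and a single-metric rule cannot.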

Building the Fraud Detection Model

Step 1: Export clean transaction history.
Pull at least 1,000 legitimate transactions (more is better) with the features described above. Remove any known fraudulent transactions from this dataset so the model learns only what normal looks like.
Step 2: Upload and select isolation forest.
Upload the CSV to the Data Aggregator app. Choose isolation forest as the algorithm. It is a strong general-purpose anomaly detector and handles high-dimensional transaction data well. There is no target column for anomaly detection because you are training on normal data only.
Step 3: Set contamination rate.
The contamination parameter tells the model what fraction of data to treat as anomalous, which in practice sets how extreme a transaction must be before it is flagged. For fraud detection, start with 1-5% depending on your expected fraud rate. A lower contamination means fewer false positives but might miss some fraud. A higher contamination catches more fraud but generates more alerts to investigate.
Step 4: Train and validate.
Train the model and then test it against a held-out set of transactions that includes both legitimate and known fraudulent ones. Check how many of the known fraudulent transactions the model correctly flags as anomalies (recall) and how many legitimate transactions it incorrectly flags (false positive rate).
Step 5: Deploy for real-time scoring.
Send each new transaction through the model before processing it. The model returns an anomaly score. Transactions above your threshold get flagged for manual review or automatic blocking. Because scoring costs zero credits, there is no cost tradeoff to checking every transaction.
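Steps 3 and 4 can be sketched in plain Python to show what the numbers mean. This is not the platform's API: the function names, the sample scores, and the convention that a higher score means more anomalous are all assumptions made for illustration. The sketch turns a contamination rate into a score threshold, then measures recall and false positive rate on a labeled held-out set.

```python
def threshold_from_contamination(train_scores, contamination):
    """Pick the score at or above which roughly `contamination` of training scores fall.

    Assumes higher scores mean more anomalous (an assumption, not a platform fact).
    """
    if not 0 < contamination < 0.5:
        raise ValueError("contamination should be between 0 and 0.5")
    ranked = sorted(train_scores)
    n_flagged = max(1, round(len(ranked) * contamination))
    return ranked[len(ranked) - n_flagged]


def evaluate(scores, is_fraud, threshold):
    """Return (recall, false_positive_rate) for a labeled held-out set."""
    flagged = [s >= threshold for s in scores]
    true_pos = sum(f and y for f, y in zip(flagged, is_fraud))
    false_pos = sum(f and not y for f, y in zip(flagged, is_fraud))
    fraud_total = sum(is_fraud)
    legit_total = len(is_fraud) - fraud_total
    recall = true_pos / fraud_total if fraud_total else 0.0
    fpr = false_pos / legit_total if legit_total else 0.0
    return recall, fpr


# Hypothetical anomaly scores for 100 legitimate training transactions.
train_scores = [i / 100 for i in range(100)]
threshold = threshold_from_contamination(train_scores, contamination=0.05)

# Hypothetical held-out set with known labels (True = known fraud).
holdout_scores = [0.10, 0.40, 0.97, 0.99, 0.96, 0.50, 0.30, 0.98]
holdout_is_fraud = [False, False, True, True, False, True, False, False]

recall, fpr = evaluate(holdout_scores, holdout_is_fraud, threshold)
print(f"threshold={threshold:.2f} recall={recall:.2f} false_positive_rate={fpr:.2f}")
```

Rerunning the evaluation with a few different contamination values is one way to see the tradeoff described in Step 3 before settling on a threshold for Step 5.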

Handling Flagged Transactions

Not every flagged transaction is actual fraud. The model flags anything unusual, which could include legitimate customers making unusually large purchases, buying gifts at unexpected times, or shopping from a new location. Design your response based on risk level, from manual review for moderately unusual transactions to automatic blocking for the most extreme ones.

Integrate this scoring into your workflow automation to route flagged transactions automatically. A chain command can score the transaction, check the anomaly level, and trigger the appropriate response without manual intervention.
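As a rough sketch of that routing logic, the function below maps an anomaly score to one of three actions. The thresholds, the 0 to 1 score range, and the action names are hypothetical placeholders rather than platform defaults, and the chain command itself is not shown; the values would need tuning against your own validation results.

```python
from dataclasses import dataclass

# Hypothetical cutoffs; tune these against your own held-out data.
REVIEW_THRESHOLD = 0.70
BLOCK_THRESHOLD = 0.90


@dataclass(frozen=True)
class Decision:
    action: str
    reason: str


def route(score, review_at=REVIEW_THRESHOLD, block_at=BLOCK_THRESHOLD):
    """Map an anomaly score in [0, 1] (higher = more unusual) to a response."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score is assumed to lie between 0 and 1")
    if score >= block_at:
        return Decision("block", f"score {score:.2f} >= {block_at:.2f}")
    if score >= review_at:
        return Decision("manual_review", f"score {score:.2f} >= {review_at:.2f}")
    return Decision("approve", f"score {score:.2f} below review threshold")


for s in (0.20, 0.75, 0.95):
    print(s, route(s))
```

Keeping the thresholds as parameters makes it easy to tighten or loosen each tier as false positive data accumulates.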

False positives are expected. Any fraud detection system will flag some legitimate transactions. The goal is to minimize false positives while catching as much real fraud as possible. Track your false positive rate and adjust the contamination parameter and score thresholds over time. A 5% false positive rate that catches 90% of fraud is often a reasonable starting point, though the right balance depends on your transaction volume and how many alerts your team can review.
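To see what that tradeoff means in review workload, it helps to run the arithmetic. The sketch below assumes 100,000 transactions and a 1% fraud rate; both figures are invented for illustration, and the function name is hypothetical.

```python
def alert_breakdown(total, fraud_rate, recall, false_positive_rate):
    """Estimate how many alerts a detector raises and how many are real fraud."""
    fraud = total * fraud_rate
    legit = total - fraud
    caught = fraud * recall                      # true positives
    false_alerts = legit * false_positive_rate   # legitimate but flagged
    alerts = caught + false_alerts
    precision = caught / alerts if alerts else 0.0
    return {
        "alerts": alerts,
        "caught_fraud": caught,
        "false_alerts": false_alerts,
        "missed_fraud": fraud - caught,
        "precision": precision,
    }


# Assumed volume and fraud rate; recall and false positive rate from the text.
result = alert_breakdown(total=100_000, fraud_rate=0.01,
                         recall=0.90, false_positive_rate=0.05)
for key, value in result.items():
    print(f"{key}: {value:,.2f}")
```

Under these assumptions the detector raises roughly 5,850 alerts, of which about 900 are real fraud, so only around 15% of alerts are true positives. When fraud is rare, even a small false positive rate dominates the alert queue, which is why the thresholds are worth revisiting as volume grows.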

Beyond Payment Fraud

The same anomaly detection approach works for other types of fraud and abuse.

Catch fraud before it costs you money. Train an anomaly detection model on your transaction data and score every transaction for free.
