Identify outliers using IQR, Z-score, and Modified Z-score methods with visualization
An outlier is a data point that differs significantly from other observations. Outliers can occur due to measurement errors, data entry errors, or genuine extreme values in the population.
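Formula: Lower fence = Q1 − k·IQR, upper fence = Q3 + k·IQR, where IQR = Q3 − Q1 and k = 1.5 by default. Values outside the fences are flagged.
Advantages: Robust and distribution-free, because the quartiles are barely affected by extreme values.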
When to use: General-purpose outlier detection, especially for skewed data
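As a minimal sketch of the IQR method, the function below flags values outside the Tukey fences [Q1 − k·IQR, Q3 + k·IQR]. The function name `iqr_outliers`, the sample data, and the choice of linearly interpolated ("inclusive") quartiles are assumptions made for illustration; other quartile conventions can move the fences slightly.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: quartiles use linear interpolation (the "inclusive" method).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]

# Hypothetical sample: one mild and one extreme high value.
sample = [10, 11, 12, 12, 12, 13, 13, 14, 18, 102]
print(iqr_outliers(sample))          # k=1.5 flags 18 and 102
print(iqr_outliers(sample, k=3.0))   # k=3.0 flags only 102
```

Raising k from 1.5 to 3.0 widens the fences, which is why the larger multiplier suits cases where false positives are costly.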
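Formula: Z = (x − μ) / σ, where μ is the mean and σ is the standard deviation. Values with |Z| ≥ 3 are flagged.
Advantages: Simple to compute and interpret when the data are roughly normal.
Disadvantages: The mean and standard deviation are themselves distorted by outliers, so extreme values can go undetected, especially in small samples.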
When to use: Large datasets that are approximately normally distributed
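The following sketch shows one way to apply the Z-score rule Z = (x − mean) / standard deviation with a cutoff of |Z| ≥ 3. The function name `zscore_outliers`, the use of the sample standard deviation, and both data sets are assumptions for illustration only.

```python
from statistics import mean, stdev

def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score meets or exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # Assumption: sample standard deviation (n - 1 divisor).
    if sigma == 0:
        return []  # All values identical, so nothing can be flagged.
    return [x for x in values if abs((x - mu) / sigma) >= threshold]

# Hypothetical data: 21 points with one extreme value.
large = [10, 11, 12, 13, 14] * 4 + [60]
print(zscore_outliers(large))   # flags 60

# Hypothetical data: only 10 points, where 102 inflates the mean and spread.
small = [10, 11, 12, 12, 12, 13, 13, 14, 18, 102]
print(zscore_outliers(small))   # flags nothing: 102 scores about 2.84
```

The second call illustrates why this method is better suited to larger samples: in a small sample a single extreme value inflates the standard deviation enough to hide itself.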
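Formula: Mz = 0.6745·(x − median) / MAD, where MAD is the median of the absolute deviations |x − median|. Values with |Mz| ≥ 3.5 are flagged.
Advantages: Robust against outliers, because the median and MAD are barely affected by extreme values.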
When to use: When you suspect outliers and want a method that those outliers cannot distort
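A minimal sketch of the Modified Z-score, Mz = 0.6745·(x − median) / MAD, with the usual |Mz| ≥ 3.5 cutoff, might look like this. The function name `modified_zscore_outliers`, the sample data, and the decision to flag nothing when MAD is zero are assumptions for illustration.

```python
from statistics import median

def modified_zscore_outliers(values, threshold=3.5):
    """Return the values whose absolute modified Z-score meets the threshold."""
    med = median(values)
    # MAD: median of the absolute deviations from the median.
    mad = median([abs(x - med) for x in values])
    if mad == 0:
        return []  # Assumption: with zero MAD the score is undefined, so flag nothing.
    return [x for x in values if abs(0.6745 * (x - med) / mad) >= threshold]

# Hypothetical sample: the same ten points a plain Z-score would leave unflagged.
sample = [10, 11, 12, 12, 12, 13, 13, 14, 18, 102]
print(modified_zscore_outliers(sample))   # flags 18 and 102
```

Because the median and MAD ignore how far the extreme values lie, the two high points are flagged even in this ten-point sample.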
| Scenario | Recommended Method |
|---|---|
| General purpose | IQR (k=1.5) |
| Normal distribution | Z-Score |
| Skewed data | IQR or Modified Z |
| Small sample | Modified Z |
| Many outliers | Modified Z |
| Conservative (few false positives) | IQR (k=3.0) |
Use IQR (Tukey fences) for robust, distribution‑free detection; Z‑score for roughly normal data; Modified Z‑score (MAD) for added robustness against outliers.
Typical starting thresholds: IQR flags values outside [Q1−1.5·IQR, Q3+1.5·IQR]; Z-score flags |Z| ≥ 3 (sometimes 2.5); Modified Z-score flags |Mz| ≥ 3.5.
Outliers should not be removed automatically. Investigate the cause first (a data entry error, a different underlying process). If the outliers are genuine, consider robust summaries or transformations instead of deletion.
Outliers are not always errors. They may reflect rare but valid cases, so always apply domain knowledge before excluding points.
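To illustrate what a robust summary buys you when an outlier is genuine and must stay in the data, the sketch below compares the mean with the median on a small sample. The function name `summarize` and the data are hypothetical.

```python
from statistics import mean, median

def summarize(values):
    """Return the mean and median so their sensitivity to extremes can be compared."""
    return {"mean": mean(values), "median": median(values)}

# Hypothetical sample with one genuine extreme value (102).
with_extreme = [10, 11, 12, 12, 12, 13, 13, 14, 18, 102]
without_extreme = [x for x in with_extreme if x != 102]

print(summarize(with_extreme))     # mean is pulled far above the bulk of the data
print(summarize(without_extreme))  # median barely moves; the mean changes a lot
```

Reporting the median alongside (or instead of) the mean keeps the extreme point in the data set without letting it dominate the summary.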