Identifying outliers with the 1.5xIQR rule
An outlier is a data point that lies outside the overall pattern in a distribution.
The distribution below shows the scores on a driver's test for applicants. How many outliers do you see?
Different people may count different numbers of outliers in the same distribution. Statisticians have developed many ways to identify what should and shouldn't be called an outlier.
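A commonly used rule says that a data point is an outlier if it is more than $1.5 \cdot \text{IQR}$ above the third quartile or below the first quartile, where $\text{IQR} = Q_3 - Q_1$ is the interquartile range. Said differently, low outliers are below $Q_1 - 1.5 \cdot \text{IQR}$ and high outliers are above $Q_3 + 1.5 \cdot \text{IQR}$.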
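The two cutoffs in this rule can be sketched in a few lines of Python. This is only an illustration: the function name `outlier_fences` and the quartile values passed to it are made up for the example and do not come from the driver's test data.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (low_fence, high_fence) for the k * IQR rule."""
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical quartiles, chosen only to show the arithmetic.
low_fence, high_fence = outlier_fences(q1=10, q3=20)
print(low_fence, high_fence)  # -5.0 35.0
```

With these made-up quartiles, any value below -5 would count as a low outlier and any value above 35 as a high outlier.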
Let's try it out on the distribution from above.
Step 1) Find the median, quartiles, and interquartile range
Here are the scores listed out.
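Step 1 can also be sketched in Python. The sketch below assumes the "median of each half" method for quartiles, where the middle value is left out of both halves when the number of data points is odd. Other quartile conventions exist and can give slightly different answers. The helper name `quartiles` and the `sample` list are made up for illustration and are not the driver's test scores.

```python
from statistics import median

def quartiles(data):
    """Return (q1, median, q3, iqr) using the median-of-halves method."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]          # values below the middle position
    upper = values[half + n % 2:]  # skips the middle value when n is odd
    q1, q3 = median(lower), median(upper)
    return q1, median(values), q3, q3 - q1

# Hypothetical data set, not the scores from the article.
sample = [1, 3, 4, 6, 7, 8, 30]
print(quartiles(sample))  # (3, 6, 8, 5)
```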
Step 2) Calculate $1.5 \cdot \text{IQR}$ below the first quartile and check for low outliers.
Step 3) Calculate $1.5 \cdot \text{IQR}$ above the third quartile and check for high outliers.
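Steps 2 and 3 amount to comparing every data point against two cutoffs. Here is a minimal sketch, assuming the quartiles have already been found. The function name `find_outliers`, the `sample` list, and the quartile values are hypothetical and are not taken from the driver's test data.

```python
def find_outliers(data, q1, q3):
    """Return (low_outliers, high_outliers) under the 1.5 * IQR rule."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr   # Step 2: 1.5 * IQR below the first quartile
    high_fence = q3 + 1.5 * iqr  # Step 3: 1.5 * IQR above the third quartile
    low = [x for x in data if x < low_fence]
    high = [x for x in data if x > high_fence]
    return low, high

# Hypothetical data set with made-up quartiles q1 = 3 and q3 = 8.
sample = [1, 3, 4, 6, 7, 8, 30]
print(find_outliers(sample, q1=3, q3=8))  # ([], [30])
```

In this made-up sample the cutoffs are -4.5 and 15.5, so there are no low outliers and 30 is the only high outlier.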
Bonus learning: Showing outliers in box and whisker plots
Box and whisker plots will often show outliers as dots that are separate from the rest of the plot.
Here's a box and whisker plot of the distribution from above that does not show outliers.
Here's a box and whisker plot of the same distribution that does show outliers.
Notice how the outliers are shown as dots, and the whisker had to change. The whisker now extends only to the farthest point in the data set that wasn't an outlier.
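The way the whiskers move can be sketched in Python as well. This is an illustration under assumptions: the function name `whisker_ends`, the `sample` list, and its quartiles are invented for the example, and plotting libraries may compute quartiles differently.

```python
def whisker_ends(data, q1, q3):
    """Return (low_whisker, high_whisker, outliers) under the 1.5 * IQR rule."""
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    # Each whisker stops at the most extreme point that is not an outlier.
    return min(inside), max(inside), outliers

# Hypothetical data set with made-up quartiles q1 = 3 and q3 = 8.
sample = [1, 3, 4, 6, 7, 8, 30]
print(whisker_ends(sample, q1=3, q3=8))  # (1, 8, [30])
```

In this made-up sample the upper whisker stops at 8 instead of reaching 30, and 30 would be drawn as a separate dot.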
Here's the original data set again for comparison.