A Comprehensive Guide on Median
Published on September 21st, 2021, Revised On July 20, 2023
What is Median?
The word ‘median’ means exactly what its counterpart, ‘medium,’ means. It is the central value in several given values. The values, when taken all together, form what is called a dataset.
Median is that value in a dataset under which 50% or half of the values from the set fall. It’s the middle value. That middle can have numerous numbers on either side of it. The values may be given in ascending, descending, or no specific order at all, but they have to be arranged in order before the median can be identified.
How is median calculated?
Calculating the median is just as straightforward as calculating the mean. The methods to calculate the median for any dataset are listed below:
Method 1 – The Direct Method for Odd-Numbered Datasets
As the name suggests, this method involves identifying the median value directly, by simply looking at the number of values in the given dataset. It is also referred to as the short-hand method in some school textbooks.
Example #1:
In the following dataset, the median is the middle value, which is 30. It has two values on the right as well as the left side.
10 20 30 40 50
Example #2:
Similarly, in this dataset,
2.5 9.35 3.66 3.33 5.22 4.666 9.44 0.224 10.3
the values have to be first arranged in ascending order. The new dataset then becomes:
0.224 2.5 3.33 3.66 4.666 5.22 9.35 9.44 10.3
The median of this rearranged dataset is 4.666, with an equal number of values on either side of it.
Important points to remember: Firstly, the direct method to calculate the median is only suitable for a dataset that contains an odd number of values (such as 5 and 9 in the above examples). In such a case, an equal number of values will be on the left and right side of the central value (median). Therefore, it’s easier to identify the median from such a dataset. Secondly, ordering the values in ascending order is the first step to calculate the median accurately.
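The direct method can be sketched in Python as follows. This is only a minimal illustration: the function name median_odd is an assumption made for this example and does not come from any library.

```python
def median_odd(values):
    """Return the middle value of a dataset with an odd number of values."""
    ordered = sorted(values)  # first step: arrange the values in ascending order
    n = len(ordered)
    if n % 2 == 0:
        # The direct method only applies when one value sits exactly in the middle.
        raise ValueError("the direct method needs an odd number of values")
    return ordered[n // 2]    # zero-based index of the middle value


print(median_odd([10, 20, 30, 40, 50]))
print(median_odd([2.5, 9.35, 3.66, 3.33, 5.22, 4.666, 9.44, 0.224, 10.3]))
```

With the two example datasets above, the sketch prints 30 and 4.666.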
Method 2 – The Indirect Method for Even- and Odd-Numbered Datasets
This method makes use of a statistical formula to find the position of the median in the ordered dataset. For a dataset containing an odd number of values, the formula points directly at the median. For a dataset containing an even number of values, it gives a fractional position (such as 4.5), which means the median lies halfway between the two neighbouring values. The formula is:
{(n + 1) ÷ 2}th
Where,
- ‘n’ represents the total number of values in the dataset and
- ‘th’ simply marks the result as a position in the ordered dataset, such as 8th, 10th, etc.
Example #3:
Taking the same dataset as used in example #1 above, the median would be the same, of course, even by applying the formula. It can be calculated as follows:
10 20 30 40 50
These are 5 values, so placing that in the formula,
{(5 + 1) ÷ 2}th
{6 ÷ 2}th = 3rd
The median, therefore, is the 3rd value in the dataset, which is 30.
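The position formula can be checked with a few lines of Python. This is a rough sketch; the helper name median_position is hypothetical and chosen only for this example.

```python
def median_position(n):
    """1-based position of the median, following the {(n + 1) / 2}th formula."""
    return (n + 1) / 2


values = sorted([10, 20, 30, 40, 50])
position = median_position(len(values))  # 3.0 for five values
median = values[int(position) - 1]       # convert the 1-based position to a 0-based index
print(position, median)
```

For an even number of values the same helper returns a fractional position; for example, median_position(8) gives 4.5, which falls between the 4th and 5th values.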
Method 3 – The Indirect Method for Even-Numbered Datasets
Example #4:
The median for the following even number of values in a dataset can be calculated as such:
2.5 9.35 3.66 3.33 4.666 9.44 0.224 5.22
Rearranging these values in ascending order, the new dataset becomes:
0.224 2.5 3.33 3.66 4.666 5.22 9.35 9.44
The formulae are slightly different for calculating the median of an even-numbered dataset, and they are:
Formula #1: {(n/2)th observation + (n/2 + 1)th observation} / 2
Formula #2: Adding the 2 central values and dividing by 2.
Formula #3 (only for large datasets): the (n/2)th and (n/2 + 1)th positions locate the two central values, which are then averaged.
Using formula #1 for the above dataset:
0.224 2.5 3.33 3.66 4.666 5.22 9.35 9.44
{(8/2)th observation + (8/2 + 1)th observation} / 2
{(4)th observation + (5)th observation} / 2
(3.66 + 4.666) / 2 = 4.163
The median is therefore 4.163 for the above dataset. Even though there’s no such value in the dataset, the median lies halfway between the 4th and 5th values. Formula #2, simply adding the central two values and dividing by 2, is the same calculation stated in words, so the median again comes to:
(3.66 + 4.666) ÷ 2 = 4.163.
And lastly, formula #3 is most suitable for a large number of values in a dataset, such as 150, where it is used to locate the central positions:
150 ÷ 2 = 75, so the median is the average of the 75th and 76th values of the ordered dataset.
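The calculation for an even number of values, averaging the two central values of the ordered dataset, can be sketched in Python as below. The function name median_even is an assumption made for this illustration. The comparison at the end uses statistics.median from the Python standard library.

```python
import statistics


def median_even(values):
    """Average the two central values of a dataset with an even number of values."""
    ordered = sorted(values)     # arrange in ascending order first
    n = len(ordered)
    if n == 0 or n % 2 != 0:
        raise ValueError("this sketch expects a non-empty, even-sized dataset")
    lower = ordered[n // 2 - 1]  # the (n/2)th observation, counted from 1
    upper = ordered[n // 2]      # the (n/2 + 1)th observation, counted from 1
    return (lower + upper) / 2


data = [0.224, 2.5, 3.33, 3.66, 4.666, 5.22, 9.35, 9.44]
result = median_even(data)
print(round(result, 3))                                   # 4.163
print(round(statistics.median(data), 3) == round(result, 3))
```

The two central values here are 3.66 and 4.666, so the sketch prints 4.163, and the standard-library function agrees with it.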
Where is median used?
Fields like finance, maths, statistics, accounting, and even academic research make regular use of the median. In daily life, too, the median is calculated, for instance, when a country discloses how much its government typically pays its workers. The median describes such a figure well, as it represents the central value of a very large dataset.
Is calculating the median important? Why or why not?
Just like the mean, the median is important. In fact, the use of the median starts where the use of the mean ends. Its importance can be gauged from the following main reasons:
- Where the mean averages in very high or very low values and is pulled toward them, the median identifies the central value in very large sets of data more accurately.
- Median is not affected much by how big or small a dataset is, unlike the mean, which is best for not-so-large datasets.
- Median is very ‘robust’ in statistics because it separates the values in a dataset, small or large, on a 50% basis: half fall below it and half above it.
- If a dataset contains anomalies, otherwise called outliers (very large or very small values that can skew the central value of the dataset too far to the right or to the left), calculating the median gives a more representative result.
- It’s easy to plot graphs based on calculations obtained from the median, as it allows for plotting skewed values too.
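The effect of an outlier on the mean and the median can be illustrated with a short Python sketch. The salary figures below are hypothetical and chosen only for this example. The functions mean and median come from the statistics module of the Python standard library.

```python
import statistics

# Hypothetical monthly salaries, invented for illustration only.
salaries = [2100, 2300, 2500, 2700, 2900]
# The same dataset with one assumed outlier added at the top end.
with_outlier = salaries + [25000]

print("without outlier:", statistics.mean(salaries), statistics.median(salaries))
print("with outlier:   ", statistics.mean(with_outlier), statistics.median(with_outlier))
```

In this sketch the single large value moves the mean from 2500 to 6250, while the median only moves from 2500 to 2600.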
When is median NOT calculated?
In situations where an average alone gives the central value, the median does not need to be calculated. If a dataset does not contain any anomalies, the median need not be calculated either. Last but not least, calculating the median can be tedious work, especially when it’s for an even-numbered dataset.
Frequently Asked Questions
Both are accurate, but the choice to use mean or median depends on the kind of datasets available.
No, because the mean is simply the average, that is, the sum of all the numbers in a dataset divided by the total number of values in the set. A median, however, is the central value, and that value may not always be one of the values explicitly included in the dataset. Furthermore, the median can be found for odd-numbered datasets too.
Median is denoted as either ‘M’ or as ‘x̃’, which is read as ‘x-tilde.’
Calculating the median is preferred over the mean in situations where there are one or more outliers in the dataset. The outliers can be either very small or very large values, lying far to the left or right of the rest. That means there are either too many numbers to the right of a certain value (the outlier) or to its left. For instance, in the dataset below, there is one outlier, -10, with 6 values to its right. Such a dataset can’t be plotted on a bell graph, either.
-10 10 20 30 40 50 60