Friday, February 6, 2026

Mean, Median and Mode With Python Examples

This post explains mean, median and mode, which are measures of central tendency and help to summarize data. A measure of central tendency is a single value that identifies the central position within a set of data.

Here we'll look at how to calculate the mean, median and mode, and which one is more appropriate in a given scenario.

Mean

The mean, which is the arithmetic average, is calculated by summing all the values in the data set and dividing by the number of values. If there are n values \(x_1, x_2, \dots, x_n \), then the mean \( \overline{x} \) (x bar) is calculated as:

$$ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n} {n}$$

Using summation notation, the same thing can be written as:

$$ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i $$

For example, if we have a data set of 10 values as given below-

5, 8, 12, 15, 18, 20, 22, 25, 30, 35

Then the sum of the values is-

5 + 8 + 12 + 15 + 18 + 20 + 22 + 25 + 30 + 35 = 190

And the mean is \( \overline{x} \) = 190/10 = 19
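The same calculation can be sketched in plain Python, following the formula term by term. The variable names are only illustrative.

```python
# Data set from the worked example above
values = [5, 8, 12, 15, 18, 20, 22, 25, 30, 35]

n = len(values)       # number of values in the data set
total = sum(values)   # x_1 + x_2 + ... + x_n
mean = total / n      # x bar

print("Sum is", total)   # Sum is 190
print("Mean is", mean)   # Mean is 19.0
```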

The mean is a good choice when the data is roughly symmetric with no extreme values, as in a normal distribution.

When mean is not the best choice

The mean may not be the best choice when data is skewed, because the mean is sensitive to outliers. In skewed data, outliers (very high or very low values) can drag the mean away from the center.

For example, if values are- 10, 12, 13, 14, 15, 100

Then the mean is- 164/6 = 27.33

As you can see, the mean is pulled away from the center by the one extreme value, 100: it ends up larger than five of the six values. In such cases the median is a better option.

Median

The median is the middle value in an ordered (ascending or descending) set of data. The formula for the median is given below-

  1. If the dataset has an odd number of values, it is the middle value. $$ \text{Median} = \left(\frac{n+1}{2}\right)^\text{th}\text{ value} $$
  2. If the dataset has an even number of values, it is the average of the two middle values. $$ \text{Median} = \frac{\left(\frac{n}{2}\right)^\text{th}\text{ value} + \left(\left(\frac{n}{2}\right)+1\right)^\text{th}\text{ value}}{2} $$

For example, in order to calculate median for

5, 15, 18, 20, 22, 35, 8, 12, 25, 30

First sort them in ascending order-

5, 8, 12, 15, 18, 20, 22, 25, 30, 35

Number of values is 10 (even) so the median is-

\(\frac{\left(\frac{10}{2}\right)\text{th value} + \left(\left(\frac{10}{2}\right)+1\right)\text{th value}}{2} = \frac{5^{\text{th}} \text{ value} + 6^{\text{th}} \text{ value}}{2} \) = (18+20)/2 = 19

So, the median of the dataset is 19.
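The two cases of the formula can be sketched as a small function. The name median_of is a hypothetical helper used only for illustration, and the formula's 1-based positions are shifted to Python's 0-based indexes.

```python
def median_of(values):
    """Return the median of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)   # the median is defined on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n+1)/2)th value, which is index mid when counting from 0
        return ordered[mid]
    # Even n: average of the (n/2)th and (n/2 + 1)th values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([5, 15, 18, 20, 22, 35, 8, 12, 25, 30]))  # 19.0
print(median_of([5, 8, 12]))                              # 8
```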

Median handles skewed data well

Earlier we saw that the mean is sensitive to outliers. The median, by contrast, barely changes when an extreme value is added.

For example, if values are- 10, 12, 13, 14, 15, 100

Then the median = (13 + 14)/2 = 13.5

This is close to the center of the data.
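To see the effect of the outlier directly, here is a small sketch that computes both measures with and without the value 100. It uses Python's built-in statistics module; the "without the outlier" list is an addition made here for comparison.

```python
import statistics

values = [10, 12, 13, 14, 15]    # the data without the extreme value
with_outlier = values + [100]    # the data set used in the example above

# Without the outlier, mean and median are close to each other
print(statistics.mean(values), statistics.median(values))   # 12.8 13

# One extreme value moves the mean a lot, the median only slightly
print(round(statistics.mean(with_outlier), 2),
      statistics.median(with_outlier))                      # 27.33 13.5
```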

Mode

The mode is the most frequent value in the dataset. For example, if we have the following list of values

2, 4, 4, 5, 7, 7, 7, 8, 9, 10

Then 7 is the mode, as it has the highest frequency, 3.

Mode is not sensitive to outliers.

We may have a scenario where all values appear exactly once, meaning there is no mode. We may also have a scenario where two or more values share the highest frequency, meaning there are multiple modes.
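These cases can be sketched with a frequency count. The function name find_modes is hypothetical, and returning an empty list when every value appears only once is a convention assumed here to match the "no mode" case described above.

```python
from collections import Counter

def find_modes(values):
    """Return all values sharing the highest frequency, or [] if there is no mode."""
    counts = Counter(values)   # maps each value to how often it occurs
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []              # assumed convention: all values unique means no mode
    return [value for value, count in counts.items() if count == highest]

print(find_modes([2, 4, 4, 5, 7, 7, 7, 8, 9, 10]))     # [7]
print(find_modes([2, 4, 4, 4, 5, 7, 7, 7, 8, 9, 10]))  # [4, 7]
print(find_modes([1, 2, 3]))                           # []
```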

Mode is the best measure of central tendency when you're dealing with categorical (non-numerical) data, or when you want to identify the most common value in a dataset. For example, you may want to find the most shopped brand or the most preferred colour.

It is also useful when you want the most typical value, for example the most bought shoe size.

Shoe sizes: [7, 8, 8, 8, 9, 10]

Here mode = 8 (most common size)
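Because the mode only counts occurrences, it works on text values as well as numbers. Here is a minimal sketch using statistics.mode from the standard library; the colours list is made-up sample data, used only for illustration.

```python
import statistics

shoe_sizes = [7, 8, 8, 8, 9, 10]

# Hypothetical survey answers, invented for this illustration
colours = ["red", "blue", "blue", "green", "blue", "red"]

print(statistics.mode(shoe_sizes))   # 8
print(statistics.mode(colours))      # blue
```

Note that statistics.mode returns a single value. When several values tie for the highest frequency, Python 3.8 and later return the first one encountered in the data.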

Calculating mean, median, mode using Python libraries

1. The NumPy library has mean and median functions to calculate the mean and median. For the mode, the SciPy library provides the stats.mode function.

import numpy as np
from scipy import stats
values = [2, 4, 4, 5, 7, 7, 7, 8, 9, 10]
# Mean and median
mean = np.mean(values)
median = np.median(values)
print('Mean is', mean)
print('Median is', median)
# Mode: returns a result object holding the mode and its count
# (recent SciPy versions return scalars for 1-D input, older ones return arrays)
mode = stats.mode(values)
print('Mode is', mode.mode, 'count is', mode.count)

Output

Mean is 6.3
Median is 7.0
Mode is 7 count is 3

2. The Pandas library also has mean, median and mode functions. You can convert a list of values to a Pandas Series and then calculate the mean, median and mode.

import pandas as pd
values = [2, 4, 4, 4, 5, 7, 7, 7, 8, 9, 10]
data = pd.Series(values)
mean = data.mean()
median = data.median()
# returns a Series (which can have multiple modes)
mode = data.mode()
print(f"Mean is {mean:.2f}")
print('Median is', median)
print('Mode is', list(mode))

Output

Mean is 6.09
Median is 7.0
Mode is [4, 7]

That's all for this topic, Mean, Median and Mode With Python Examples. If you have any questions or suggestions, please drop a comment. Thanks!
