How to calculate mode in Python?

Let’s see how to calculate mode in Python.


In statistics, the mode is the value that appears most often in a dataset. It is one of the measures of central tendency, alongside the mean and median. The mode is particularly useful for categorical data but can also be applied to numerical data.

Mode in Python

To calculate the mode, import the statistics module from the Python standard library.

It provides a dedicated function, mode, for exactly this purpose.

import statistics as s

x = [1, 5, 7, 5, 8, 43, 6]

mode = s.mode(x)
print("Mode equals: " + str(mode))
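Since the mode is especially useful for categorical data, here is a minimal sketch of the same function applied to strings, along with statistics.multimode for data where several values tie. The lists colors and tied are made-up examples, and multimode assumes Python 3.8 or newer.

```python
import statistics

# Hypothetical categorical data: the mode works on any hashable values
colors = ["red", "blue", "red", "green", "blue", "red"]
top_color = statistics.mode(colors)

# When several values tie, multimode returns all of them
# in the order they were first encountered
tied = [1, 1, 2, 2, 3]
all_modes = statistics.multimode(tied)

print(f"Most common color: {top_color}")
print(f"All modes: {all_modes}")
```

On Python 3.8 and newer, statistics.mode returns the first mode it encounters when there is a tie, so multimode is the safer choice when you need every tied value.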

Mode in NumPy

That covers calculating the mode in plain Python. NumPy, however, does not have a built-in mode function, so calculating the mode with it requires a workaround.

import numpy as np

my_array = np.array([1, 2, 4, 4, 7, 7, 7, 20])

mode = np.argmax(np.bincount(my_array))

print(f"Mode equals: {mode}")

The one-liner mode = np.argmax(np.bincount(my_array)) is all it takes to calculate the mode.

NumPy mode calculations

How to calculate the mode of an array in NumPy?

In addition to using the statistics module to calculate the mode of a list in Python, you can combine the np.argmax and np.bincount functions in NumPy. The np.bincount function takes an array of non-negative integers and returns an array of counts, where position i holds the number of times the value i appears in the input. The np.argmax function takes an array and returns the index of its maximum value (the first such index if the maximum occurs more than once).

To calculate the mode of an array in NumPy, you can use the following code:

import numpy as np

my_array = np.array([1, 2, 4, 4, 7, 7, 7, 20])

mode = np.argmax(np.bincount(my_array))

print(f"Mode equals: {mode}")

This code will print the following output:

Mode equals: 7

As you can see, the mode of the array my_array is 7. This is because the element 7 appears more often than any other element in the array.

Understanding the NumPy Mode Workaround

Let’s break down the NumPy code used to calculate the mode:

np.bincount(my_array): This function counts the occurrences of each non-negative integer value in the input array my_array. It returns an array where the index i holds the count of the value i in my_array. For example, if my_array is [1, 2, 4, 4, 7, 7, 7, 20], np.bincount(my_array) will produce the array [0, 1, 1, 0, 2, 0, 0, 3, 0, ..., 0, 1]. Its length is the maximum value in my_array plus one, here 21: index 1 has count 1, index 2 count 1, index 4 count 2, index 7 count 3, index 20 count 1, and every value that does not occur, including 0, has count 0.
np.argmax(...): The np.argmax() function then takes the output of np.bincount() as input and returns the *index* of the maximum value in that array. Since the index in the bincount output array corresponds to the value from the original array, np.argmax(np.bincount(my_array)) effectively finds the value (index) with the highest count (maximum value in the bincount array), which is the mode.
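To make the two steps concrete, the following sketch splits the one-liner apart and prints the intermediate result. The variable name counts is introduced here only for illustration.

```python
import numpy as np

my_array = np.array([1, 2, 4, 4, 7, 7, 7, 20])

# Step 1: counts[i] is how many times the value i occurs in my_array
counts = np.bincount(my_array)

# Step 2: the index of the largest count is the most frequent value
mode = np.argmax(counts)

print(f"Counts: {counts}")
print(f"Length of counts: {len(counts)}")
print(f"Mode equals: {mode} (appears {counts[mode]} times)")
```

Keeping counts in a separate variable also gives you the frequency of the mode, which the one-liner discards.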

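Note that this NumPy workaround using np.bincount and np.argmax only works for arrays of non-negative integers. It is not directly suitable for arrays containing floating-point numbers or negative integers without preprocessing the data. Two further caveats: if several values tie for the highest count, np.argmax returns the first maximum, so the result is the smallest of the tied values; and because np.bincount allocates one slot for every integer up to the maximum, it is wasteful for arrays that contain very large values.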
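For floats or negative integers, one possible alternative is np.unique with return_counts=True, which avoids the non-negative integer restriction. This is a sketch of a different technique rather than a variant of the bincount trick, and the helper name mode_via_unique is made up for this example.

```python
import numpy as np

def mode_via_unique(values):
    """Return the most frequent value; ties resolve to the smallest value."""
    # np.unique returns the sorted distinct values and, with
    # return_counts=True, how often each of them occurs
    uniques, counts = np.unique(values, return_counts=True)
    return uniques[np.argmax(counts)]

negatives = np.array([-3, -3, 2, 5, 5, -3])
floats = np.array([0.5, 1.5, 1.5, 2.0])

print(f"Mode equals: {mode_via_unique(negatives)}")
print(f"Mode equals: {mode_via_unique(floats)}")
```

Keep in mind that floats are compared for exact equality here, so values that differ only by rounding error are counted as distinct.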