How to calculate variance in NumPy?

This article shows how to calculate variance using the NumPy Python module. Variance is a key statistical measure of how far data points are spread out from their mean.


Variance calculation

Calculating variance in Python is straightforward, and with NumPy, it becomes even more convenient. NumPy provides a built-in function, var, to calculate the variance of an array.

Here’s an example to calculate variance using NumPy:

import numpy as np

my_array = np.array([1, 5, 7, 5, 43, 43, 8, 43, 6])

variance = np.var(my_array)
print("Variance equals: " + str(round(variance, 2)))

In this example, the var function calculates the variance of the array my_array.
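To make clear what var computes, here is a minimal sketch that reproduces the default (population) variance by hand: the average of the squared deviations from the mean. The names mean, squared_deviations and manual_variance are illustrative and do not come from NumPy.

```python
import numpy as np

my_array = np.array([1, 5, 7, 5, 43, 43, 8, 43, 6])

# Step 1: mean of the data
mean = my_array.sum() / my_array.size

# Step 2: squared deviation of each element from the mean
squared_deviations = (my_array - mean) ** 2

# Step 3: average the squared deviations (divide by N, the number of elements)
manual_variance = squared_deviations.sum() / my_array.size

# The manual result should agree with np.var up to floating-point error
print(np.isclose(manual_variance, np.var(my_array)))
```

In practice np.var is the better choice; the manual version is only meant to show the steps behind it.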

How to calculate population variance and sample variance

The var function in NumPy can compute both population variance and sample variance. By default, the function calculates the population variance, but you can adjust it to compute the sample variance by modifying the ddof (Delta Degrees of Freedom) parameter.

# Population variance
population_variance = np.var(my_array)

# Sample variance
sample_variance = np.var(my_array, ddof=1)

Setting ddof=1 changes the divisor from N (the number of elements) to N - 1. This adjustment, known as Bessel's correction, gives an unbiased estimate of the population variance when the data is only a sample drawn from a larger population.
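As a rough illustration of what ddof changes, the sketch below computes both variants by hand and compares them with np.var. The divisor used is N - ddof, where N is the number of elements. The variable names (n, sum_sq, manual_population, manual_sample) are illustrative only.

```python
import numpy as np

my_array = np.array([1, 5, 7, 5, 43, 43, 8, 43, 6])
n = my_array.size

# Sum of squared deviations from the mean, shared by both formulas
sum_sq = ((my_array - my_array.mean()) ** 2).sum()

# Population variance: divide by N (equivalent to ddof=0, the default)
manual_population = sum_sq / n

# Sample variance: divide by N - 1 (equivalent to ddof=1)
manual_sample = sum_sq / (n - 1)

# Both manual results should agree with np.var up to floating-point error
print(np.isclose(manual_population, np.var(my_array)))
print(np.isclose(manual_sample, np.var(my_array, ddof=1)))
```

Because the sample variance uses the smaller divisor, it is always at least as large as the population variance for the same data, and the gap shrinks as the array grows.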
