How to Calculate Variance in NumPy (np.var with Population, Sample and ddof Examples)

This guide shows how to calculate variance in NumPy using the np.var() function, which computes the population variance by default and the sample variance when you set the ddof parameter. Variance measures how far data points spread out from their mean.

Variance Calculation in NumPy

NumPy’s built-in np.var() function calculates the variance of an array. By default it returns the population variance; passing ddof=1 applies the correction that yields the unbiased sample variance used in statistical analysis.

Here’s an example to calculate variance using NumPy:

import numpy as np

my_array = np.array([1, 5, 7, 5, 43, 43, 8, 43, 6])

variance = np.var(my_array)
print("Variance equals: " + str(round(variance, 2)))

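In this example, np.var computes the population variance of my_array, that is, the average of the squared deviations from the mean, and the result is printed rounded to two decimal places.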
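To make the calculation less of a black box, here is a minimal sketch that reproduces the default result step by step. The names mean, squared_deviations and manual_variance are illustrative only and are not part of NumPy.

```python
import numpy as np

my_array = np.array([1, 5, 7, 5, 43, 43, 8, 43, 6])

# Step 1: mean of the data
mean = my_array.mean()

# Step 2: squared deviation of each point from the mean
squared_deviations = (my_array - mean) ** 2

# Step 3: average the squared deviations (divide by N)
manual_variance = squared_deviations.sum() / my_array.size

# The manual result should agree with np.var up to floating-point error
print(np.isclose(manual_variance, np.var(my_array)))
```

The comparison uses np.isclose rather than == because the two computations may differ in the last floating-point digits.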

How to calculate population variance and sample variance

The var function in NumPy can compute both population variance and sample variance. By default, the function calculates the population variance, but you can adjust it to compute the sample variance by modifying the ddof (Delta Degrees of Freedom) parameter.

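# Reuses np and my_array from the example above

# Population variance (default ddof=0, divides by N)
population_variance = np.var(my_array)

# Sample variance (ddof=1, divides by N - 1)
sample_variance = np.var(my_array, ddof=1)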

Setting ddof=1 adjusts the denominator from N to N−1. This correction is crucial for sample variance because it provides an unbiased estimate of the true population variance when you only have a subset of the data.
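The effect of ddof on the denominator can be checked directly. The sketch below divides the same sum of squared deviations by N and by N − 1 and compares each result with np.var; the names n and sum_sq are illustrative only.

```python
import numpy as np

my_array = np.array([1, 5, 7, 5, 43, 43, 8, 43, 6])
n = my_array.size

# Sum of squared deviations from the mean, shared by both formulas
sum_sq = ((my_array - my_array.mean()) ** 2).sum()

# Population variance divides by N; sample variance divides by N - 1
population_variance = sum_sq / n
sample_variance = sum_sq / (n - 1)

# Both should match np.var with the corresponding ddof
print(np.isclose(population_variance, np.var(my_array)))
print(np.isclose(sample_variance, np.var(my_array, ddof=1)))

# The two differ only by the factor N / (N - 1)
print(np.isclose(sample_variance, population_variance * n / (n - 1)))
```

Because the two results differ only by the factor N / (N − 1), the sample variance is always the larger of the two for non-constant data, and the gap shrinks as the array grows.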
