How to perform a Chi-Squared test using SciPy?

The Chi-Squared test is commonly used to test hypotheses about the distribution of categorical data. We’ll show how to perform a Chi-Squared test in Python using the SciPy library.


Chi-square calculations

We’ll demonstrate how to conduct a chi-squared test for a given set of observed frequencies. SciPy provides the built-in chisquare function in its stats module for this purpose.

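The chisquare function returns both the test statistic and a p-value. When no expected frequencies are passed, it assumes that all categories are equally likely, and this is the null hypothesis being tested. Setting a significance level of 5% helps us determine if the observed deviations from the expected frequencies are statistically significant.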

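from scipy import stats

# Observed frequencies for five categories
numbers = [45, 67, 33, 54, 33]

# With no expected frequencies given, chisquare assumes all categories are equally likely
chi = stats.chisquare(numbers)

print(f"pvalue equals {round(chi.pvalue, 4)}")
if chi.pvalue < 0.05:
    print("Null hypothesis can be rejected")
else:
    print("Null hypothesis cannot be rejected")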

We compare the p-value against a significance level of 0.05. If the p-value is less than 0.05, the null hypothesis can be rejected, meaning the deviations of the observed frequencies from the expected ones are statistically significant. Otherwise, we fail to reject the null hypothesis, which is not the same as proving it true. For the data above, the p-value is about 0.001, so the null hypothesis of equally likely categories is rejected.
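
To see where the test statistic comes from, it can help to compute it by hand. The following is a minimal sketch in plain Python, assuming the default case in which every category is expected to be equally likely. The variable names expected, statistic and dof are chosen here for illustration only.

```python
# Observed counts from the example above.
numbers = [45, 67, 33, 54, 33]

# With equally likely categories, each expected count is the mean
# of the observed counts.
expected = sum(numbers) / len(numbers)

# Pearson's statistic: sum of (observed - expected)^2 / expected.
statistic = sum((observed - expected) ** 2 / expected for observed in numbers)

# Degrees of freedom: number of categories minus one.
dof = len(numbers) - 1

print(f"expected count per category: {expected:.1f}")
print(f"chi-squared statistic: {statistic:.4f}")
print(f"degrees of freedom: {dof}")
```

The statistic should come out near 18.17 with 4 degrees of freedom, which should match the statistic attribute of the result returned by stats.chisquare(numbers). The p-value is the probability of seeing a statistic at least this large under a chi-squared distribution with that many degrees of freedom.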

The Chi-Squared test thus helps determine whether variations in categorical data are due to chance or to a specific factor.
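
When the categories are not expected to be equally likely, chisquare accepts the expected frequencies through its f_exp parameter. The sketch below assumes a set of hypothetical expected proportions, invented purely for illustration, and scales them so that the expected counts sum to the same total as the observed counts, which SciPy expects.

```python
from scipy import stats

# Observed counts from the example above.
numbers = [45, 67, 33, 54, 33]

# Hypothetical expected proportions, chosen only for illustration.
proportions = [0.20, 0.30, 0.15, 0.20, 0.15]

# Scale the proportions so the expected counts sum to the observed total.
total = sum(numbers)
expected = [p * total for p in proportions]

# Pass the expected counts explicitly instead of relying on the default.
result = stats.chisquare(numbers, f_exp=expected)

print(f"statistic equals {round(result.statistic, 4)}")
print(f"pvalue equals {round(result.pvalue, 4)}")
```

Under these assumed proportions the observed counts sit close to the expected ones, so the p-value should be well above 0.05 and the null hypothesis would not be rejected. The same data can therefore lead to different conclusions depending on which expected distribution is being tested.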