Chi-Square Calculator
What Is the Chi-Square Test?
The chi-square test is a nonparametric statistical hypothesis test that determines whether observed frequencies in categorical data differ significantly from expected frequencies. According to the NIST Engineering Statistics Handbook, the chi-square distribution is one of the most widely used probability distributions in inferential statistics, forming the basis for tests of variance, goodness of fit, and independence. Developed by Karl Pearson in 1900, the test remains a staple of research across disciplines. The test statistic is calculated by summing the squared differences between observed (O) and expected (E) values, each divided by the expected value: chi-square = sum of (O - E)^2 / E. A larger chi-square value indicates a greater discrepancy between what was observed and what was expected under the null hypothesis.
This calculator accepts comma-separated lists of observed and expected values, computes the chi-square statistic, determines the degrees of freedom, and provides an approximate significance level. The result tells you whether the observed pattern could plausibly have occurred by random chance, or whether there is evidence of a real effect or relationship in your data. The chi-square distribution is always right-skewed and takes only non-negative values, with its shape determined by the degrees of freedom parameter.
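The calculation behind the calculator is straightforward and can be sketched in a few lines of Python. The function names below (`parse_values`, `chi_square`) are illustrative, not the site's actual code:

```python
def parse_values(text):
    """Turn a comma-separated string like '52, 38, 44' into a list of floats."""
    return [float(v) for v in text.split(",") if v.strip()]

def chi_square(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Sample input: 60 die rolls against a fair-die expectation of 10 per face.
obs = parse_values("14, 8, 9, 12, 7, 10")
exp = parse_values("10, 10, 10, 10, 10, 10")
stat = chi_square(obs, exp)
df = len(obs) - 1                  # goodness-of-fit: k categories -> k - 1
print(round(stat, 4), df)          # 3.4 5
```

The statistic and degrees of freedom are then compared against the chi-square distribution to obtain a significance level.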
The Chi-Square Goodness-of-Fit Test
The goodness-of-fit test checks whether a single categorical variable follows a hypothesized distribution. For example, a manufacturer claims that bags of mixed candy contain equal proportions of five colors. You count 200 candies and find: red = 52, blue = 38, green = 44, yellow = 36, orange = 30. If the claim is true, you would expect 40 of each color. The chi-square statistic measures how far the observed counts deviate from these expected counts. With 4 degrees of freedom (5 categories minus 1), you compare the statistic to a chi-square distribution table to determine the p-value.
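The candy walkthrough above can be verified numerically. The critical value used here is the df = 4, alpha = 0.05 entry from the reference table further down the page:

```python
# Candy goodness-of-fit example: 200 candies, five colors, expected 40 each.
observed = [52, 38, 44, 36, 30]
expected = [40] * 5

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1            # 5 categories -> 4 degrees of freedom

critical_05 = 9.488               # df = 4, alpha = 0.05
print(round(chi_sq, 2), df)       # 7.0 4
print(chi_sq > critical_05)       # False: fail to reject the equal-colors claim
```

A statistic of 7.0 falls short of 9.488, so the observed color counts are consistent with the manufacturer's claim at the 5% level.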
The goodness-of-fit test can test whether data follows any specified distribution, not just a uniform one. You might test whether the distribution of blood types in a sample matches known population proportions, whether dice rolls follow a uniform distribution to check for bias, or whether the number of customer arrivals per hour follows a Poisson distribution. The key requirement is that you have a clear expected distribution to compare against your observed data. Each expected frequency should ideally be at least 5 for the chi-square approximation to be reliable.
The Chi-Square Test of Independence
The test of independence uses a contingency table (also called a cross-tabulation) to determine whether two categorical variables are associated. For example, you might test whether there is a relationship between gender and product preference, between treatment type and recovery outcome, or between education level and voting behavior. The null hypothesis states that the two variables are independent -- knowing the value of one variable provides no information about the other.
To calculate expected values for a contingency table, multiply the row total by the column total and divide by the grand total for each cell. The degrees of freedom equal (number of rows minus 1) times (number of columns minus 1). For a 2x3 table, that gives (2-1)(3-1) = 2 degrees of freedom. If the chi-square statistic exceeds the critical value at your chosen significance level (commonly 0.05), you reject the null hypothesis and conclude that the variables are not independent. This does not tell you the strength or direction of the association -- for that, you need additional measures like Cramer's V or the phi coefficient.
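The expected-value formula for a contingency table can be sketched directly. The 2x3 table below uses made-up counts purely for illustration:

```python
# Hypothetical 2x3 contingency table: rows = groups, columns = categories.
table = [
    [30, 20, 10],
    [20, 30, 40],
]

grand = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi_sq = sum(
    (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(table))
    for j in range(len(table[0]))
)
df = (len(table) - 1) * (len(table[0]) - 1)   # (2-1)(3-1) = 2

print(expected)                  # [[20.0, 20.0, 20.0], [30.0, 30.0, 30.0]]
print(round(chi_sq, 3), df)      # 16.667 2
```

Here 16.667 exceeds the df = 2 critical value of 5.991 at alpha = 0.05, so for this made-up table the null hypothesis of independence would be rejected.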
Understanding Degrees of Freedom
Degrees of freedom (df) represent the number of values in the calculation that are free to vary. In a goodness-of-fit test with k categories, df = k - 1 because once you know k-1 of the observed frequencies and the total, the last frequency is determined. In a contingency table with r rows and c columns, df = (r-1)(c-1) because the row and column totals constrain the cell values. The degrees of freedom parameter determines which chi-square distribution to use when evaluating statistical significance.
As degrees of freedom increase, the chi-square distribution becomes more spread out and its peak shifts to the right. The mean of a chi-square distribution equals its degrees of freedom, and the variance equals twice the degrees of freedom. For large df values (above about 30), the chi-square distribution is well approximated by a normal distribution with mean df and standard deviation sqrt(2*df). This is why critical values grow roughly linearly with degrees of freedom for a fixed significance level.
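These distributional facts are easy to sanity-check by simulation: a chi-square variate with df degrees of freedom is the sum of df squared standard normals, so the sample mean should land near df and the sample variance near 2*df:

```python
import random
import statistics

# Simulate chi-square(df = 5) as sums of squared standard normals.
random.seed(42)
df = 5
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(df))
           for _ in range(20000)]

print(round(statistics.mean(samples), 2))      # close to 5 (the df)
print(round(statistics.variance(samples), 2))  # close to 10 (2 * df)
```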
Chi-Square Critical Values Reference Table
The following table, based on values published by the NIST/SEMATECH e-Handbook of Statistical Methods, shows critical values for common degrees of freedom and significance levels. If your calculated chi-square statistic exceeds the value in the table, the result is statistically significant at that alpha level. You can also use our p-value calculator for exact probability computations.
| df | alpha = 0.10 | alpha = 0.05 | alpha = 0.01 | alpha = 0.001 |
|---|---|---|---|---|
| 1 | 2.706 | 3.841 | 6.635 | 10.828 |
| 2 | 4.605 | 5.991 | 9.210 | 13.816 |
| 3 | 6.251 | 7.815 | 11.345 | 16.266 |
| 4 | 7.779 | 9.488 | 13.277 | 18.467 |
| 5 | 9.236 | 11.070 | 15.086 | 20.515 |
| 10 | 15.987 | 18.307 | 23.209 | 29.588 |
| 15 | 22.307 | 24.996 | 30.578 | 37.697 |
| 20 | 28.412 | 31.410 | 37.566 | 45.315 |
Chi-Square Critical Values and P-Values
The critical value is the threshold that the chi-square statistic must exceed to reject the null hypothesis at a given significance level (alpha). Common significance levels are 0.10, 0.05, 0.01, and 0.001. For example, with 3 degrees of freedom, the critical values are 6.251 (alpha = 0.10), 7.815 (alpha = 0.05), 11.345 (alpha = 0.01), and 16.266 (alpha = 0.001). If your calculated chi-square exceeds 7.815 with 3 df, you have evidence at the 5% significance level that the observed distribution differs from the expected one.
The p-value gives the probability of obtaining a chi-square statistic at least as extreme as the one calculated, assuming the null hypothesis is true. A p-value below your chosen alpha level leads to rejection of the null hypothesis. For instance, a p-value of 0.023 means there is only a 2.3% chance of seeing such a large discrepancy between observed and expected values if the null hypothesis were true. Most statistical software computes exact p-values, but this calculator provides approximate significance ranges that are useful for quick assessments.
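The "approximate significance range" approach can be sketched by bracketing the statistic between tabulated critical values. The numbers below are the df = 3 row of the reference table above, and `significance_range` is an illustrative helper, not a standard library function:

```python
# Critical values for df = 3, as (alpha, critical value) pairs in
# decreasing-alpha order, taken from the reference table.
CRITICALS_DF3 = [(0.10, 6.251), (0.05, 7.815), (0.01, 11.345), (0.001, 16.266)]

def significance_range(chi_sq, criticals):
    """Return a coarse p-value range by comparing against critical values."""
    if chi_sq < criticals[0][1]:
        return "p > {}".format(criticals[0][0])
    for (a_hi, c_hi), (a_lo, c_lo) in zip(criticals, criticals[1:]):
        if c_hi <= chi_sq < c_lo:
            return "{} < p < {}".format(a_lo, a_hi)
    return "p < {}".format(criticals[-1][0])

print(significance_range(9.2, CRITICALS_DF3))   # 0.01 < p < 0.05
print(significance_range(3.0, CRITICALS_DF3))   # p > 0.1
```

This bracketing is what makes a printed critical-value table usable without software: you learn which side of each alpha threshold the statistic falls on, even without an exact p-value.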
Common Applications in Research
In medical research, chi-square tests are used to compare treatment outcomes across groups -- for example, testing whether a new drug leads to significantly different recovery rates compared to a placebo. In genetics, the chi-square goodness-of-fit test checks whether observed offspring ratios match the expected Mendelian ratios (such as the classic 3:1 ratio for dominant and recessive traits). In quality control, manufacturers use the test to verify that defect rates across production lines or time periods are consistent with expectations.
Market researchers use chi-square tests to analyze survey data -- for instance, testing whether brand preference varies by age group, or whether purchase behavior differs across geographic regions. In ecology, the test can determine whether species are distributed randomly across habitats or show preference for certain environments. In social science, researchers test whether there are associations between demographic variables like education level and health outcomes, or between socioeconomic status and political affiliation.
Assumptions and Limitations
The chi-square test requires several assumptions to be valid. First, observations must be independent -- each data point should represent a separate, unrelated individual or event. Second, the data must be categorical (nominal or ordinal), not continuous. Third, expected frequencies should be at least 5 in each cell; when they fall below this threshold, the chi-square approximation becomes unreliable. For 2x2 tables with small expected frequencies, Yates' continuity correction can be applied, or Fisher's exact test can be used instead.
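The expected-frequency assumption is easy to check programmatically before running the test. The helper below is illustrative, not part of any standard library:

```python
def check_expected_counts(expected, threshold=5):
    """Return indices of cells whose expected frequency is below the threshold."""
    return [i for i, e in enumerate(expected) if e < threshold]

# Hypothetical expected counts for a four-category test.
low = check_expected_counts([12.5, 8.0, 4.2, 25.3])
print(low)   # [2] -> cell 2 is below 5; consider Fisher's exact test instead
```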
The chi-square test tells you whether a statistically significant association exists, but not how strong or meaningful it is. A very large sample can produce a statistically significant result even when the actual difference is trivially small. Conversely, a small sample may fail to detect a real difference (low statistical power). To assess effect size, use measures like Cramer's V (which ranges from 0 to 1, where 0 means no association and 1 means perfect association), or compute odds ratios for 2x2 tables. Always report effect size alongside statistical significance for a complete picture.
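Cramer's V follows directly from the chi-square statistic. The input numbers below are illustrative (a chi-square of 16.667 from a 2x3 table of 150 observations):

```python
import math

def cramers_v(chi_sq, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (n * (min(r, c) - 1))), ranging from 0 to 1."""
    return math.sqrt(chi_sq / (n * (min(n_rows, n_cols) - 1)))

v = cramers_v(16.667, n=150, n_rows=2, n_cols=3)
print(round(v, 3))   # 0.333
```

A V of 0.333 would typically be read as a moderate association: statistically significant and of non-trivial, though not overwhelming, strength.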
Step-by-Step Example
Suppose a teacher wants to know whether a die is fair. She rolls it 60 times and records: 1 appears 14 times, 2 appears 8 times, 3 appears 9 times, 4 appears 12 times, 5 appears 7 times, and 6 appears 10 times. For a fair die, each face should appear 10 times (60 rolls / 6 faces). The chi-square statistic is (14-10)^2/10 + (8-10)^2/10 + (9-10)^2/10 + (12-10)^2/10 + (7-10)^2/10 + (10-10)^2/10 = 1.6 + 0.4 + 0.1 + 0.4 + 0.9 + 0.0 = 3.4.
With 5 degrees of freedom (6 categories minus 1), the critical value at alpha = 0.05 is 11.07. Since 3.4 is well below 11.07, we fail to reject the null hypothesis -- there is not enough evidence to conclude the die is unfair. The p-value is approximately 0.64, meaning there is a 64% chance of seeing this much variation even with a perfectly fair die. This example illustrates that some variation from expected values is normal and does not imply bias or an underlying pattern.
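The die example runs end to end in a few lines, using the df = 5, alpha = 0.05 critical value from the reference table above:

```python
# 60 rolls of a die; for a fair die each face is expected 10 times.
observed = [14, 8, 9, 12, 7, 10]
expected = [60 / 6] * 6

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

critical_05 = 11.070              # df = 5, alpha = 0.05
print(round(chi_sq, 1), df)       # 3.4 5
print(chi_sq > critical_05)       # False: fail to reject -- no evidence of bias
```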
Frequently Asked Questions
What is the chi-square test used for?
The chi-square test determines whether observed categorical data differs significantly from what would be expected by chance. It is used in two main forms: the goodness-of-fit test checks whether data follows a hypothesized distribution, and the test of independence checks whether two categorical variables in a contingency table are associated. Researchers in medicine, genetics, market research, and social science routinely use this test because it makes no assumptions about the shape of the underlying distribution, requiring only independent observations and adequate expected counts.
How do you calculate degrees of freedom for a chi-square test?
For a goodness-of-fit test, degrees of freedom equals the number of categories minus 1 (df = k - 1). For a test of independence using a contingency table, degrees of freedom equals (number of rows minus 1) times (number of columns minus 1), or df = (r-1)(c-1). For example, a 3x4 contingency table has (3-1)(4-1) = 6 degrees of freedom. The degrees of freedom parameter determines which chi-square distribution is used to evaluate the p-value.
What does a high chi-square value mean?
A high chi-square value indicates a large discrepancy between observed and expected frequencies. If the chi-square statistic exceeds the critical value for your chosen significance level (commonly 0.05) and degrees of freedom, you reject the null hypothesis. For example, with 3 degrees of freedom, a chi-square value above 7.815 is significant at the 5% level. However, statistical significance does not necessarily imply practical importance -- use effect size measures like Cramer's V or correlation coefficients to assess magnitude.
When should you not use the chi-square test?
The chi-square test should not be used when expected frequencies in any cell are below 5, as the approximation becomes unreliable. In such cases, Fisher's exact test is the recommended alternative. The test also requires independent observations (no repeated measures), categorical data (not continuous variables), and sufficient total sample size. For 2x2 tables with small samples, Yates' continuity correction can improve accuracy.
What is the difference between chi-square and t-test?
The chi-square test analyzes relationships between categorical variables (e.g., treatment group vs. outcome category), while the t-test compares means of continuous numerical variables between two groups. For example, testing whether gender affects product preference uses chi-square, while testing whether a drug changes average blood pressure uses a t-test. If your data consists of counts or frequencies in categories, use chi-square. If your data consists of measured numerical values, use a t-test or ANOVA.
How do I interpret the p-value from a chi-square test?
The p-value represents the probability of observing a chi-square statistic as extreme as yours if the null hypothesis were true. A p-value below 0.05 (the conventional threshold) means there is less than a 5% chance the observed pattern occurred by random chance alone, leading you to reject the null hypothesis. For example, a p-value of 0.003 provides strong evidence against the null hypothesis. However, p-values do not measure effect size -- a very large sample can produce a tiny p-value even for trivially small differences.