Test of independence for contingency tables and goodness-of-fit. Results are formatted in APA 7th edition style.
The chi-square (χ²) test is a non-parametric statistical test used to examine relationships between categorical variables. Unlike t-tests or ANOVA that compare means, the chi-square test works with frequency counts—how many observations fall into each category. It compares the frequencies you actually observed in your data to the frequencies you would expect if there were no relationship between the variables. When the difference between observed and expected frequencies is large enough, you can conclude the variables are significantly associated.
Use the test of independence to determine whether two categorical variables are related. The data are arranged in a contingency table (cross-tabulation) where rows represent one variable and columns represent the other. For example, you might test whether there is a relationship between gender and product preference, or between treatment condition and recovery outcome. The null hypothesis states that the two variables are independent—knowing the value of one variable tells you nothing about the other.
Use the goodness-of-fit test to determine whether observed frequencies for a single categorical variable differ from a set of expected frequencies. For example, testing if a die is fair by comparing observed rolls to the expected equal distribution (1/6 for each face), or testing whether customer visits are evenly distributed across days of the week. The null hypothesis states that the observed distribution matches the expected distribution.
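The die example can be sketched in a few lines of Python. This is a minimal illustration, not StatMate's implementation: the function name `goodness_of_fit` and the roll counts are hypothetical, and the decision uses the tabulated critical value for df = 5 at α = .05 rather than an exact p-value.

```python
def goodness_of_fit(observed, expected_props=None):
    """Return (chi2, df) for a chi-square goodness-of-fit test."""
    n = sum(observed)
    k = len(observed)
    if expected_props is None:
        # Default null hypothesis: every category is equally likely
        expected_props = [1 / k] * k
    expected = [n * p for p in expected_props]
    # Sum of (observed - expected)^2 / expected over all categories
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, k - 1


# Hypothetical counts from 60 rolls of a die (illustrative data only)
rolls = [8, 9, 13, 7, 12, 11]
chi2, df = goodness_of_fit(rolls)

CRITICAL_05_DF5 = 11.070  # chi-square critical value, df = 5, alpha = .05
reject_fair_die = chi2 > CRITICAL_05_DF5
print(f"chi2({df}) = {chi2:.2f}; reject fairness at .05: {reject_fair_die}")
```

With these made-up counts the statistic falls well below the critical value, so the hypothesis of a fair die would not be rejected.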
A researcher surveyed 100 people to test whether gender (Male / Female) is associated with product preference (A / B / C). The observed frequencies are:
| Observed | Product A | Product B | Product C | Row Total |
|---|---|---|---|---|
| Male | 30 | 10 | 10 | 50 |
| Female | 15 | 20 | 15 | 50 |
| Column Total | 45 | 30 | 25 | 100 |
Expected frequencies are calculated as (Row Total × Column Total) / Grand Total. For example, the expected frequency for Male × Product A = (50 × 45) / 100 = 22.5.
| Expected | Product A | Product B | Product C |
|---|---|---|---|
| Male | 22.5 | 15.0 | 12.5 |
| Female | 22.5 | 15.0 | 12.5 |
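The expected counts and the test statistic for this table can be reproduced with a short Python sketch. Variable names are illustrative. The p-value line relies on the fact that, for df = 2 only, the chi-square upper-tail probability reduces to exp(−x/2); for other degrees of freedom a statistics library would be needed.

```python
import math

observed = [
    [30, 10, 10],  # Male:   Product A, B, C
    [15, 20, 15],  # Female: Product A, B, C
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell: (row total * column total) / grand total
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (O - E)^2 / E
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed form valid for df = 2 only: P(X > x) = exp(-x / 2)
p_value = math.exp(-chi2 / 2) if df == 2 else None

print("expected:", expected)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.3f}")
```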
Results
χ²(2, N = 100) = 9.33, p = .009, Cramer's V = .31
There was a statistically significant association between gender and product preference, χ²(2, N = 100) = 9.33, p = .009, with a medium effect size (Cramer's V = .31). Males showed a stronger preference for Product A, while females were more evenly distributed across products.
Choosing the right test depends on the type of data you have and the size of your sample. Use this guide to select the appropriate test:
| Situation | Recommended Test |
|---|---|
| Two categorical variables (2×2 or larger table) | Chi-square test of independence |
| One categorical variable vs. expected proportions | Chi-square goodness-of-fit test |
| 2×2 table with any expected frequency < 5 | Fisher's exact test |
| Ordinal data, two independent groups | Mann-Whitney U test |
| Paired or matched categorical data | McNemar's test |
| More than two related categorical samples | Cochran's Q test |
Before interpreting your chi-square results, verify that these assumptions are met:
1. Categorical Data
Both variables must be categorical (nominal or ordinal). The chi-square test does not work with continuous data. If you have continuous measurements, you must first categorize them into groups (e.g., age → age ranges), though this results in a loss of information.
2. Independent Observations
Each observation must be independent of all others. This means each participant or case contributes to only one cell in the contingency table. Repeated measures or matched pairs violate this assumption—use McNemar's test instead.
3. Expected Frequency ≥ 5
Ideally, every expected cell frequency is 5 or greater. A common rule of thumb tolerates a few smaller cells in larger tables, but when more than 20% of cells have expected frequencies below 5, the chi-square approximation becomes unreliable. In such cases, consider combining categories or using Fisher's exact test (for 2×2 tables).
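One way to screen a table against this assumption is sketched below. The helper name `expected_count_check` and the second table are hypothetical; the 5 and 20% cutoffs are the ones stated above.

```python
def expected_count_check(expected, threshold=5.0, max_share=0.20):
    """Summarize how many expected counts fall below the threshold."""
    cells = [e for row in expected for e in row]
    n_small = sum(1 for e in cells if e < threshold)
    share = n_small / len(cells)
    return {
        "min_expected": min(cells),
        "share_below": share,
        # True when no more than max_share of cells are below the threshold
        "approximation_ok": share <= max_share,
    }


# Expected counts from the worked example above
worked = expected_count_check([[22.5, 15.0, 12.5], [22.5, 15.0, 12.5]])

# Hypothetical 2x2 table with one small expected count (illustrative only)
sparse = expected_count_check([[3.0, 7.0], [9.0, 21.0]])

print(worked)
print(sparse)
```

In the sparse case one of four cells (25%) is below 5, so the sketch flags the table; for a 2×2 table that is the situation where Fisher's exact test is the usual alternative.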
4. Random Sampling
Data should be collected through random sampling or random assignment to ensure the sample is representative of the population. Convenience or biased samples can lead to misleading results regardless of what the test shows.
While the p-value tells you whether an association is statistically significant, Cramer's V tells you how strong the association is. This matters because with large sample sizes, even trivial associations can reach statistical significance. Cramer's V ranges from 0 (no association) to 1 (perfect association), and its interpretation depends on df*, the smaller of (rows − 1) and (columns − 1). Note that df* is not the test's degrees of freedom, which are (rows − 1) × (columns − 1):
| Effect Size | df* = 1 | df* = 2 | df* = 3 | df* ≥ 4 |
|---|---|---|---|---|
| Small | .10 | .07 | .06 | .05 |
| Medium | .30 | .21 | .17 | .15 |
| Large | .50 | .35 | .29 | .25 |
Note: df* = min(rows − 1, columns − 1). For the worked example above (2×3 table), df* = 1, so V = .31 represents a medium effect.
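The effect size calculation and the benchmark lookup can be sketched as follows, using the standard formula V = √(χ² / (N × df*)). The function names, the "negligible" label for values below the small benchmark, and the input values in the usage lines are assumptions made for illustration.

```python
import math

# Benchmarks from the table above, keyed by df* (df* >= 4 uses the last column)
BENCHMARKS = {
    1: (0.10, 0.30, 0.50),
    2: (0.07, 0.21, 0.35),
    3: (0.06, 0.17, 0.29),
    4: (0.05, 0.15, 0.25),
}


def cramers_v(chi2, n, n_rows, n_cols):
    """Return (V, df*) where df* = min(rows - 1, columns - 1)."""
    df_star = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * df_star)), df_star


def effect_label(v, df_star):
    """Classify V against the small / medium / large benchmarks."""
    small, medium, large = BENCHMARKS[min(df_star, 4)]
    if v >= large:
        return "large"
    if v >= medium:
        return "medium"
    if v >= small:
        return "small"
    return "negligible"  # label assumed here; the table defines no such band


# Hypothetical result: chi2 = 12.0 from a 3x4 table with N = 150
v, df_star = cramers_v(12.0, 150, 3, 4)
print(f"V = {v:.2f}, df* = {df_star}, effect = {effect_label(v, df_star)}")
```

Keeping the lookup keyed by df* makes the dependence explicit: the same V of .20 counts as small when df* = 2 but as medium when df* = 4.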
According to APA 7th edition guidelines, chi-square results should include the chi-square statistic, degrees of freedom, sample size, p-value, and an effect size measure. Here is a template and a real example:
Template
A chi-square test of independence was conducted to examine the relationship between [Variable 1] and [Variable 2]. The relation between these variables was [significant/not significant], χ²(df, N = XX) = X.XX, p = .XXX, Cramer's V = .XX.
Real Example (from the worked example above)
A chi-square test of independence was conducted to examine the relationship between gender and product preference. The relation between these variables was significant, χ²(2, N = 100) = 9.33, p = .009, Cramer's V = .31. Males showed a notably higher preference for Product A (60%) compared to females (30%), while females were more evenly distributed across all three products.
Note: Report χ² values to two decimal places. Report p-values to three decimal places, except use p < .001 when the value is below .001. Always include an effect size measure (Cramer's V for independence tests).
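These reporting rules are mechanical enough to automate. The sketch below is one possible formatter, not StatMate's own code; `format_p` and `format_chi_square` are hypothetical helper names, and the numbers in the usage line are made up.

```python
def format_p(p):
    """APA style: three decimals, no leading zero, floor at p < .001."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_chi_square(chi2, df, n, p, v):
    """Build an APA-style chi-square result string."""
    v_text = f"{v:.2f}".lstrip("0")  # drop the leading zero, as for p
    return (
        f"χ²({df}, N = {n}) = {chi2:.2f}, "
        f"{format_p(p)}, Cramer's V = {v_text}"
    )


# Hypothetical values for illustration
report = format_chi_square(4.5, 1, 80, 0.0339, 0.2372)
print(report)  # χ²(1, N = 80) = 4.50, p = .034, Cramer's V = .24
```

The leading zero is dropped for p and V because neither can exceed 1, while χ² keeps its integer part.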
StatMate's chi-square calculations have been validated against R's chisq.test() function and SPSS output. We use the jstat library for chi-square probability distributions and compute expected frequencies, degrees of freedom, and Cramer's V following standard statistical formulas. All results match R output to at least 4 decimal places.