When should I use a chi-square test?

Use it to analyze relationships between categorical variables. The test of independence checks if two categorical variables are associated, while the goodness-of-fit test checks if observed frequencies match expected frequencies.

What if expected frequencies are below 5?

If more than 20% of cells have expected frequencies below 5, the chi-square test may be unreliable. Consider combining categories or using Fisher's exact test.

How do I interpret Cramér's V?

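For a table whose smaller dimension has two categories, a Cramér's V of about .10 indicates a weak association, .30 a moderate one, and .50 or more a strong one; the thresholds are lower for larger tables. It is the standard effect size measure for the chi-square test of independence.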

What's the difference between chi-square and t-test?

T-tests compare means of continuous variables, while chi-square tests analyze frequencies of categorical variables. Choose based on your data type.

Chi-Square Test Calculator

Test of independence for contingency tables and goodness-of-fit test. Results are displayed in APA 7th edition format.

What is a Chi-Square Test?

The chi-square (χ²) test is a non-parametric statistical test used to examine relationships between categorical variables. Unlike t-tests or ANOVA that compare means, the chi-square test works with frequency counts—how many observations fall into each category. It compares the frequencies you actually observed in your data to the frequencies you would expect if there were no relationship between the variables. When the difference between observed and expected frequencies is large enough, you can conclude the variables are significantly associated.

Chi-Square Test of Independence

Use the test of independence to determine whether two categorical variables are related. The data are arranged in a contingency table (cross-tabulation) where rows represent one variable and columns represent the other. For example, you might test whether there is a relationship between gender and product preference, or between treatment condition and recovery outcome. The null hypothesis states that the two variables are independent—knowing the value of one variable tells you nothing about the other.

Chi-Square Goodness-of-Fit Test

Use the goodness-of-fit test to determine whether observed frequencies for a single categorical variable differ from a set of expected frequencies. For example, testing if a die is fair by comparing observed rolls to the expected equal distribution (1/6 for each face), or testing whether customer visits are evenly distributed across days of the week. The null hypothesis states that the observed distribution matches the expected distribution.
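The die example can be sketched in a few lines of Python. The roll counts below are hypothetical, made up only to illustrate the arithmetic, and the decision uses the tabulated 5% critical value for 5 degrees of freedom rather than an exact p-value.

```python
# Hypothetical counts from 60 rolls of a six-sided die (illustrative data only).
observed = [8, 12, 9, 11, 6, 14]

n = sum(observed)
expected = [n / 6] * 6  # a fair die predicts n/6 rolls for each face

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # number of categories minus one

CRITICAL_05_DF5 = 11.070  # upper 5% point of the chi-square distribution, 5 df

print(f"chi-square({df}) = {chi_square:.2f}")
print("reject 'the die is fair' at alpha = .05:", chi_square > CRITICAL_05_DF5)
```

With these made-up counts the statistic stays well below the critical value, so the data give no reason to doubt that the die is fair.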

Worked Example: Test of Independence

A researcher surveyed 100 people to test whether gender (Male / Female) is associated with product preference (A / B / C). The observed frequencies are:

Observed	Product A	Product B	Product C	Row Total
Male	30	10	10	50
Female	15	20	15	50
Column Total	45	30	25	100

Expected frequencies are calculated as (Row Total × Column Total) / Grand Total. For example, the expected frequency for Male × Product A = (50 × 45) / 100 = 22.5.

Expected	Product A	Product B	Product C
Male	22.5	15.0	12.5
Female	22.5	15.0	12.5
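As a minimal sketch, the expected table above can be reproduced from the observed counts with plain Python. The variable names are illustrative and not part of the calculator.

```python
# Observed counts from the worked example.
observed = [
    [30, 10, 10],  # Male:   Product A, B, C
    [15, 20, 15],  # Female: Product A, B, C
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for label, row in zip(["Male", "Female"], expected):
    print(label, row)
```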

Results

χ²(2, N = 100) = 9.33, p = .009, Cramer's V = .31

There was a statistically significant association between gender and product preference, χ²(2, N = 100) = 9.33, p = .009, with a medium effect size (Cramer's V = .31). Males showed a stronger preference for Product A, while females were more evenly distributed across products.
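To check the result by hand, the sketch below recomputes the statistic, degrees of freedom, p-value, and Cramer's V directly from the observed table. It uses only the Python standard library; `chi2_sf` is a hypothetical helper written for this example (a closed-form series for integer degrees of freedom), not the routine the calculator itself uses.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # standard normal pdf at chi
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total

observed = [[30, 10, 10], [15, 20, 15]]  # worked example: rows = gender, columns = product
rows, cols = len(observed), len(observed[0])
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(rows)
    for j in range(cols)
)
df = (rows - 1) * (cols - 1)
p_value = chi2_sf(chi_square, df)
cramers_v = math.sqrt(chi_square / (n * min(rows - 1, cols - 1)))

print(f"chi-square({df}, N = {n}) = {chi_square:.2f}, p = {p_value:.3f}, V = {cramers_v:.2f}")
```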

When to Use Chi-Square vs. Other Tests

Choosing the right test depends on the type of data you have and the size of your sample. Use this guide to select the appropriate test:

Situation	Recommended Test
Two categorical variables (2×2 or larger table)	Chi-square test of independence
One categorical variable vs. expected proportions	Chi-square goodness-of-fit test
2×2 table with any expected frequency < 5	Fisher's exact test
Ordinal data, two independent groups	Mann-Whitney U test
Paired or matched categorical data	McNemar's test
Three or more related samples with a binary (yes/no) outcome	Cochran's Q test

Assumptions of the Chi-Square Test

Before interpreting your chi-square results, verify that these assumptions are met:

1. Categorical Data

Both variables must be categorical (nominal or ordinal). The chi-square test does not work with continuous data. If you have continuous measurements, you must first categorize them into groups (e.g., age → age ranges), though this results in a loss of information.
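Binning a continuous variable might look like the following sketch. The cut points, labels, and ages are hypothetical; in practice the groups should be chosen on substantive grounds before looking at the results.

```python
from bisect import bisect_right
from collections import Counter

AGE_BREAKS = [30, 50]                      # hypothetical cut points
AGE_LABELS = ["under 30", "30-49", "50+"]  # one more label than cut points

def age_group(age):
    """Map a continuous age to its category label."""
    return AGE_LABELS[bisect_right(AGE_BREAKS, age)]

ages = [22, 35, 47, 51, 29, 30, 68, 50]    # illustrative data only
counts = Counter(age_group(a) for a in ages)

for label in AGE_LABELS:
    print(label, counts[label])
```

The resulting counts per group are what go into the contingency table; the exact ages are no longer used, which is the loss of information mentioned above.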

2. Independent Observations

Each observation must be independent of all others. This means each participant or case contributes to only one cell in the contingency table. Repeated measures or matched pairs violate this assumption—use McNemar's test instead.

3. Expected Frequency ≥ 5

Ideally, every expected cell frequency should be 5 or greater. A common rule of thumb tolerates a few smaller cells, but when more than 20% of cells have expected frequencies below 5 (or any cell has an expected frequency below 1), the chi-square approximation becomes unreliable. In such cases, consider combining categories or using Fisher's exact test (for 2×2 tables).
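One way to screen a table before running the test is sketched below. The function names and the returned dictionary are assumptions made for this example; the 5 and 20% thresholds are the ones given above. The second table is hypothetical sparse data.

```python
def expected_counts(observed):
    """Expected cell counts under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def expected_count_check(observed, minimum=5, max_share=0.20):
    """Report the smallest expected count and the share of cells below `minimum`."""
    cells = [e for row in expected_counts(observed) for e in row]
    share_below = sum(e < minimum for e in cells) / len(cells)
    return {
        "smallest_expected": min(cells),
        "share_below_minimum": share_below,
        "approximation_ok": share_below <= max_share,
    }

worked_example = [[30, 10, 10], [15, 20, 15]]
sparse_table = [[3, 7], [9, 21]]  # hypothetical small sample

print(expected_count_check(worked_example))
print(expected_count_check(sparse_table))
```

The sparse 2×2 table has one of its four expected counts below 5 (25% of cells), so it fails the check and Fisher's exact test would be the safer choice.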

4. Random Sampling

Data should be collected through random sampling or random assignment to ensure the sample is representative of the population. Convenience or biased samples can lead to misleading results regardless of what the test shows.

Understanding Cramer's V Effect Size

While the p-value tells you whether an association is statistically significant, Cramer's V tells you how strong the association is. This is critical because with large sample sizes, even trivial associations can reach statistical significance. Cramer's V ranges from 0 (no association) to 1 (perfect association), and its interpretation depends on the size of the table, specifically on df*, the smaller of rows − 1 and columns − 1 (not the degrees of freedom of the test itself, which are (rows − 1) × (columns − 1)):

Effect Size	df* = 1	df* = 2	df* = 3	df* ≥ 4
Small	.10	.07	.06	.05
Medium	.30	.21	.17	.15
Large	.50	.35	.29	.25

*df* = min(rows − 1, columns − 1). For our worked example above (2×3 table), df* = 1, so V = .31 represents a medium effect.
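The lookup in the table above can be expressed as a small helper. This is a sketch: the function name and the "negligible" label for values under the small threshold are assumptions added for the example, while the thresholds are copied from the table.

```python
# (small, medium, large) thresholds for Cramer's V, keyed by df* (4 stands for df* >= 4).
THRESHOLDS = {
    1: (0.10, 0.30, 0.50),
    2: (0.07, 0.21, 0.35),
    3: (0.06, 0.17, 0.29),
    4: (0.05, 0.15, 0.25),
}

def interpret_cramers_v(v, n_rows, n_cols):
    """Label an effect size using df* = min(rows - 1, columns - 1)."""
    df_star = min(n_rows - 1, n_cols - 1)
    if df_star < 1:
        raise ValueError("the table needs at least 2 rows and 2 columns")
    small, medium, large = THRESHOLDS[min(df_star, 4)]
    if v >= large:
        return "large"
    if v >= medium:
        return "medium"
    if v >= small:
        return "small"
    return "negligible"  # label assumed for this sketch; the table does not name it

print(interpret_cramers_v(0.30, 2, 3))  # 2x3 table, df* = 1
print(interpret_cramers_v(0.30, 4, 4))  # 4x4 table, df* = 3
```

The same V of .30 reads as medium in a 2×3 table but as large in a 4×4 table, which is why the table size matters when interpreting it.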

How to Report Chi-Square Results in APA Format

According to APA 7th edition guidelines, chi-square results should include the chi-square statistic, degrees of freedom, sample size, p-value, and an effect size measure. Here is a template and a real example:

Template

A chi-square test of independence was conducted to examine the relationship between [Variable 1] and [Variable 2]. The relation between these variables was [significant/not significant], χ²(df, N = XX) = X.XX, p = .XXX, Cramer's V = .XX.

Real Example (from the worked example above)

A chi-square test of independence was conducted to examine the relationship between gender and product preference. The relation between these variables was significant, χ²(2, N = 100) = 9.33, p = .009, Cramer's V = .31. Males showed a notably higher preference for Product A (60%) compared to females (30%), while females were more evenly distributed across all three products.

Note: Report χ² values to two decimal places. Report p-values to three decimal places, except use p < .001 when the value is below .001. Always include an effect size measure (Cramer's V for independence tests).
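The formatting rules in the note can be captured in a short helper. This is a sketch, not the calculator's own code; the function names are hypothetical, and the numbers passed in at the end are placeholder values chosen only to exercise the formatter.

```python
def format_p(p):
    """Three decimals without a leading zero, or 'p < .001' for very small values."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def format_v(v):
    """Two decimals without a leading zero."""
    return f"{v:.2f}".lstrip("0")

def apa_chi_square(chi_square, df, n, p, v):
    """Build the APA-style result string for a test of independence."""
    return (
        f"χ²({df}, N = {n}) = {chi_square:.2f}, "
        f"{format_p(p)}, Cramer's V = {format_v(v)}"
    )

# Placeholder inputs for illustration only.
print(apa_chi_square(12.3456, 3, 250, 0.0063, 0.2222))
print(format_p(0.0004))
```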

Common Mistakes to Avoid

Using chi-square with small expected frequencies: When expected cell counts fall below 5, the chi-square approximation is unreliable. Use Fisher's exact test for 2×2 tables, or combine categories to increase expected counts in larger tables.
Entering percentages instead of raw counts: The chi-square test requires actual frequency counts, not percentages or proportions. Using percentages will produce incorrect results because the test needs to know the actual sample size.
Ignoring effect size: A statistically significant chi-square result with a tiny Cramer's V (e.g., .05) may not be practically meaningful. With large samples, even trivial associations become "significant." Always report and interpret Cramer's V.
Violating independence of observations: Each participant should contribute to only one cell. If the same person appears in multiple cells (e.g., repeated measures), the chi-square test is invalid. Use McNemar's test for paired data.
Confusing the two types of chi-square tests: The test of independence (two variables in a contingency table) and the goodness-of-fit test (one variable vs. expected proportions) answer different questions. Make sure you select the correct test for your research question.
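Two of these mistakes are easy to demonstrate numerically. The sketch below, with illustrative function names, computes the statistic for the worked example's raw counts, for the same data entered as row percentages, and for the same proportions at ten times the sample size.

```python
import math

def chi_square_stat(table):
    """Pearson chi-square statistic for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_totals)
        for obs, c in zip(row, col_totals)
    )

def cramers_v(table):
    """Cramer's V = sqrt(chi-square / (N * min(rows - 1, columns - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square_stat(table) / (n * k))

counts = [[30, 10, 10], [15, 20, 15]]           # raw counts, N = 100
row_percentages = [[60, 20, 20], [30, 40, 30]]  # same data entered as row percentages
ten_times = [[10 * x for x in row] for row in counts]  # same proportions, N = 1000

for name, table in [("counts", counts), ("percentages", row_percentages), ("N x 10", ten_times)]:
    print(f"{name:12s} chi-square = {chi_square_stat(table):7.2f}  V = {cramers_v(table):.2f}")
```

Entering percentages makes the test treat the data as 200 observations, which doubles the statistic here. Multiplying the sample size by ten multiplies the statistic by ten while Cramer's V stays the same, which is why the effect size has to be read alongside the p-value.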

Calculation Accuracy

StatMate's chi-square calculations have been validated against R's chisq.test() function and SPSS output. We use the jstat library for chi-square probability distributions and compute expected frequencies, degrees of freedom, and Cramer's V following standard statistical formulas. All results match R output to at least 4 decimal places.
