What is the correct APA format for reporting a Kruskal-Wallis test?

The standard format is: H(df) = X.XX, p = .XXX, epsilon-squared = .XX. For example: H(2) = 18.42, p < .001, epsilon-squared = .31. Include the test justification, descriptive statistics with medians and IQRs, and post-hoc pairwise comparisons when the omnibus test is significant.

What descriptive statistics should I report alongside Kruskal-Wallis results?

Report medians (Mdn) and interquartile ranges (IQR) as your primary descriptive statistics. You may also report mean ranks if they help clarify the group ordering. Avoid reporting means and standard deviations as the sole descriptive statistics because they assume a symmetric distribution, which contradicts your rationale for choosing a nonparametric test.

How do I calculate epsilon squared for a Kruskal-Wallis test?

Use the formula epsilon-squared = H / (N - 1), where H is the test statistic and N is the total sample size. Interpret the result using these benchmarks: .01 = small effect, .06 = medium effect, .14 = large effect. For example, with H = 18.42 and N = 60, epsilon-squared = 18.42 / 59 = .31 (large effect).

When should I use Dunn's test vs pairwise Mann-Whitney U tests?

Dunn's test is preferred because it uses the same ranking from the original Kruskal-Wallis omnibus test, maintaining statistical consistency. Pairwise Mann-Whitney U tests re-rank the data for each comparison, which can lead to different rank orderings. Both require a correction for multiple comparisons (Bonferroni or Holm), but Dunn's test is the more methodologically sound choice.

What is the difference between Kruskal-Wallis and Friedman tests?

The Kruskal-Wallis test compares three or more independent groups (different participants in each group), while the Friedman test compares three or more related groups (the same participants measured under all conditions). The Kruskal-Wallis test is the nonparametric equivalent of the one-way ANOVA; the Friedman test is the nonparametric equivalent of repeated measures ANOVA.

How to Report a Kruskal-Wallis Test in APA 7th Edition: Effect Sizes, Post Hoc Tests, and Reporting Examples

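Why the Kruskal-Wallis Test Matters

The Kruskal-Wallis H test is one of the most widely used nonparametric methods in the social and health sciences. As a rank-based alternative to the one-way ANOVA, it compares the distributions of a continuous or ordinal variable across three or more independent groups without requiring the normality assumption.

This matters because real-world data frequently violate the assumptions that parametric tests require. Pain ratings on a 0-10 scale are often skewed. Likert-type survey responses are inherently ordinal. Clinical outcome measures from small-sample pilot studies rarely produce textbook normal distributions.

Despite the test's popularity, many researchers struggle to report it. APA 7th edition sets clear expectations for a Kruskal-Wallis report: the H statistic with its degrees of freedom, an exact p value, an appropriate effect size, and, when the omnibus test is significant, post hoc pairwise comparisons with a correction for multiple testing. This guide walks through every element of a complete Kruskal-Wallis report, from justification to post hoc analysis, with copy-and-paste APA templates you can adapt to your own data.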

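When to Use the Kruskal-Wallis Test vs. One-Way ANOVA

Non-normal distributions in three or more groups

The one-way ANOVA assumes that the dependent variable is approximately normally distributed within each group. If the Shapiro-Wilk test gives p < .05 in one or more groups, or if Q-Q plots show substantial departures from normality, the Kruskal-Wallis test is an appropriate alternative.

Ordinal dependent variables

When the outcome is measured on an ordinal scale, such as a Likert item, a pain severity rating, or a satisfaction category, the Kruskal-Wallis test is the correct choice regardless of the shape of the distribution.

Unequal variances and outliers

Even when the distributions are approximately normal, the Kruskal-Wallis test can be preferable if the variances differ markedly across groups (Levene's test p < .05) or if extreme outliers are present. Because the test converts raw scores to ranks, it compresses the influence of extreme values and is relatively resistant to both problems.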

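Decision flowchart

Three or more independent groups? If no, use the Mann-Whitney U test (two groups) or the Wilcoxon signed-rank test (paired data).
Ordinal dependent variable? Yes → Kruskal-Wallis.
Normality assumption met in every group? Check with Shapiro-Wilk. Violated in any group → Kruskal-Wallis.
Homogeneity of variance met? Check with Levene's test. Violated → Kruskal-Wallis (or Welch's ANOVA as a parametric alternative).
Severe outliers? Yes → Kruskal-Wallis.
All assumptions met? → Use the one-way ANOVA for its greater statistical power.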
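
The flowchart can be read as a chain of checks applied from top to bottom. The sketch below is one possible encoding, not a substitute for judgment: the function name `choose_test` and its boolean parameters are hypothetical, and the Friedman branch for three or more related groups is borrowed from the Kruskal-Wallis vs. Friedman comparison later in this guide rather than from the flowchart itself.

```python
def choose_test(n_groups, independent, ordinal_dv,
                normality_ok, equal_variances, severe_outliers):
    """Suggest a test by walking the decision flowchart top to bottom."""
    # Step 1: Kruskal-Wallis only applies to three or more independent groups.
    if not independent:
        return "Friedman" if n_groups >= 3 else "Wilcoxon signed-rank"
    if n_groups < 3:
        return "Mann-Whitney U"
    # Step 2: an ordinal outcome goes straight to Kruskal-Wallis.
    if ordinal_dv:
        return "Kruskal-Wallis"
    # Step 3: normality violated in any group (e.g. Shapiro-Wilk p < .05).
    if not normality_ok:
        return "Kruskal-Wallis"
    # Step 4: unequal variances (e.g. Levene's test p < .05).
    if not equal_variances:
        return "Kruskal-Wallis (or Welch's ANOVA)"
    # Step 5: severe outliers.
    if severe_outliers:
        return "Kruskal-Wallis"
    # Step 6: every assumption holds, so prefer the more powerful test.
    return "One-way ANOVA"


# Example: three independent groups, continuous outcome, normality violated.
print(choose_test(3, True, False, False, True, False))
```

The order of the checks mirrors the flowchart, so the first violated assumption decides the outcome.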

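Basic APA Format for Kruskal-Wallis

The standard APA 7th edition template for reporting Kruskal-Wallis results is:

H(df) = X.XX, p = .XXX, ε² = .XX

What each element means:

H: the Kruskal-Wallis test statistic, which approximately follows a chi-square distribution
df: degrees of freedom, equal to the number of groups minus 1 (k - 1)
p: the exact p value to three decimal places; report p < .001 when the value is below .001
ε²: epsilon squared, the most commonly reported effect size for the Kruskal-Wallis test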
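
If you build result strings programmatically, the template translates into a small formatter. This is a minimal sketch; the helper names `strip_leading_zero` and `format_kruskal_apa` are hypothetical, and the leading zero is dropped for p and ε² because neither can exceed 1.

```python
def strip_leading_zero(value, decimals):
    """Format a statistic bounded by 1 without its leading zero (APA style)."""
    text = f"{value:.{decimals}f}"
    return text[1:] if text.startswith("0.") else text


def format_kruskal_apa(h, df, p, epsilon_sq):
    """Build the string H(df) = X.XX, p = .XXX, ε² = .XX."""
    # APA reports very small p values as an inequality, not as .000.
    p_text = "p < .001" if p < 0.001 else f"p = {strip_leading_zero(p, 3)}"
    return f"H({df}) = {h:.2f}, {p_text}, ε² = {strip_leading_zero(epsilon_sq, 2)}"


print(format_kruskal_apa(18.42, 2, 0.0001, 0.31))  # H(2) = 18.42, p < .001, ε² = .31
print(format_kruskal_apa(3.17, 2, 0.205, 0.05))    # H(2) = 3.17, p = .205, ε² = .05
```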

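Reporting Kruskal-Wallis: Step by Step

Research scenario

Imagine a clinical study comparing pain ratings (0-10 numeric rating scale) across three treatment groups: placebo (n = 20), Drug A (n = 20), and Drug B (n = 20). The researchers chose the Kruskal-Wallis test because pain ratings violated the normality assumption in two of the three groups (Shapiro-Wilk p = .008 and p = .021).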

Step 1: Report descriptive statistics with medians and IQRs

| Group | n | Mdn | IQR |
|-------|---|-----|-----|
| Placebo | 20 | 7.00 | 5.25-8.00 |
| Drug A | 20 | 5.00 | 3.00-6.75 |
| Drug B | 20 | 3.50 | 2.00-5.00 |

Median pain ratings were highest in the placebo group (Mdn = 7.00, IQR = 5.25-8.00), followed by Drug A (Mdn = 5.00, IQR = 3.00-6.75) and Drug B (Mdn = 3.50, IQR = 2.00-5.00).
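
Medians and quartiles like these can be computed with Python's standard library. The sketch below uses made-up ratings, not the study data, and the `describe` helper is hypothetical. Quartile conventions differ between software packages, so the `method="inclusive"` choice here is an assumption; state which convention you used if it affects the reported IQR.

```python
from statistics import median, quantiles


def describe(scores):
    """Return n, the median, and the first and third quartiles."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return {"n": len(scores), "Mdn": median(scores), "IQR": (q1, q3)}


# Hypothetical ratings for one group (illustration only).
ratings = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = describe(ratings)
print(summary)  # {'n': 9, 'Mdn': 5, 'IQR': (3.0, 7.0)}
```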

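Step 2: Report the omnibus H test (significant result)

A Kruskal-Wallis H test was conducted to compare pain ratings across the three treatment groups. A nonparametric test was chosen because pain ratings violated the normality assumption in the placebo and Drug B groups (Shapiro-Wilk p = .008 and p = .021). The test showed a statistically significant difference in pain ratings across groups, H(2) = 18.42, p < .001, ε² = .31.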
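
For readers who want to see where H and its p value come from, here is a standard-library sketch of the usual textbook computation: pool and rank all observations (ties share their average rank), combine the group rank sums, divide by the tie correction, and refer H to a chi-square distribution with k - 1 degrees of freedom. The function names and the toy data are hypothetical; in practice a statistics package would normally do this step.

```python
import math
from itertools import chain


def average_ranks(values):
    """Rank values 1..N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks, tie_sizes, i = [0.0] * len(values), [], 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def chi2_sf(x, df):
    """Upper-tail chi-square probability for a positive integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:  # even df: finite Poisson-type sum
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # odd df: normal tail plus a finite series
    total = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * i + 1)
    return total


def kruskal_wallis(groups):
    """Return (H, df, p) with the tie correction applied."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    # Tie correction; it is zero (undefined H) only if every value is identical.
    h /= 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    df = len(groups) - 1
    return h, df, chi2_sf(h, df)


# Toy data (illustration only, not the pain-rating study).
h, df, p = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H({df}) = {h:.2f}, p = {p:.3f}")  # H(2) = 7.20, p = 0.027
```

The printed p value keeps Python's leading zero; strip it when pasting into an APA sentence.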

Step 3: Report a nonsignificant result

A Kruskal-Wallis H test showed no statistically significant difference in anxiety scores across the three treatment groups, H(2) = 3.17, p = .205, ε² = .05.

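Effect Size: Epsilon Squared

How to calculate it

The formula for epsilon squared is:

ε² = H / (N - 1)

where H is the Kruskal-Wallis test statistic and N is the total sample size across all groups.

Worked example: with H = 18.42 and N = 60 (three groups of 20):

ε² = 18.42 / (60 - 1) = 18.42 / 59 = .31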

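Interpretation benchmarks

| ε² | Interpretation |
|-----|----------------|
| .01 | Small effect |
| .06 | Medium effect |
| .14 | Large effect |

In this example, ε² = .31 is a large effect, indicating that treatment group membership accounts for approximately 31% of the variability in the ranks of the pain ratings.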
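
The formula and the benchmarks are simple enough to script. In this sketch the function names are hypothetical, and the "negligible" label for values below .01 is an assumption added for completeness; the benchmark table itself names only small, medium, and large.

```python
def epsilon_squared(h, n_total):
    """Epsilon squared for a Kruskal-Wallis test: H / (N - 1)."""
    return h / (n_total - 1)


def interpret_epsilon_squared(eps):
    """Map epsilon squared onto the .01 / .06 / .14 benchmarks."""
    if eps >= 0.14:
        return "large"
    if eps >= 0.06:
        return "medium"
    if eps >= 0.01:
        return "small"
    return "negligible"  # assumed label; not part of the benchmark table


eps = epsilon_squared(18.42, 60)
print(round(eps, 2), interpret_epsilon_squared(eps))  # 0.31 large
```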

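Post Hoc Tests: Dunn's Test with Bonferroni Correction

A significant Kruskal-Wallis result indicates that at least one group differs from at least one other group, but it does not say which. With three or more groups, pairwise post hoc comparisons are needed to locate the differences.

When to run Dunn's test

Post hoc comparisons are appropriate only when the omnibus Kruskal-Wallis test is significant (p < .05).

Dunn's test

Dunn's test is the standard post hoc procedure for the Kruskal-Wallis test. It compares every possible pair of groups using the mean ranks (rank sums divided by group size) from the original omnibus ranking.

With k groups, the number of pairwise comparisons is k(k - 1) / 2. Three groups yield three comparisons.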
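
As a sketch of how each pairwise z statistic is formed, the code below applies the commonly cited Dunn formula: the difference in mean ranks divided by a standard error built from N, the two group sizes, and a tie term. The helper name `dunn_z`, the group labels, and the mean ranks are hypothetical; the mean ranks are assumed to come from the pooled omnibus ranking.

```python
import math
from itertools import combinations


def dunn_z(mean_rank_i, mean_rank_j, n_i, n_j, n_total, tie_sizes=()):
    """Dunn's z for one pair, using mean ranks from the pooled ranking."""
    # Tie term: each block of t tied values contributes t^3 - t.
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (12 * (n_total - 1))
    variance = (n_total * (n_total + 1) / 12 - tie_term) * (1 / n_i + 1 / n_j)
    return (mean_rank_i - mean_rank_j) / math.sqrt(variance)


# Hypothetical mean ranks and group sizes (no ties assumed).
mean_ranks = {"A": 2.0, "B": 5.0, "C": 8.0}
sizes = {"A": 3, "B": 3, "C": 3}
n_total = sum(sizes.values())

results = {}
for a, b in combinations(mean_ranks, 2):  # k(k - 1) / 2 pairs
    z = dunn_z(mean_ranks[a], mean_ranks[b], sizes[a], sizes[b], n_total)
    p_unadjusted = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p
    results[(a, b)] = (z, p_unadjusted)

for pair, (z, p_unadjusted) in results.items():
    print(pair, f"z = {z:.2f}", f"unadjusted p = {p_unadjusted:.3f}")
```

The p values produced here are unadjusted; a multiple-comparison correction still has to be applied before reporting them.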

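Bonferroni correction

With three comparisons and alpha = .05, the adjusted threshold is .05 / 3 = .017. Equivalently, multiply each unadjusted p value by 3 (capping the result at 1) and compare the adjusted p value with .05.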
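
Both forms of the correction can be checked in a few lines. This is a minimal sketch with hypothetical function names; the z value of -2.54 is used purely as an illustration, and its two-sided p value comes from the standard normal distribution.

```python
import math


def n_pairwise(k):
    """Number of pairwise comparisons among k groups: k(k - 1) / 2."""
    return k * (k - 1) // 2


def two_sided_p(z):
    """Two-sided p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni_adjust(p, m):
    """Bonferroni-adjusted p value: multiply by m comparisons, cap at 1."""
    return min(1.0, p * m)


m = n_pairwise(3)
threshold = 0.05 / m                       # adjusted alpha per comparison
p_raw = two_sided_p(-2.54)
p_adj = bonferroni_adjust(p_raw, m)

print(m, round(threshold, 3))              # 3 0.017
print(round(p_raw, 3), round(p_adj, 3))    # 0.011 0.033
```

Comparing `p_raw` with `threshold` and comparing `p_adj` with .05 always lead to the same decision; report whichever form you choose consistently.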

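APA format for pairwise comparisons

Dunn's post hoc pairwise comparisons with Bonferroni correction showed that Drug B produced significantly lower pain ratings than placebo (z = -4.12, p < .001) and than Drug A (z = -2.54, p = .033). The difference between Drug A and placebo was not significant after correction (z = -2.08, p = .113). All p values are Bonferroni-adjusted.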

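Complete APA paragraph

A Kruskal-Wallis H test was conducted to examine differences in pain ratings (0-10 scale) across three treatment conditions: placebo (n = 20), Drug A (n = 20), and Drug B (n = 20). A nonparametric test was chosen because pain ratings violated the normality assumption in two groups (Shapiro-Wilk p = .008 and p = .021). Median pain ratings were 7.00 (IQR = 5.25-8.00) for placebo, 5.00 (IQR = 3.00-6.75) for Drug A, and 3.50 (IQR = 2.00-5.00) for Drug B. The Kruskal-Wallis test showed a statistically significant difference in pain ratings across groups, H(2) = 18.42, p < .001, ε² = .31. Dunn's post hoc pairwise comparisons with Bonferroni correction showed significant differences between Drug B and placebo (z = -4.12, p < .001) and between Drug B and Drug A (z = -2.54, p = .033), but not between Drug A and placebo (z = -2.08, p = .113).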

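Common Mistakes and How to Avoid Them

1. Reporting means instead of medians

This is the most common error in Kruskal-Wallis reports. You chose a nonparametric test because the distributions are non-normal. Reporting means and standard deviations as the primary descriptive statistics directly contradicts that rationale.

2. Running multiple Mann-Whitney tests instead of Kruskal-Wallis

With three groups, some researchers skip the omnibus test and run three separate Mann-Whitney U tests. This is methodologically incorrect and inflates the familywise Type I error rate.

3. Forgetting the Bonferroni correction in post hoc tests

Without a correction, each comparison uses the full alpha = .05, which inflates the familywise error rate. Always name the correction method and report adjusted p values.

4. Not reporting an effect size

APA 7th edition expects an effect size to accompany every inferential test. A p value alone does not tell the reader whether a difference is practically meaningful.

5. Omitting the justification for using a nonparametric test

Weak: "A Kruskal-Wallis test was conducted."

Strong: "Because pain ratings were not normally distributed in two of the three groups (Shapiro-Wilk p = .008 and p = .021), a Kruskal-Wallis H test was used."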

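Kruskal-Wallis vs. Friedman Test

| Feature | Kruskal-Wallis | Friedman |
|---------|----------------|----------|
| Design | Independent groups (between-subjects) | Repeated measures (within-subjects) |
| Participants | Different participants in each group | The same participants measured under all conditions |
| Parametric equivalent | One-way ANOVA | Repeated measures ANOVA |
| Ranking method | All observations ranked together | Ranked separately within each participant |
| Post hoc procedure | Dunn's test | Nemenyi test or Conover test |
| Effect size | Epsilon squared (ε²) | Kendall's W |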
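
The "Ranking method" row is the mechanical difference between the two tests, and a toy example makes it concrete. The scores and the `simple_ranks` helper below are hypothetical, and the helper assumes no tied values (ties would need average ranks).

```python
def simple_ranks(values):
    """Ranks 1..n for values with no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


# Hypothetical scores: 3 rows by 3 conditions.
scores = [[10, 20, 30],
          [40, 50, 60],
          [70, 80, 90]]

# Kruskal-Wallis style: pool every observation and rank them together.
pooled = [v for row in scores for v in row]
pooled_ranks = simple_ranks(pooled)

# Friedman style: treat each row as one participant and rank within the row.
within_ranks = [simple_ranks(row) for row in scores]

print(pooled_ranks)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(within_ranks)   # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
```

Within-participant ranking removes differences between participants, which is why the Friedman test suits repeated measures.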

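Frequently Asked Questions

What is the correct APA format for reporting a Kruskal-Wallis test?

The standard format is H(df) = X.XX, p = .XXX, ε² = .XX. For example: H(2) = 18.42, p < .001, ε² = .31. Include the justification for the test, descriptive statistics with medians and IQRs, and post hoc pairwise comparisons when the omnibus test is significant.

What descriptive statistics should I report alongside Kruskal-Wallis results?

Report medians (Mdn) and interquartile ranges (IQR) as your primary descriptive statistics. Avoid reporting means and standard deviations as the sole descriptive statistics.

How do I calculate epsilon squared for a Kruskal-Wallis test?

Use the formula ε² = H / (N - 1). Benchmarks: .01 = small effect, .06 = medium effect, .14 = large effect.

Should I use Dunn's test or pairwise Mann-Whitney U tests?

Dunn's test is recommended because it uses the same ranking as the original Kruskal-Wallis omnibus test, which keeps the analysis statistically consistent. Pairwise Mann-Whitney U tests re-rank the data for each comparison, which can produce different rank orderings.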

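Accurate Calculations

StatMate's free Kruskal-Wallis calculator automates the entire process:

Enter your group data to get the H statistic, exact p value, and epsilon squared instantly
Dunn's post hoc test with Bonferroni correction runs automatically when the omnibus test is significant
An APA-formatted results paragraph you can copy with one click
Free PDF export of the complete analysis, including effect sizes
Visual box plots comparing the distribution of each group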
