A non-parametric alternative to the independent (unpaired) t-test. It compares two independent groups without assuming a normal distribution.
The Mann-Whitney U test (also known as the Wilcoxon rank-sum test) is a non-parametric statistical test used to compare the distributions of two independent groups. Unlike the independent samples t-test, the Mann-Whitney U test does not assume that the data are normally distributed, making it ideal for ordinal data, skewed distributions, or small sample sizes where normality cannot be verified. It was developed by Henry B. Mann and Donald R. Whitney in 1947 and is one of the most widely used non-parametric tests in behavioral science, medicine, and social research.
The Mann-Whitney U test is the non-parametric alternative to the independent samples t-test. Use it when one or more of the following conditions apply: your data are measured on an ordinal scale (e.g., Likert-type items), the assumption of normality is violated, your sample sizes are very small (e.g., n < 15 per group), or your data contain outliers that would distort parametric results. It is especially common in clinical trials, quality-of-life research, and educational studies where rating scales are used.
| Feature | Mann-Whitney U | Independent t-test |
|---|---|---|
| Type | Non-parametric | Parametric |
| Data level | Ordinal or continuous | Continuous (interval/ratio) |
| Normality required | No | Yes (or large n) |
| Compares | Rank distributions | Means |
| Effect size | Rank-biserial r | Cohen's d |
| Robustness to outliers | High | Low |
A researcher compares outcome scores between patients receiving a new treatment (Group 1) and a placebo (Group 2). Because the sample is small (n = 8 per group) and normality cannot be verified, a Mann-Whitney U test is appropriate.
Group 1 — Treatment (n=8)
85, 72, 91, 68, 77, 95, 83, 89
Mdn = 84.0
Group 2 — Placebo (n=8)
65, 78, 71, 62, 73, 69, 75, 67
Mdn = 70.0
Results
U = 9.0, z = -2.36, p = .018, r = 0.72
The treatment group had significantly higher scores than the placebo group, with a large effect size (rank-biserial r = 0.72). Here z comes from the normal approximation with continuity correction, and p is two-tailed.
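The calculation can be reproduced from the two listed samples with a short script. This is a minimal sketch, not StatMate's actual code: it assumes the normal approximation with continuity correction and a two-tailed p-value, applies no tie correction to the variance (the sample has no ties), and the function name `mann_whitney_u` is made up for illustration.

```python
import math

def mann_whitney_u(x, y):
    """Return (U, z, p, r) for two independent samples.

    Assumptions of this sketch: normal approximation with continuity
    correction, two-tailed p, no tie correction in the variance.
    """
    n1, n2 = len(x), len(y)
    # Count the pairs in which x outranks y; a tie counts as half a win.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = n1 * n2 - u_x
    u = min(u_x, u_y)  # the smaller U is the one conventionally reported

    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    # The continuity correction moves U half a unit toward its mean.
    z = min(0.0, (u - mean_u + 0.5) / sd_u)
    # Two-tailed p from the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    # Signed rank-biserial r: positive when x tends to rank higher.
    r = (u_x - u_y) / (n1 * n2)
    return u, z, p, r

treatment = [85, 72, 91, 68, 77, 95, 83, 89]
placebo = [65, 78, 71, 62, 73, 69, 75, 67]

u, z, p, r = mann_whitney_u(treatment, placebo)
print(f"U = {u:.1f}, z = {z:.2f}, p = {p:.3f}, r = {r:.2f}")
# prints: U = 9.0, z = -2.36, p = 0.018, r = 0.72
```

Counting pairs directly keeps the logic visible. For larger samples, summing ranks is the usual and faster route to the same U.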
While the Mann-Whitney U test is less restrictive than the t-test, it still has assumptions that should be verified:
1. Ordinal or Continuous Data
The dependent variable must be measured on at least an ordinal scale (i.e., values can be meaningfully ranked). This includes Likert scales, test scores, reaction times, and any continuous measure.
2. Independent Groups
The two groups must be independent of each other. Each observation belongs to only one group, and participants in one group do not influence participants in the other. For paired/matched data, use the Wilcoxon signed-rank test instead.
3. Independent Observations
Observations within each group must be independent. Repeated measures or clustered data violate this assumption and require different analytical approaches.
4. Similar Distribution Shape (for median comparison)
If you want to interpret the test as comparing medians, the two groups should have similarly shaped distributions (differing only in location). Without this assumption, the test compares the overall rank distributions rather than medians specifically.
The rank-biserial correlation (r) is the recommended effect size measure for the Mann-Whitney U test. It ranges from -1 to +1 and equals the proportion of between-group pairs in which one group's observation ranks higher (favorable comparisons) minus the proportion in which the other group's observation ranks higher (unfavorable comparisons).
| Absolute value of r | Interpretation | Practical Meaning |
|---|---|---|
| < 0.1 | Negligible | Groups are nearly identical in rank |
| 0.1 to < 0.3 | Small | Slight tendency for one group to rank higher |
| 0.3 to < 0.5 | Medium | Noticeable separation between groups |
| ≥ 0.5 | Large | Strong separation, most of one group outranks the other |
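The verbal definition of the rank-biserial correlation can be written as a formula. The symbols here are notation introduced for illustration and do not come from the original text: $n_{\text{fav}}$ and $n_{\text{unfav}}$ count the between-group pairs won by the higher-ranking and lower-ranking group, $n_1$ and $n_2$ are the group sizes, and $U$ is the smaller of the two U statistics.

```latex
r = \frac{n_{\text{fav}} - n_{\text{unfav}}}{n_1 n_2}
  = 1 - \frac{2U}{n_1 n_2},
\qquad U = \min(U_1, U_2)
```

For the treatment and placebo samples listed in the example, a treatment value exceeds a placebo value in 55 of the 8 × 8 = 64 pairs and falls below it in 9. This gives r = (55 − 9) / 64 ≈ 0.72, which falls in the "Large" band of the table.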
According to APA 7th edition guidelines, report the U statistic, z value, p-value, effect size, and descriptive statistics (medians and sample sizes) for each group:
Example Report
A Mann-Whitney U test indicated that scores for the treatment group (Mdn = 84.0, n = 8) were significantly higher than for the placebo group (Mdn = 70.0, n = 8), U = 9.0, z = -2.36, p = .018, r = .72.
Note: Report U to one decimal place, z to two decimal places, and p to three decimal places. Use p < .001 when the value is below .001. Always include the rank-biserial r as the effect size measure.
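The rounding rules in the note can be applied mechanically. The sketch below is one possible formatter, not part of StatMate. The helper names `format_p` and `format_report` are hypothetical, and the input values in the usage lines are arbitrary numbers chosen only to exercise the formatting.

```python
def format_p(p):
    """Format p in APA style: three decimals, no leading zero, floor at .001."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def format_report(u, z, p, r):
    """Build the statistics portion of an APA-style sentence."""
    # r cannot exceed 1 in absolute value, so APA style drops the leading zero.
    r_text = f"{r:.2f}".replace("0.", ".", 1)
    return f"U = {u:.1f}, z = {z:.2f}, {format_p(p)}, r = {r_text}"

# Arbitrary illustrative inputs, not results from the example above.
print(format_report(12.5, -2.05, 0.0412, 0.61))
# prints: U = 12.5, z = -2.05, p = .041, r = .61
print(format_report(0.0, -3.31, 0.0004, 1.0))
# prints: U = 0.0, z = -3.31, p < .001, r = 1.00
```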
StatMate's Mann-Whitney U calculations have been validated against R (the wilcox.test function) and SPSS output. The implementation computes the z-score from the normal approximation with continuity correction and uses the jStat library for probability distributions. Tied values are handled with the average rank method: each tied value receives the mean of the ranks the tied values would otherwise occupy. All results match R output to at least 4 decimal places.
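To illustrate the average rank method, here is a minimal Python sketch. It is an assumption-laden illustration rather than StatMate's code (which, per the text, is JavaScript-based). The function names `average_ranks` and `u_from_ranks` are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def u_from_ranks(x, y):
    """U statistic for x, computed from the rank sum of the pooled sample."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2

print(average_ranks([3, 1, 3, 2]))   # prints: [3.5, 1.0, 3.5, 2.0]
print(u_from_ranks([3, 5], [1, 3]))  # prints: 3.5
```

In the second call the two 3s share ranks 2 and 3, so each gets 2.5. The resulting U of 3.5 matches a direct pairwise count in which the tied pair contributes one half.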