Non-parametric alternative to repeated measures ANOVA. Compare ranks across three or more related conditions measured on the same subjects.
The Friedman test is a non-parametric statistical test for detecting differences across three or more related groups (repeated measures). Developed by Milton Friedman in 1937, it ranks each subject's observations across conditions and tests whether the mean ranks differ significantly among conditions. It is widely used in medicine, psychology, and education for pre-post-follow-up designs and within-subject experiments.
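For readers who want the computation spelled out, the statistic is usually written as follows when no subject has tied values. The symbols are introduced here for illustration and are not defined elsewhere on this page: n is the number of subjects, k the number of conditions, and R_j the sum of the within-subject ranks for condition j.

```latex
\chi^2_F = \frac{12}{n\,k\,(k+1)} \sum_{j=1}^{k} R_j^{2} \;-\; 3\,n\,(k+1),
\qquad \mathrm{df} = k - 1
```

Under the null hypothesis of no difference among conditions, this quantity is approximately chi-square distributed with k - 1 degrees of freedom.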
Use the Friedman test when you have a repeated measures or matched design with three or more conditions and one or more of the following apply: your data are measured on an ordinal scale, the assumption of normality is violated, your sample sizes are small, or your data contain outliers. Common applications include comparing treatment effects over time, evaluating product preferences from the same judges, and analyzing questionnaire responses measured at multiple time points.
| Feature | Friedman Test | RM ANOVA |
|---|---|---|
| Type | Non-parametric | Parametric |
| Data level | Ordinal or continuous | Continuous (interval/ratio) |
| Normality required | No | Yes (or large n) |
| Design | Repeated measures / matched | Repeated measures / matched |
| Effect size | Kendall's W | Partial η² |
| Post-hoc test | Nemenyi / Bonferroni | Bonferroni pairwise |
A researcher measures pain levels of 10 patients at three time points: before treatment, 1 week after, and 4 weeks after. Since pain ratings are ordinal and the design is repeated measures, a Friedman test is appropriate.
Baseline (n=10)
72, 85, 91, 68, 77, 83, 95, 88, 74, 79
Mdn = 81.0
1 Week (n=10)
78, 89, 95, 73, 82, 87, 98, 92, 79, 83
Mdn = 85.0
4 Weeks (n=10)
82, 93, 99, 78, 86, 91, 102, 96, 84, 88
Mdn = 89.5
Results
χ²(2) = 20.00, p < .001, W = 1.00
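There was a significant difference across time points, with the largest possible effect size (W = 1.00): every patient's score rose at each successive time point, so all 10 patients ranked the three time points identically. Post-hoc comparisons revealed significant increases from baseline to both follow-up time points.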
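The figures above can be checked by hand. The following is a minimal sketch using only the Python standard library; it is not StatMate's implementation. The names `chi2_sf` and `friedman` are hypothetical, the scores are assumed to be listed in the same patient order at each time point, and the sketch deliberately rejects tied scores within a subject rather than applying a tie correction.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability, built up by the recurrence on df."""
    if x <= 0:
        return 1.0
    half = x / 2
    # Closed-form starting points: df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        sf, nu = math.exp(-half), 2
    else:
        sf, nu = math.erfc(math.sqrt(half)), 1
    # Each step from nu to nu + 2 adds one extra term to the tail probability.
    while nu < df:
        sf += half ** (nu / 2) * math.exp(-half) / math.gamma(nu / 2 + 1)
        nu += 2
    return sf

def friedman(*conditions):
    """Return (chi2, p, W). Assumes complete data and no ties within a subject."""
    k, n = len(conditions), len(conditions[0])
    rank_sums = [0] * k
    for subject in zip(*conditions):
        if len(set(subject)) != k:
            raise ValueError("tied scores within a subject are not handled in this sketch")
        # Rank this subject's scores from 1 (lowest) to k (highest).
        order = sorted(range(k), key=lambda j: subject[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    chi2 = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    w = chi2 / (n * (k - 1))  # Kendall's W derived from the Friedman statistic
    return chi2, chi2_sf(chi2, k - 1), w

baseline = [72, 85, 91, 68, 77, 83, 95, 88, 74, 79]
week_1 = [78, 89, 95, 73, 82, 87, 98, 92, 79, 83]
week_4 = [82, 93, 99, 78, 86, 91, 102, 96, 84, 88]

chi2, p, w = friedman(baseline, week_1, week_4)
print(f"chi2(2) = {chi2:.2f}, p = {p:.6f}, W = {w:.2f}")
```

Because every patient's scores increase from one time point to the next, the rank sums are 10, 20, and 30, which gives χ²(2) = 20.00 and W = 1.00, in line with the reported results.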
While the Friedman test is less restrictive than repeated measures ANOVA, it still has assumptions:
1. Ordinal or Continuous Data
The dependent variable must be measured on at least an ordinal scale so that values can be meaningfully ranked within each subject.
2. Related Groups (Repeated Measures)
The same subjects must be measured under all conditions. For independent groups, use the Kruskal-Wallis H test instead.
3. Equal Sample Sizes
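Each condition must have the same number of observations because each subject contributes exactly one observation per condition. A subject with a missing value in any condition cannot be ranked across conditions and is typically excluded from the analysis.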
4. Random Sample
Subjects should be randomly selected from the population of interest. Non-random selection may limit the generalizability of results.
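Only the structural assumptions (related groups with one observation per subject per condition) can be checked mechanically; measurement level and random sampling are matters of study design. As a rough illustration, the hypothetical helper below (`check_friedman_input` is not part of any library) inspects a list of conditions before a test is run.

```python
import math

def check_friedman_input(conditions):
    """Return a list of problems; an empty list means the layout is usable."""
    problems = []
    if len(conditions) < 3:
        problems.append("need at least three related conditions")
    # Every condition should hold one value per subject, so lengths must match.
    if len({len(c) for c in conditions}) > 1:
        problems.append("conditions have unequal numbers of observations")
    for idx, cond in enumerate(conditions, start=1):
        if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in cond):
            problems.append(f"condition {idx} contains missing values")
    return problems

print(check_friedman_input([[1, 2, 3], [2, 3, 4], [3, 4]]))
```

With two conditions only, the Wilcoxon signed-rank test is the usual choice instead.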
Kendall's W (coefficient of concordance) is the effect size for the Friedman test. It ranges from 0 to 1, where 0 indicates no agreement among subjects in how they rank the conditions and 1 indicates that every subject ranks the conditions identically.
| W | Interpretation | Practical Meaning |
|---|---|---|
| < 0.1 | Negligible | Conditions are nearly identical |
| 0.1 - 0.3 | Small | Slight consistent difference across conditions |
| 0.3 - 0.5 | Medium | Noticeable and consistent pattern |
| > 0.5 | Large | Strong consistent difference across conditions |
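Kendall's W can be obtained directly from the Friedman statistic. In the expression below, n (number of subjects) and k (number of conditions) are symbols introduced here for illustration; the right-hand side plugs in the pain example from this page (10 patients, 3 time points).

```latex
W = \frac{\chi^2}{n\,(k-1)}
\qquad\Longrightarrow\qquad
W = \frac{20.00}{10 \times (3-1)} = 1.00
```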
According to APA 7th edition guidelines, report the chi-square statistic, degrees of freedom, p-value, and Kendall's W:
Example Report
A Friedman test indicated a statistically significant difference in pain levels across the three time points, χ²(2) = 20.00, p < .001, W = 1.00. Post-hoc pairwise comparisons with Bonferroni correction revealed significant increases from baseline (Mdn = 81.0) to both 1 week (Mdn = 85.0) and 4 weeks (Mdn = 89.5).
Note: Report χ² to two decimal places, degrees of freedom as an integer, and p to three decimal places. Use p < .001 when the value is below .001. Always include Kendall's W as the effect size measure.
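The formatting rules in the note can be applied programmatically. The sketch below is one possible way to do it; `format_p` and `apa_friedman` are hypothetical helper names, and dropping the leading zero of p is an assumption based on the style of the example report.

```python
def format_p(p):
    """Three decimals without a leading zero; values below .001 become 'p < .001'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def apa_friedman(chi2, df, p, w):
    """Assemble the statistic, degrees of freedom, p-value, and Kendall's W."""
    return f"χ²({df}) = {chi2:.2f}, {format_p(p)}, W = {w:.2f}"

print(apa_friedman(20.0, 2, 4.54e-05, 1.0))
# χ²(2) = 20.00, p < .001, W = 1.00
```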
StatMate's Friedman test calculations have been validated against R (the friedman.test function) and SPSS output. The implementation uses the chi-square approximation for the p-value and the jStat library for probability distributions. Tied values within a subject are handled using the average rank method. All results match R output to at least 4 decimal places.
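To make the average rank method concrete: tied values share the mean of the rank positions they would otherwise occupy. The short sketch below illustrates the idea for a single subject; `average_ranks` is a hypothetical helper written for this illustration and is not StatMate's code.

```python
def average_ranks(values):
    """Rank one subject's values, giving ties the mean of the ranks they span."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1          # first 1-based position of v
        last = first + ordered.count(v) - 1   # last 1-based position of v
        ranks.append((first + last) / 2)
    return ranks

print(average_ranks([3, 5, 3]))  # [1.5, 3.0, 1.5]
```

Here the two tied scores would occupy ranks 1 and 2, so each receives 1.5, and the remaining score keeps rank 3.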