Calculates the mean, median, mode, standard deviation, skewness, kurtosis, quartiles, and more. Results are displayed in APA 7th edition format.
Descriptive statistics summarize and organize the characteristics of a dataset, providing simple yet powerful summaries about the sample and its measures. They form the foundation of virtually every quantitative analysis in social science, psychology, medicine, education, and business research. Before running any inferential test such as a t-test, ANOVA, or regression, researchers must first describe their data to understand its central tendency, variability, and distributional shape.
Descriptive statistics serve three critical purposes in research: (1) they help detect data entry errors and outliers before analysis, (2) they verify whether the assumptions required by inferential tests are met (e.g., normality), and (3) they communicate the basic properties of your data to readers. The APA Publication Manual (7th edition) requires researchers to report descriptive statistics for all primary study variables, making them an indispensable part of any results section.
A professor collected final exam scores from 20 students in an introductory psychology course. The goal is to describe the distribution of scores before comparing them with another section.
Raw Data (n = 20)
62, 65, 68, 70, 72, 73, 75, 76, 77, 78, 78, 79, 80, 81, 82, 83, 85, 88, 90, 92
Central Tendency
M = 76.50
Mdn = 77.00
Mode = 78
Variability
SD = 8.23
Variance = 67.74
Range = 30 (62–92)
IQR = 11.25
Distribution Shape
Skewness = −0.34
Kurtosis = −0.67
Approximately normal distribution with a slight negative skew
95% Confidence Interval for the Mean
95% CI [72.65, 80.35]
We are 95% confident that the true population mean exam score falls between 72.65 and 80.35.
Central tendency describes the "typical" value in a dataset. The three main measures each have distinct strengths, and choosing the right one depends on your data's distribution and measurement scale.
| Measure | Definition | Best Used When |
|---|---|---|
| Mean (M) | Sum of all values divided by n | Data are approximately symmetric (normal) with no extreme outliers |
| Median (Mdn) | Middle value when data are sorted | Data are skewed or contain outliers (e.g., income, reaction time) |
| Mode | Most frequently occurring value | Nominal or categorical data, or to identify peaks in a distribution |
Guidance for Skewed Data
When data are positively skewed (long right tail), the mean is pulled above the median. When data are negatively skewed (long left tail), the mean is pulled below the median. In either case the median is usually the better primary measure. A practical rule: if the mean and median differ by more than 10% of the standard deviation, consider reporting the median instead of the mean, and pair it with the IQR rather than the SD.
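The 10%-of-SD rule above can be sketched in a few lines of Python. This is a minimal illustration, not StatMate's implementation: the function name `summarize_center`, the `threshold` parameter, and the reaction-time values are all hypothetical and chosen only to show a right-skewed case.

```python
import statistics

def summarize_center(values, threshold=0.10):
    """Return mean, median, mode(s), and whether the 10%-of-SD rule favors the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)          # sample SD (divides by n - 1)
    modes = statistics.multimode(values)   # every value tied for most frequent
    gap = abs(mean - median)
    return {
        "mean": mean,
        "median": median,
        "modes": modes,
        "prefer_median": gap > threshold * sd,
    }

# Hypothetical right-skewed reaction times in ms (illustrative, not the exam data)
reaction_times = [310, 320, 325, 330, 330, 340, 350, 360, 420, 915]
center = summarize_center(reaction_times)
print(center)
```

In this made-up sample the single slow response (915 ms) pulls the mean to 400 ms while the median stays at 335 ms, so the rule flags the median as the better summary.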
Variability (or dispersion) describes how spread out the data points are around the center. Two datasets can have the same mean but vastly different variability, so reporting spread is just as important as reporting the center.
Standard Deviation (SD)
A measure of how far data points typically lie from the mean, expressed in the original units of measurement. Technically it is the square root of the average squared deviation from the mean, not the plain average distance. An SD of 8.23 points on an exam means scores typically fall about 8 points above or below the mean. This is the most commonly reported measure of spread in APA-style research.
Variance (SD²)
The squared standard deviation. While variance is essential in calculations (e.g., ANOVA partitions variance), it is harder to interpret directly because its units are squared. A variance of 67.74 means little on its own, but its square root (SD = 8.23) is meaningful.
Range
The difference between the maximum and minimum values (92 − 62 = 30). The range is easy to compute but highly sensitive to outliers — a single extreme value can dramatically inflate it.
Interquartile Range (IQR)
The range of the middle 50% of values (Q3 − Q1). The IQR is robust to outliers and is the preferred spread measure when reporting the median. In our example, IQR = 11.25, meaning the central half of exam scores spans about 11 points.
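The four spread measures can be computed with Python's standard `statistics` module, as in the sketch below. The dataset is hypothetical and kept small so the arithmetic can be checked by hand. One point worth seeing in code: quartiles have more than one accepted definition, so the IQR can differ between tools. The two `method` values shown correspond to the same distinction as Excel's QUARTILE.INC and QUARTILE.EXC; which convention a given calculator uses is an assumption to verify, not something this sketch settles.

```python
import statistics

# Hypothetical scores (illustrative only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.variance(scores)   # sample variance, divides by n - 1
sd = statistics.stdev(scores)            # square root of the sample variance
value_range = max(scores) - min(scores)

def iqr(values, method):
    """Q3 - Q1 under the given statistics.quantiles interpolation method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method=method)
    return q3 - q1

iqr_inclusive = iqr(scores, "inclusive")  # treats min and max as the 0th and 100th percentiles
iqr_exclusive = iqr(scores, "exclusive")  # the module's default method

print(f"variance = {variance:.2f}, SD = {sd:.2f}, range = {value_range}")
print(f"IQR (inclusive) = {iqr_inclusive}, IQR (exclusive) = {iqr_exclusive}")
```

For these eight values the two conventions give IQRs of 1.5 and 2.5, which is why it helps to state the quartile method when comparing output across software.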
Skewness and kurtosis quantify the shape of a distribution and are critical for checking the normality assumption required by many parametric tests (t-tests, ANOVA, regression). Understanding these measures helps you decide whether to use parametric or non-parametric methods.
| Measure | Value | Interpretation |
|---|---|---|
| Skewness | ≈ 0 | Symmetric distribution (normal) |
| Skewness | > 0 (positive) | Right tail is longer; most values cluster on the left (e.g., income data) |
| Skewness | < 0 (negative) | Left tail is longer; most values cluster on the right (e.g., easy exam scores) |
| Kurtosis (excess) | ≈ 0 | Mesokurtic: tails similar to a normal distribution |
| Kurtosis (excess) | > 0 (positive) | Leptokurtic: heavier tails, more outliers than normal |
| Kurtosis (excess) | < 0 (negative) | Platykurtic: lighter tails, fewer outliers than normal |
Normality Rule of Thumb
Skewness and kurtosis values between −2 and +2 are generally considered acceptable for assuming normality (George & Mallery, 2019). Some stricter criteria use −1 to +1. In our example, skewness = −0.34 and kurtosis = −0.67, both well within the acceptable range, so the distribution can be treated as approximately normal. The rule is a screening heuristic rather than a formal test of normality.
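For readers who want to see what sits behind these two numbers, here is a minimal sketch of the adjusted Fisher-Pearson (type 2) sample skewness and excess kurtosis, the definitions also used by Excel's SKEW and KURT functions. The function names and the `within_rule` helper are hypothetical, and the code assumes the data are not all identical (the SD would be zero).

```python
import math

def _mean_and_sd(values):
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))  # sample SD
    return mean, sd

def sample_skewness(values):
    """Adjusted Fisher-Pearson (type 2) skewness."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean, sd = _mean_and_sd(values)
    cubed = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def sample_excess_kurtosis(values):
    """Type 2 excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least 4 observations")
    mean, sd = _mean_and_sd(values)
    fourth = sum(((x - mean) / sd) ** 4 for x in values)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * fourth - correction

def within_rule(value, limit=2.0):
    """True if the statistic falls inside the +/- limit rule of thumb."""
    return -limit <= value <= limit

# The reported example values both pass the +/-2 screen
print(within_rule(-0.34), within_rule(-0.67))
```

Passing `limit=1.0` applies the stricter criterion mentioned above.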
The 95% confidence interval (CI) for the mean provides a range of plausible values for the true population mean. In our example, the 95% CI [72.65, 80.35] means that if we were to repeat this study many times and compute a CI each time, approximately 95% of those intervals would contain the true population mean.
What the CI does mean
We have 95% confidence that the procedure used to construct this interval captures the true population mean. The width of the interval (80.35 − 72.65 = 7.70) reflects the precision of our estimate — narrower intervals indicate more precise estimates.
What the CI does not mean
It does not mean there is a 95% probability that the population mean lies within this specific interval. The population mean is a fixed value — it either is or is not in this interval. The 95% refers to the long-run frequency of the method, not the probability for any single interval.
The CI width depends on three factors: sample size (larger n = narrower CI), variability (smaller SD = narrower CI), and confidence level (a 99% CI is wider than a 95% CI). Because the width shrinks with the square root of n, halving it requires roughly quadrupling the sample size.
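The interval in the worked example can be reproduced from the summary statistics alone, using mean ± t × SD / √n. The sketch below is an illustration under one stated assumption: the critical value 2.093 (two-tailed α = .05, df = 19) is taken from a standard t table rather than computed, and the function name `mean_ci` is hypothetical.

```python
import math

def mean_ci(mean, sd, n, t_crit):
    """Confidence interval for a mean: mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Summary values from the worked example; the critical t for df = 19 is
# an assumed table value, not computed by this sketch.
T_CRIT_DF19 = 2.093
lower, upper = mean_ci(76.50, 8.23, 20, T_CRIT_DF19)
print(f"95% CI [{lower:.2f}, {upper:.2f}]")

# Holding the critical value fixed isolates the sqrt(n) effect of sample size.
width_n20 = upper - lower
lower_80, upper_80 = mean_ci(76.50, 8.23, 80, T_CRIT_DF19)
width_n80 = upper_80 - lower_80
print(f"width ratio (n = 80 vs n = 20): {width_n80 / width_n20:.2f}")
```

With the critical value held constant, quadrupling n halves the width exactly. In practice the critical t also shrinks slightly as the degrees of freedom grow, so the real interval narrows a little more than that.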
The APA 7th edition requires descriptive statistics for all primary variables, typically presented in a table or in-text. Here are templates using the worked example above:
In-Text Reporting (Normal Distribution)
Exam scores were approximately normally distributed (skewness = −0.34, kurtosis = −0.67). Students scored an average of 76.50 points (SD = 8.23), with a 95% CI [72.65, 80.35].
In-Text Reporting (Skewed Distribution)
Response times were positively skewed (skewness = 1.42); therefore, the median is reported. The median response time was 340 ms (IQR = 120).
APA Table Format Template
| Variable | n | M | SD | Mdn | Skewness | Kurtosis |
|---|---|---|---|---|---|---|
| Exam scores | 20 | 76.50 | 8.23 | 77.00 | −0.34 | −0.67 |
Note: Report all descriptive statistics to two decimal places. Use italics for statistical symbols (M, SD, Mdn). When data are non-normal, report median and IQR instead of mean and SD. Always report sample size (n or N) alongside descriptive statistics.
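If you assemble these strings programmatically, a small formatter keeps the two-decimal rule consistent. This is a hypothetical helper, not part of any APA tooling: the names `apa_number` and `apa_descriptives` are invented here, and italics for M and SD still have to be applied in the word processor because plain strings cannot carry them.

```python
def apa_number(value, decimals=2):
    """Format a number to fixed decimals, using a true minus sign for negatives."""
    text = f"{value:.{decimals}f}"
    return text.replace("-", "\u2212")

def apa_descriptives(mean, sd, ci_lower, ci_upper):
    """Build the in-text summary used in the normal-distribution template."""
    return (
        f"M = {apa_number(mean)}, SD = {apa_number(sd)}, "
        f"95% CI [{apa_number(ci_lower)}, {apa_number(ci_upper)}]"
    )

# Values from the worked example above
report = apa_descriptives(76.50, 8.23, 72.65, 80.35)
print(report)
print(f"skewness = {apa_number(-0.34)}")
```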
StatMate's descriptive statistics calculations have been validated against R's psych::describe() function and SPSS Descriptives output. All measures, including mean, SD, skewness (type 2 / sample), kurtosis (excess, type 2), quartiles, and confidence intervals, match R and SPSS output to at least 4 decimal places. The calculator uses the sample standard deviation formula (dividing by n − 1) and the adjusted Fisher-Pearson coefficient for skewness and kurtosis. These type 2 definitions are the SPSS default; psych::describe() defaults to type 3, so a comparison against it requires setting type = 2.