Predict an outcome variable from multiple predictor variables using OLS regression. Results include R², coefficients with VIF, an ANOVA table, and APA-formatted output.
Multiple regression analysis is a statistical technique used to examine the relationship between two or more independent variables (predictors) and a single continuous dependent variable (outcome). While simple regression models the effect of a single predictor, multiple regression incorporates several predictors simultaneously—allowing researchers to assess each variable's unique contribution while controlling for the others. The method uses Ordinary Least Squares (OLS) estimation, which finds the set of coefficients that minimizes the sum of squared residuals between observed and predicted values.
The general equation is Y = b0 + b1X1 + b2X2 + … + bkXk + e, where b0 is the intercept, b1…bk are the unstandardized regression coefficients, and e is the residual error. Multiple regression is appropriate when you want to predict an outcome from multiple factors, understand the relative importance of different predictors, or estimate a variable's effect while holding others constant.
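To make the OLS estimation concrete, here is a minimal sketch in Python with NumPy. The five-row dataset and variable names are invented for illustration; this is not StatMate's implementation.

```python
import numpy as np

# Illustrative data: n = 5 observations, k = 2 predictors (X1, X2).
X = np.array([[2.0, 7.0],
              [4.0, 6.5],
              [6.0, 8.0],
              [8.0, 5.5],
              [10.0, 7.5]])
y = np.array([2.1, 2.6, 3.4, 3.5, 4.2])

# Prepend a column of ones so the first coefficient is the intercept b0.
X_design = np.column_stack([np.ones(len(X)), X])

# OLS: choose b to minimize the sum of squared residuals ||y - Xb||^2.
b, *_ = np.linalg.lstsq(X_design, y, rcond=None)

residuals = y - X_design @ b
print("b0, b1, b2:", b)
print("sum of squared residuals:", residuals @ residuals)
```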
R² and Adjusted R²
R² represents the proportion of variance in the dependent variable explained by the model. However, it never decreases when predictors are added, even irrelevant ones. Adjusted R² penalizes for the number of predictors, making it more suitable for comparing models with different numbers of variables.
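Both statistics follow directly from the residual and total sums of squares. A minimal sketch using the standard formulas (n observations, k predictors); the function names are illustrative:

```python
import numpy as np

def r_squared(y, y_pred):
    """R^2 = 1 - SS_res / SS_tot: the share of variance explained."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1).
    The penalty grows with the number of predictors k."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
```

With these definitions, adding a useless predictor raises R² slightly but typically lowers adjusted R², which is why the latter is preferred for model comparison.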
F-Test and Individual t-Tests
The F-test evaluates whether the overall model is significant (i.e., whether at least one predictor has a non-zero effect). Individual t-tests then assess each predictor's unique contribution while controlling for all other predictors in the model.
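As a sketch of the mechanics: the overall F statistic can be computed from R², and each t statistic divides a coefficient by its standard error. SciPy supplies the distributions here (StatMate itself uses jstat); the function names are illustrative.

```python
import numpy as np
from scipy import stats

def overall_f_test(r2, n, k):
    """F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), df = (k, n - k - 1)."""
    f = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return f, stats.f.sf(f, k, n - k - 1)

def coefficient_t_tests(X_design, y, b):
    """t_j = b_j / SE(b_j), with SE^2 taken from the diagonal of
    sigma^2 * (X'X)^-1."""
    n, p = X_design.shape                       # p = k + 1 (intercept included)
    resid = y - X_design @ b
    sigma2 = (resid @ resid) / (n - p)          # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X_design.T @ X_design)))
    t = b / se
    return t, 2 * stats.t.sf(np.abs(t), n - p)  # two-tailed p-values
```

For instance, overall_f_test(0.72, 30, 3) returns F ≈ 22.29, matching the worked example below.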
Standardized Coefficients (β) and VIF
Unstandardized coefficients (B) are interpreted in the original units. Standardized coefficients (β) allow direct comparison of relative predictor importance. The Variance Inflation Factor (VIF) detects multicollinearity—values above 10 indicate problematic collinearity that inflates standard errors and destabilizes coefficient estimates.
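To show where VIF comes from, here is a minimal sketch: each predictor is regressed on the remaining predictors, and VIF_j = 1 / (1 − R_j²). The function name is illustrative.

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    predictor j on all remaining predictors (plus an intercept)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2_j = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2_j)
    return out
```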
Durbin-Watson Statistic
The Durbin-Watson statistic tests for autocorrelation in residuals, ranging from 0 to 4. Values near 2 indicate no autocorrelation. Values near 0 suggest positive autocorrelation; values near 4 suggest negative autocorrelation. The acceptable range is typically 1.5–2.5.
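The statistic itself is simple to compute; a minimal sketch, assuming the residuals are in observation order (e.g., by time):

```python
import numpy as np

def durbin_watson(residuals):
    """DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_t e_t^2.
    Near 2: no autocorrelation; near 0: positive; near 4: negative."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```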
| Method | Predictors | Outcome | Use Case |
|---|---|---|---|
| Simple Regression | 1 continuous | Continuous | Single predictor-outcome relationship |
| Multiple Regression | 2+ continuous | Continuous | Simultaneous effects of multiple predictors |
| Logistic Regression | Continuous / categorical | Binary (0/1) | Predicting binary outcomes (pass/fail) |
| ANOVA | Categorical (groups) | Continuous | Comparing means across 3+ groups |
1. Linearity
The relationship between each predictor and the outcome must be linear. Check residual-vs-predicted plots for curvilinear patterns.
2. Independence
Observations must be independent. Verify with Durbin-Watson (1.5–2.5). Violations are common in time-series and clustered data.
3. Normality of Residuals
Residuals should be approximately normally distributed. Robust to violations with larger samples (N ≥ 30) due to the Central Limit Theorem.
4. Homoscedasticity
Residual variance should be constant across all predicted values. A "funnel shape" in residual plots indicates heteroscedasticity.
5. No Multicollinearity (VIF < 10)
Predictors should not be highly correlated with each other. Check VIF values and correlation matrices. High multicollinearity inflates standard errors and makes coefficient estimates unstable.
6. No Autocorrelation (Durbin-Watson ≈ 2)
Residuals should not be correlated with each other. Particularly important for time-series data. Use GLS or add lagged variables if violated. A combined sketch for checking these assumptions follows this list.
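The sketch below runs several of these checks in one pass using statsmodels and SciPy. The simulated dataset and its column names (study_hours, sleep_hours, attendance, gpa) are invented for illustration; this is not StatMate's own pipeline.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson
from scipy import stats

# Hypothetical data: 30 students with three predictors of GPA.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "study_hours": rng.uniform(5, 40, 30),
    "sleep_hours": rng.uniform(4, 9, 30),
    "attendance": rng.uniform(50, 100, 30),
})
df["gpa"] = (0.05 * df["study_hours"] + 0.10 * df["sleep_hours"]
             + 0.02 * df["attendance"] + rng.normal(0, 0.3, 30))

X = sm.add_constant(df[["study_hours", "sleep_hours", "attendance"]])
model = sm.OLS(df["gpa"], X).fit()
resid = model.resid

# Assumptions 1 and 4: plot resid vs. model.fittedvalues and look for
# curvature (nonlinearity) or a funnel shape (heteroscedasticity).

# Assumptions 2 and 6: Durbin-Watson, acceptable roughly 1.5-2.5.
print("Durbin-Watson:", durbin_watson(resid))

# Assumption 3: Shapiro-Wilk test for normality of residuals.
print("Shapiro-Wilk p:", stats.shapiro(resid).pvalue)

# Assumption 5: VIF per predictor (column 0 is the constant, so skip it).
for i, name in enumerate(X.columns[1:], start=1):
    print(name, "VIF:", variance_inflation_factor(X.values, i))
```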
Example
A multiple regression analysis was conducted to predict GPA from study hours, sleep hours, and attendance rate. The model was statistically significant, F(3, 26) = 22.29, p < .001, R² = .72, adjusted R² = .69, explaining approximately 72% of the variance in GPA. Study hours (B = 0.055, β = .49, t = 5.50, p < .001), attendance rate (B = 0.018, β = .33, t = 4.50, p < .001), and sleep hours (B = 0.112, β = .21, t = 2.95, p = .007) were all significant predictors.
StatMate's multiple regression calculations have been validated against R's lm() function and SPSS regression output. We use OLS estimation with the jstat library for F- and t-distributions. All coefficients, standard errors, t-statistics, p-values, R², adjusted R², F-statistics, VIF, and Durbin-Watson values match R and SPSS output to at least 4 decimal places.
Related Tools
- t-Test: Compare the means of two groups
- ANOVA: Compare means across three or more groups
- Chi-Square Test: Test associations between categorical variables
- Correlation Analysis: Measure the strength of a relationship
- Descriptive Statistics: Summarize data
- Sample Size: Power analysis and sampling plans
- One-Sample t-Test: Compare against a known value
- Mann-Whitney U: Nonparametric comparison of two groups
- Wilcoxon Test: Nonparametric paired test
- Regression Analysis: Model the X-Y relationship
- Cronbach's Alpha: Scale reliability
- Logistic Regression: Predict binary outcomes
- Factor Analysis: Explore latent factor structure
- Kruskal-Wallis: Nonparametric comparison of three or more groups
- Repeated Measures: Within-subjects ANOVA
- Two-Way ANOVA: Analysis of factorial designs
- Friedman Test: Nonparametric repeated measures
- Fisher's Exact Test: Exact test for 2×2 tables
- McNemar's Test: Test for paired nominal data