Why Logistic Regression Reporting Is Challenging
Logistic regression is one of the most widely used statistical techniques in health sciences, psychology, and education research. Unlike a t-test or ANOVA, where you report a single test statistic and an effect size, logistic regression produces a complex array of output that must be organized into a coherent narrative.
A complete logistic regression write-up involves overall model fit statistics, pseudo R-squared values, classification accuracy, and individual predictor statistics including B coefficients, standard errors, Wald statistics, p values, odds ratios, and confidence intervals. Missing any one of these elements is a common reason for revision requests from journal reviewers.
This guide walks through every component step by step, with concrete APA 7th edition examples you can adapt for your own manuscript.
The Key Statistics to Report
Before diving into the reporting templates, here is a summary of the statistics that belong in a logistic regression write-up.
Overall model fit:
- Omnibus chi-square test (model chi-square)
- Degrees of freedom and p value
- Pseudo R-squared values: Nagelkerke R² and Cox & Snell R²
Classification performance:
- Overall classification accuracy (percentage correct)
- Sensitivity (true positive rate) and specificity (true negative rate)
Individual predictors:
- B (unstandardized logistic regression coefficient)
- SE (standard error of B)
- Wald chi-square statistic
- p value
- OR (odds ratio, also labeled Exp(B) in SPSS output)
- 95% CI for the odds ratio
Each of these plays a distinct role. The model fit statistics tell readers whether the set of predictors collectively distinguishes between the two outcome groups. The classification table shows how well the model performs in practice. The individual predictor statistics reveal which variables drive the prediction and by how much.
Step 1: Report Overall Model Fit
The omnibus test of model coefficients evaluates whether the full model with predictors fits significantly better than a null model with only the intercept. This is reported as a chi-square test.
APA template:
A binary logistic regression was performed to examine the effects of [predictors] on [outcome]. The overall model was statistically significant, chi-square(df) = X.XX, p = .XXX, indicating that the predictors, as a set, reliably distinguished between [group 1] and [group 2]. The model explained XX.X% (Nagelkerke R²) of the variance in [outcome].
Example:
A binary logistic regression was performed to examine the effects of GPA, weekly study hours, and class attendance on graduation status. The overall model was statistically significant, chi-square(3) = 34.72, p < .001, indicating that the predictors, as a set, reliably distinguished between students who graduated and those who did not. The model explained 31.5% (Nagelkerke R²) of the variance in graduation status.
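The model chi-square is simply twice the difference between the log-likelihoods of the full model and the intercept-only (null) model, with degrees of freedom equal to the number of predictors. A minimal Python sketch, using hypothetical log-likelihood values chosen to reproduce the chi-square in the example above:

```python
# Hypothetical log-likelihoods (chosen to match the worked example)
ll_null = -81.00   # intercept-only model
ll_full = -63.64   # model with GPA, study hours, and attendance

model_chi_sq = 2 * (ll_full - ll_null)  # the "-2LL difference"
df = 3  # one degree of freedom per predictor
print(f"chi-square({df}) = {model_chi_sq:.2f}")
```

SPSS reports this value in the Omnibus Tests of Model Coefficients table; R users can obtain the same quantity by comparing the null and residual deviances of a fitted `glm` object.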
Understanding Pseudo R-Squared
Unlike ordinary least squares regression, logistic regression does not produce a true R² value. Instead, it offers pseudo R-squared measures that approximate the proportion of variance explained.
| Measure | Range | Notes |
|---------|-------|-------|
| Cox & Snell R² | 0 to < 1 | Cannot reach 1.0; tends to underestimate |
| Nagelkerke R² | 0 to 1 | Adjusted version of Cox & Snell; preferred for reporting |
Most APA-style articles report Nagelkerke R² because its upper bound is 1.0, making it more interpretable. Some researchers report both values for transparency. Either approach is acceptable, but always label which pseudo R-squared you are using.
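Both pseudo R-squared values can be computed directly from the null and full model log-likelihoods: Cox & Snell is 1 − exp((2/n)(LL_null − LL_full)), and Nagelkerke rescales it by its maximum attainable value. A Python sketch (the log-likelihood values are hypothetical, chosen to reproduce the Nagelkerke value in the example above):

```python
import math

def pseudo_r_squared(ll_null, ll_full, n):
    """Cox & Snell and Nagelkerke pseudo R-squared from
    the null and full model log-likelihoods."""
    cox_snell = 1 - math.exp((2 / n) * (ll_null - ll_full))
    max_cs = 1 - math.exp((2 / n) * ll_null)  # upper bound of Cox & Snell
    nagelkerke = cox_snell / max_cs
    return cox_snell, nagelkerke

# Hypothetical log-likelihoods for a sample of n = 147
cs, nk = pseudo_r_squared(ll_null=-81.00, ll_full=-63.64, n=147)
print(f"Cox & Snell = {cs:.3f}, Nagelkerke = {nk:.3f}")
```

Note that Nagelkerke is always at least as large as Cox & Snell, which is one reason to label clearly which measure you report.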
Step 2: Report the Classification Table
The classification table summarizes how accurately the model assigns cases to the correct outcome group. This is a practical measure of model performance that complements the statistical significance tests.
APA template:
The model correctly classified XX.X% of cases overall, with a sensitivity of XX.X% (correctly predicted [positive outcome]) and a specificity of XX.X% (correctly predicted [negative outcome]).
Example:
The model correctly classified 77.6% of cases overall, with a sensitivity of 82.1% (correctly predicted graduation) and a specificity of 71.4% (correctly predicted non-graduation).
Classification Table Format
| Observed | Predicted: Did Not Graduate | Predicted: Graduated | Percentage Correct |
|----------|-----------------------------|----------------------|--------------------|
| Did Not Graduate | 45 | 18 | 71.4% |
| Graduated | 15 | 69 | 82.1% |
| Overall | | | 77.6% |
When interpreting classification accuracy, consider the base rate. If 70% of students in your sample graduated, a model that simply predicts everyone graduates would achieve 70% accuracy with no predictors. Your model's accuracy should be evaluated against this baseline.
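All of these classification statistics, including the base-rate comparison just described, follow directly from the four cells of the table. A Python sketch using the counts from the example:

```python
# Counts from the classification table above
tn, fp = 45, 18   # did not graduate: correctly / incorrectly classified
fn, tp = 15, 69   # graduated: incorrectly / correctly classified

n = tn + fp + fn + tp
accuracy = (tp + tn) / n
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
base_rate = (tp + fn) / n   # accuracy of always predicting "graduated"

print(f"Accuracy:    {accuracy:.1%}")
print(f"Sensitivity: {sensitivity:.1%}")
print(f"Specificity: {specificity:.1%}")
print(f"Base rate:   {base_rate:.1%}")
```

Here the base rate is well below the model's accuracy, so the predictors genuinely improve classification rather than merely echoing the majority outcome.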
Step 3: Report Individual Predictors
Individual predictor results are best presented in a table followed by a narrative description. The APA-formatted table for logistic regression includes columns for B, SE, Wald chi-square, p, odds ratio (OR), and the 95% confidence interval for the odds ratio.
Logistic Regression Coefficients Table
| Predictor | B | SE | Wald chi-square | p | OR | 95% CI for OR |
|-----------|------|------|-----------------|------|------|---------------|
| (Constant) | -8.42 | 2.15 | 15.33 | < .001 | -- | -- |
| GPA | 1.63 | 0.52 | 9.82 | .002 | 5.10 | [1.84, 14.15] |
| Study hours | 0.18 | 0.07 | 6.61 | .010 | 1.20 | [1.04, 1.37] |
| Attendance (%) | 0.04 | 0.02 | 4.00 | .045 | 1.04 | [1.00, 1.08] |
Writing Up the Individual Predictors
GPA was a significant predictor of graduation status, B = 1.63, SE = 0.52, Wald chi-square(1) = 9.82, p = .002, OR = 5.10, 95% CI [1.84, 14.15]. For each one-point increase in GPA, the odds of graduating were approximately 5.10 times greater. Weekly study hours also significantly predicted graduation, B = 0.18, SE = 0.07, Wald chi-square(1) = 6.61, p = .010, OR = 1.20, 95% CI [1.04, 1.37]. Each additional hour of weekly study was associated with a 20% increase in the odds of graduating. Class attendance percentage was a weaker but still statistically significant predictor, B = 0.04, SE = 0.02, Wald chi-square(1) = 4.00, p = .045, OR = 1.04, 95% CI [1.00, 1.08].
Note that the Wald chi-square has 1 degree of freedom for each individual predictor (assuming the predictor is not a multi-category variable with dummy coding). Always include the degrees of freedom in parentheses.
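The per-predictor statistics follow directly from B and its standard error: the Wald chi-square is (B/SE)², its p value comes from a chi-square distribution with 1 df, and OR = e^B. A sketch verifying the GPA row of the table (tiny discrepancies from published output reflect rounding of B and SE):

```python
import math

def wald_stats(b, se):
    """Wald chi-square (1 df), two-sided p value, and odds ratio
    for a single logistic regression coefficient."""
    wald = (b / se) ** 2
    # Survival function of chi-square with 1 df, via the error function
    p = math.erfc(math.sqrt(wald / 2))
    odds_ratio = math.exp(b)
    return wald, p, odds_ratio

# GPA row from the coefficients table above
wald, p, or_ = wald_stats(b=1.63, se=0.52)
print(f"Wald chi-square(1) = {wald:.2f}, p = {p:.3f}, OR = {or_:.2f}")
```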
Interpreting Odds Ratios
The odds ratio (OR) is the primary effect size measure in logistic regression. It tells you how the odds of the outcome change with a one-unit increase in the predictor, holding all other predictors constant.
Odds Ratio Reference Guide
| OR Value | Interpretation |
|----------|----------------|
| OR = 1.00 | No effect; predictor does not change the odds |
| OR > 1.00 | Increased odds of the outcome |
| OR < 1.00 | Decreased odds of the outcome |
Concrete example: An OR of 2.45 means that for a one-unit increase in the predictor, the odds of the outcome occurring are 2.45 times greater (or 145% higher). An OR of 0.60 means that for a one-unit increase, the odds decrease by 40% (calculated as 1 - 0.60 = 0.40).
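The percent-change reading of an odds ratio is simply (OR − 1) × 100. A quick check of the two values above:

```python
def pct_change_in_odds(odds_ratio):
    """Percent change in the odds per one-unit increase in the predictor."""
    return (odds_ratio - 1) * 100

increase = pct_change_in_odds(2.45)  # positive: odds go up
decrease = pct_change_in_odds(0.60)  # negative: odds go down
print(f"{increase:+.0f}%  {decrease:+.0f}%")
```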
Continuous vs. Categorical Predictors
For continuous predictors, the OR reflects a one-unit change on the original measurement scale. If study hours is measured in hours per week, OR = 1.20 means each additional hour increases the odds by 20%. Be mindful of the unit: if the variable were measured in minutes, the OR would be much smaller per unit and harder to interpret. Consider rescaling continuous predictors (e.g., per 10-hour increase) if the per-unit OR is very close to 1.00.
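Rescaling works multiplicatively on the log-odds scale: the OR for a k-unit increase is e^(kB), which equals the per-unit OR raised to the kth power. A sketch using the study-hours coefficient from the running example:

```python
import math

# Study hours: B = 0.18 per hour (from the coefficients table)
b_per_hour = 0.18
or_per_hour = math.exp(b_per_hour)
or_per_10_hours = math.exp(10 * b_per_hour)  # equivalently or_per_hour ** 10

print(f"OR per hour:     {or_per_hour:.2f}")
print(f"OR per 10 hours: {or_per_10_hours:.2f}")
```

A per-unit OR of 1.20 becomes roughly 6 per 10-hour increase, which can be a far more communicative framing for predictors with small per-unit effects.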
For categorical predictors (dummy coded), the OR compares the odds of the outcome in the coded group to the reference group. If treatment group (coded 1) versus control group (coded 0) has OR = 3.20, the treatment group has 3.20 times the odds of the outcome compared to the control group.
Important Distinction: Odds Are Not Probabilities
A common error is interpreting an odds ratio as a probability ratio. Saying "patients in the treatment group were 3.20 times more likely to recover" is technically imprecise. The correct phrasing is "the odds of recovery were 3.20 times greater in the treatment group." When the outcome is rare (less than 10% prevalence), the odds ratio approximates the risk ratio, but for common outcomes, the two diverge substantially.
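The divergence for common outcomes is easy to demonstrate numerically. A sketch with hypothetical recovery probabilities (0.40 in the control group, 0.68 in the treatment group):

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

# Hypothetical recovery probabilities
p_control, p_treatment = 0.40, 0.68

risk_ratio = p_treatment / p_control
odds_ratio = odds(p_treatment) / odds(p_control)
print(f"Risk ratio = {risk_ratio:.2f}, odds ratio = {odds_ratio:.2f}")
```

Here the odds ratio (about 3.19) is nearly double the risk ratio (1.70), so describing this OR as "3 times more likely" would substantially overstate the effect.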
Complete APA Reporting Example
Below is a full write-up combining all elements from the previous sections. This example uses the graduation prediction scenario with three predictors.
A binary logistic regression was conducted to predict graduation status (graduated vs. did not graduate) from GPA, weekly study hours, and class attendance percentage among 147 undergraduate students. The overall model was statistically significant, chi-square(3) = 34.72, p < .001, Nagelkerke R² = .32, indicating that the set of predictors reliably distinguished between students who graduated and those who did not. The model correctly classified 77.6% of cases, with a sensitivity of 82.1% and a specificity of 71.4%.
GPA was the strongest predictor of graduation, B = 1.63, SE = 0.52, Wald chi-square(1) = 9.82, p = .002, OR = 5.10, 95% CI [1.84, 14.15]. For each one-point increase in GPA, the odds of graduating were approximately five times greater. Weekly study hours also significantly predicted graduation, B = 0.18, SE = 0.07, Wald chi-square(1) = 6.61, p = .010, OR = 1.20, 95% CI [1.04, 1.37], with each additional hour associated with a 20% increase in odds. Class attendance made a smaller but statistically significant contribution, B = 0.04, SE = 0.02, Wald chi-square(1) = 4.00, p = .045, OR = 1.04, 95% CI [1.00, 1.08].
This example follows a clear structure: state the analysis and sample, report overall model fit and classification accuracy, then describe each predictor with its full set of statistics. Reviewers can quickly locate every required piece of information.
Common Mistakes to Avoid
Reporting B Coefficients Without Odds Ratios
The B coefficient in logistic regression is a log-odds value, which is not intuitively interpretable. Always convert B to the odds ratio (OR = e^B) and report both. The odds ratio is the effect size that readers and reviewers expect to see.
Omitting Confidence Intervals for Odds Ratios
The 95% confidence interval for the OR is essential. It communicates the precision of the odds ratio estimate and reveals whether the effect could be trivially small or substantially large. An OR of 2.50 with a 95% CI of [0.85, 7.35] crosses 1.00 and would not be statistically significant, whereas an OR of 2.50 with a CI of [1.40, 4.46] provides much stronger evidence.
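The CI for the OR is obtained by exponentiating the CI for B, which is why it is asymmetric around the OR. A sketch reproducing (up to rounding of B and SE) the GPA row from the earlier coefficients table:

```python
import math

def or_ci(b, se, z=1.96):
    """95% CI for an odds ratio: exponentiate the Wald CI for B."""
    return math.exp(b - z * se), math.exp(b + z * se)

lo, hi = or_ci(b=1.63, se=0.52)
print(f"OR = {math.exp(1.63):.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

An interval that excludes 1.00 on both sides corresponds to a significant predictor at the .05 level, so the CI and the p value should tell a consistent story.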
Confusing Odds Ratios With Probabilities
As noted above, odds and probabilities are mathematically different. An OR of 3.00 does not mean the outcome is "three times as likely." It means the odds are three times greater. This distinction matters most when the outcome prevalence is high.
Not Reporting Model Fit Statistics
Some researchers jump directly to individual predictors without reporting the omnibus model test, Nagelkerke R², or classification accuracy. Without these, readers cannot evaluate whether the overall model is meaningful before examining individual effects.
Using R-Squared Instead of Pseudo R-Squared
Logistic regression does not produce a traditional R² value. Reporting R² = .32 without specifying that it is Nagelkerke R² (or Cox & Snell R²) is misleading. Always label the type of pseudo R-squared you are reporting.
Ignoring the Hosmer-Lemeshow Test
The Hosmer-Lemeshow goodness-of-fit test evaluates whether the model fits the data well. A non-significant result (p > .05) indicates adequate fit. While not always required, reporting this test strengthens your write-up, especially when reviewers are concerned about model calibration.
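Statistical packages compute Hosmer-Lemeshow automatically, but the statistic itself is straightforward: sort cases by predicted probability, split them into (typically ten) groups, and sum (observed − expected)² / (expected × (1 − expected/n)) over the groups. A minimal Python sketch with simulated, well-calibrated data (in practice, the p value comes from a chi-square distribution with groups − 2 df):

```python
import random

def hosmer_lemeshow(probs, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic over groups of cases
    sorted by predicted probability; df = groups - 2."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        n_g = len(chunk)
        if 0 < expected < n_g:                # guard against degenerate cells
            stat += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
    return stat, groups - 2

# Simulated data where the predicted probabilities are well calibrated
random.seed(0)
probs = [random.uniform(0.05, 0.95) for _ in range(200)]
outcomes = [1 if random.random() < p else 0 for p in probs]
stat, df = hosmer_lemeshow(probs, outcomes)
print(f"Hosmer-Lemeshow chi-square({df}) = {stat:.2f}")
```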
Logistic Regression APA Checklist
Before submitting your manuscript, verify that your logistic regression results include:
- The type of logistic regression performed (binary, multinomial, ordinal)
- Sample size and outcome group frequencies
- Overall model chi-square with degrees of freedom and p value
- Pseudo R-squared (Nagelkerke R² preferred) clearly labeled
- Classification accuracy (overall percentage, sensitivity, specificity)
- A coefficients table with B, SE, Wald chi-square, p, OR, and 95% CI for OR
- The intercept (constant) row in the coefficients table
- Narrative interpretation of odds ratios for significant predictors
- Odds ratios interpreted correctly (not as probabilities)
- Statistical symbols in italics (B, SE, p, OR, R²); Greek symbols such as χ² are not italicized
- Exact p values (or p < .001 for very small values)
- Assumption checks mentioned (linearity of logit, multicollinearity, outliers)
Try StatMate's Free Logistic Regression Calculator
Assembling all these statistics from SPSS or R output and formatting them correctly is time-consuming and error-prone. StatMate's logistic regression calculator handles the entire process automatically.
Enter your binary outcome and predictor variables, and StatMate computes the omnibus model test, Nagelkerke R², classification table, and individual predictor statistics including B, SE, Wald chi-square, p, odds ratios, and 95% confidence intervals. The results are formatted in APA 7th edition style, ready to copy directly into your manuscript.
The calculator also generates an odds ratio forest plot that visually displays each predictor's OR with its confidence interval, making it easy to identify which predictors have the strongest effects and whether their intervals cross 1.00. You can export the complete results to Word with one click.
By letting StatMate handle the calculations and formatting, you eliminate common errors like missing confidence intervals, unlabeled pseudo R-squared values, or incorrectly computed odds ratios, and focus your time on interpreting your findings.