StatMate
APA Reporting · 17 min read · 2026-03-26

How to Report Logistic Regression in APA 7th Edition — Odds Ratio, Wald Test & Model Fit

Step-by-step guide to reporting binary logistic regression in APA 7th edition format. Odds ratios with confidence intervals, Wald chi-square, Nagelkerke R², classification accuracy, and copy-paste APA templates.

Why Logistic Regression Reporting Requires Extra Care

Logistic regression is the go-to method for predicting a binary outcome in fields ranging from psychology to epidemiology to education. Unlike a t-test or ANOVA, where you report one test statistic and one effect size, logistic regression produces a layered set of outputs: overall model fit, pseudo R-squared values, classification accuracy, and individual predictor statistics that include B coefficients, standard errors, Wald chi-square values, p values, odds ratios, and confidence intervals.

The reporting challenge is not just volume. Each statistic in logistic regression has a distinct interpretation that differs from its linear regression counterpart. The B coefficient is a log-odds value rather than a slope in original units. The pseudo R-squared is not the same quantity as the R-squared from ordinary least squares regression. And the odds ratio, the central effect size measure, is routinely misinterpreted as a probability ratio.

Missing any one element from a logistic regression write-up is among the most frequent reasons for revision requests. Misinterpreting odds ratios is another. This guide provides a complete, step-by-step approach to reporting binary logistic regression results in APA 7th edition format. Every template can be copied directly into your manuscript and adapted with your own numbers.

When to Use Logistic Regression

Binary Outcome Variable

Logistic regression is appropriate when your dependent variable has exactly two categories: yes/no, pass/fail, diagnosed/healthy, graduated/dropped out. If you find yourself coding a continuous outcome into two groups just to use logistic regression, reconsider whether linear regression or another technique is more appropriate.

The outcome variable does not need to be naturally dichotomous. Researchers sometimes dichotomize continuous outcomes (e.g., "high" versus "low" depression scores) to create a binary variable. This practice discards information and reduces statistical power, so it should be justified on substantive grounds rather than convenience.

Multiple Predictors (Continuous and Categorical)

Logistic regression handles both continuous predictors (age, GPA, income) and categorical predictors (gender, treatment group, education level) in the same model. Categorical predictors with more than two levels are entered as dummy-coded variables, with one category serving as the reference group.

This flexibility makes logistic regression one of the most widely used multivariate techniques. A single model can simultaneously examine whether GPA (continuous), study hours (continuous), and attendance status (categorical: regular versus irregular) predict graduation.

Comparison With Linear Regression

Linear regression predicts a continuous outcome and assumes normally distributed residuals with constant variance. When the outcome is binary, these assumptions are fundamentally violated. Logistic regression resolves this by modeling the log-odds of the outcome rather than the outcome itself, producing predicted probabilities bounded between 0 and 1.

The key differences in reporting:

| Feature | Linear Regression | Logistic Regression |
|---------|-------------------|---------------------|
| Outcome variable | Continuous | Binary (0/1) |
| Coefficient interpretation | Change in Y per unit X | Change in log-odds per unit X |
| Primary effect size | B or beta | Odds ratio (OR) |
| Model fit | R², adjusted R² | Nagelkerke R², classification accuracy |
| Overall model test | F-statistic | Chi-square (likelihood ratio) |
| Residual assumptions | Normality, homoscedasticity | Linearity of logit |

The Basic APA Format for Logistic Regression

Before walking through a full example, here is the template for each component.

Overall Model Fit

The overall model was statistically significant, chi-square(df) = X.XX, p = .XXX, Nagelkerke R² = .XX.

The chi-square here is the omnibus likelihood ratio test comparing the full model (with all predictors) to a null model (intercept only). Report the degrees of freedom (equal to the number of predictors), the chi-square value to two decimal places, and the exact p value.

Individual Predictors

B = X.XX, SE = X.XX, Wald chi-square(1) = X.XX, p = .XXX, OR = X.XX, 95% CI [X.XX, X.XX]

Each predictor receives its own line of statistics. The Wald chi-square has 1 degree of freedom for a single predictor (more for multi-level categorical variables). The odds ratio (OR) is the exponentiated B coefficient: OR = e^B. The 95% confidence interval is for the odds ratio, not for B.
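Because the odds ratio and its confidence interval are simple transformations of B and SE, you can sanity-check any software output by hand. A minimal Python sketch, using the GPA coefficient from the worked example below (tiny rounding differences from SPSS-style output are expected, since software exponentiates unrounded values):

```python
import math

def b_to_or(b, se, z=1.96):
    """Convert a log-odds coefficient (B) and its standard error to an
    odds ratio with a 95% CI. The interval is built on the log-odds
    scale, then exponentiated, so it is asymmetric around the OR."""
    odds_ratio = math.exp(b)
    lower = math.exp(b - z * se)
    upper = math.exp(b + z * se)
    return odds_ratio, lower, upper

# GPA predictor from the example below: B = 1.74, SE = 0.48
or_, lo, hi = b_to_or(1.74, 0.48)
print(f"OR = {or_:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
# OR = 5.70, 95% CI [2.22, 14.60]
```

Note that the interval is asymmetric: the upper limit sits much farther from the OR than the lower limit, which is normal for exponentiated estimates.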

Reporting Logistic Regression: Step by Step

Research Scenario

A university researcher examines whether three variables predict graduation status (graduated vs. did not graduate) among 200 undergraduate students:

  • GPA (continuous, 0.00 to 4.00 scale)
  • Weekly study hours (continuous, hours per week)
  • Class attendance (continuous, percentage 0-100)

The outcome variable is graduation status, coded as 1 = graduated and 0 = did not graduate. Of the 200 students, 128 (64%) graduated and 72 (36%) did not.

Step 1: Report Model Fit Statistics

The first paragraph of your results section establishes whether the model as a whole is meaningful.

APA example:

A binary logistic regression was conducted to examine whether GPA, weekly study hours, and class attendance predicted graduation status among 200 undergraduate students. The omnibus test of model coefficients indicated that the full model was statistically significant, chi-square(3) = 42.86, p < .001. The model explained 28.4% of the variance in graduation status (Nagelkerke R² = .284) and correctly classified 79.5% of cases, with a sensitivity of 84.4% (correctly identified graduates) and a specificity of 70.8% (correctly identified non-graduates). The Hosmer-Lemeshow test indicated adequate model fit, chi-square(8) = 5.73, p = .678.

This single paragraph covers four model fit indices: the omnibus chi-square, Nagelkerke R², classification accuracy (with sensitivity and specificity), and the Hosmer-Lemeshow test. Reviewers can immediately determine that the model is statistically significant and practically meaningful.

Step 2: Present the Coefficient Table

For three or more predictors, a table is clearer than embedding every number in the text.

| Predictor | B | SE | Wald chi-square | p | OR | 95% CI for OR |
|-----------|------|------|-----------------|------|------|---------------|
| (Constant) | -9.15 | 2.34 | 15.30 | < .001 | -- | -- |
| GPA | 1.74 | 0.48 | 13.14 | < .001 | 5.70 | [2.23, 14.57] |
| Study hours | 0.16 | 0.06 | 7.11 | .008 | 1.17 | [1.04, 1.32] |
| Attendance (%) | 0.04 | 0.02 | 4.00 | .046 | 1.04 | [1.00, 1.08] |

Note. Nagelkerke R² = .284. Model chi-square(3) = 42.86, p < .001. Classification accuracy = 79.5%.

Step 3: Write the APA Narrative

GPA was the strongest predictor of graduation status, B = 1.74, SE = 0.48, Wald chi-square(1) = 13.14, p < .001, OR = 5.70, 95% CI [2.23, 14.57]. For each one-point increase in GPA, the odds of graduating were approximately 5.70 times greater, holding other variables constant. Weekly study hours also significantly predicted graduation, B = 0.16, SE = 0.06, Wald chi-square(1) = 7.11, p = .008, OR = 1.17, 95% CI [1.04, 1.32]. Each additional hour of weekly study was associated with a 17% increase in the odds of graduating. Class attendance made a smaller but statistically significant contribution, B = 0.04, SE = 0.02, Wald chi-square(1) = 4.00, p = .046, OR = 1.04, 95% CI [1.00, 1.08].

This narrative follows a consistent pattern: state the predictor, give the full statistics, then interpret the odds ratio in plain language. Ordering predictors from strongest to weakest effect helps readers grasp the relative importance of each variable.

Interpreting Odds Ratios

The odds ratio (OR) is the primary effect size in logistic regression. It tells you how the odds of the outcome change with a one-unit increase in the predictor, holding all other predictors constant.

OR Greater Than 1: Increased Odds

An OR of 5.70 means that for every one-unit increase in the predictor, the odds of the outcome are 5.70 times greater. To express this as a percentage increase: (5.70 - 1) x 100 = 470% increase in odds.

For continuous predictors, the unit matters. If study hours are measured per week, OR = 1.17 means each additional hour per week increases the odds by 17%. If you rescaled to "per 10 hours," the OR would be 1.17^10 = 4.81, which may be more interpretable.

OR Less Than 1: Decreased Odds

An OR of 0.65 means that a one-unit increase in the predictor decreases the odds by 35%, calculated as (1 - 0.65) x 100. You can equivalently take the reciprocal: 1/0.65 = 1.54, meaning the absence of the factor (or a one-unit decrease) is associated with 1.54 times the odds.
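The arithmetic in these two subsections is easy to script. A short Python sketch, using the study-hours OR from the example and a hypothetical protective factor of 0.65:

```python
OR = 1.17                          # OR per one hour of weekly study (example)
pct_increase = (OR - 1) * 100      # percentage increase in odds per hour
OR_per_10 = OR ** 10               # rescaled: OR per 10 hours

OR_neg = 0.65                      # hypothetical protective factor
pct_decrease = (1 - OR_neg) * 100  # percentage decrease in odds
reciprocal = 1 / OR_neg            # the "flipped" framing

print(f"{pct_increase:.0f}% increase per hour; "
      f"per 10 hours OR = {OR_per_10:.2f}")
print(f"{pct_decrease:.0f}% decrease; reciprocal OR = {reciprocal:.2f}")
# 17% increase per hour; per 10 hours OR = 4.81
# 35% decrease; reciprocal OR = 1.54
```

Note that rescaling multiplies on the log-odds scale, so the per-10-unit OR is the per-unit OR raised to the tenth power, not multiplied by ten.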

OR Equal to 1: No Effect

An OR of exactly 1.00 means the predictor has no association with the outcome. In practice, an OR very close to 1.00 (e.g., 1.02 or 0.98) with a wide confidence interval typically indicates a trivial or non-significant effect.

Confidence Intervals Crossing 1

The 95% CI for the odds ratio is the key indicator of statistical significance at the .05 level. If the interval contains 1.00, the predictor is not statistically significant. An OR of 1.80 with a CI of [0.72, 4.50] means the true effect could range from a 28% decrease in odds to a 350% increase, providing no clear directional evidence.

Conversely, an OR of 1.80 with a CI of [1.15, 2.82] excludes 1.00 entirely, indicating a statistically significant positive association.

Important distinction: Odds are not probabilities. Saying "students with higher GPAs were 5.70 times more likely to graduate" is technically imprecise. The correct phrasing is "the odds of graduating were 5.70 times greater." When the outcome is rare (less than 10% prevalence), the odds ratio approximates the risk ratio. When the outcome is common, the two diverge substantially.
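The odds-versus-probability distinction becomes concrete once you convert back and forth. A small sketch showing why an OR of 5.70 is not a 5.70-fold probability increase when the outcome is common, but approximately is when the outcome is rare:

```python
def apply_odds_ratio(p_baseline, odds_ratio):
    """Multiply the baseline odds by an odds ratio and convert the
    result back to a probability: p = odds / (1 + odds)."""
    odds = p_baseline / (1 - p_baseline)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)

# Common outcome: the risk ratio is far smaller than the OR
p_common = apply_odds_ratio(0.50, 5.70)
rr_common = p_common / 0.50
print(f"0.50 -> {p_common:.2f}, risk ratio = {rr_common:.2f}")
# 0.50 -> 0.85, risk ratio = 1.70

# Rare outcome: the OR approximates the risk ratio
p_rare = apply_odds_ratio(0.02, 5.70)
rr_rare = p_rare / 0.02
print(f"0.02 -> {p_rare:.2f}, risk ratio = {rr_rare:.2f}")
# 0.02 -> 0.10, risk ratio = 5.21
```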

Model Fit Indices

-2 Log Likelihood

The -2 Log Likelihood (-2LL) is the baseline measure of model fit. It quantifies how well the model reproduces the observed data, with lower values indicating better fit. On its own, -2LL is not easily interpretable. Its primary use is in comparing nested models: the difference in -2LL between two models follows a chi-square distribution, which is the basis for the omnibus test.

The -2 log likelihood for the null model was 258.34 and decreased to 215.48 for the full model, a reduction of 42.86, chi-square(3) = 42.86, p < .001.
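The omnibus p value can be reproduced directly from the two -2LL values. A quick check in Python using the numbers above (3 degrees of freedom, one per predictor):

```python
from scipy.stats import chi2

neg2ll_null, neg2ll_full = 258.34, 215.48   # values from the example
lr = neg2ll_null - neg2ll_full              # likelihood ratio statistic
p = chi2.sf(lr, df=3)                       # upper-tail chi-square probability
print(f"chi-square(3) = {lr:.2f}, p = {p:.2e}")
```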

Cox & Snell R-Squared vs. Nagelkerke R-Squared

Neither measure is a true R-squared. Both are pseudo R-squared approximations designed to give a familiar metric for explained variance.

| Measure | Range | Key Property |
|---------|-------|--------------|
| Cox & Snell R² | 0 to < 1 | Cannot reach a maximum of 1.0, which makes its upper bound ambiguous |
| Nagelkerke R² | 0 to 1 | Rescales Cox & Snell so the maximum is 1.0; more interpretable |

Nagelkerke R² is the standard choice for APA reporting because it has a clear upper bound. Some authors report both values for transparency. Either way, always label which pseudo R-squared you are using. Writing "R² = .28" without specifying Nagelkerke is ambiguous and may mislead readers into thinking it is an OLS R-squared.
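Both pseudo R² values are simple functions of the null and full model -2 log likelihoods, so they can be verified by hand. A sketch using the example's -2LL values (computed values may differ slightly from rounded software output):

```python
import math

def pseudo_r2(neg2ll_null, neg2ll_full, n):
    """Cox & Snell and Nagelkerke pseudo R² from -2 log likelihoods.
    Nagelkerke divides Cox & Snell by its maximum attainable value,
    so it can reach 1.0 for a perfect model."""
    cox_snell = 1 - math.exp(-(neg2ll_null - neg2ll_full) / n)
    max_cox_snell = 1 - math.exp(-neg2ll_null / n)
    return cox_snell, cox_snell / max_cox_snell

cs, nk = pseudo_r2(neg2ll_null=258.34, neg2ll_full=215.48, n=200)
print(f"Cox & Snell R² = {cs:.3f}, Nagelkerke R² = {nk:.3f}")
```

Because the denominator is always below 1, Nagelkerke R² is always larger than Cox & Snell R² for the same model.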

Approximate benchmarks for Nagelkerke R²:

| Nagelkerke R² | Rough Interpretation |
|---------------|----------------------|
| .02 to .12 | Small effect |
| .13 to .25 | Medium effect |
| .26 and above | Large effect |

These benchmarks are domain-dependent. A Nagelkerke R² of .15 may be impressive when predicting rare diseases from demographic variables but modest when predicting diagnosis from clinical biomarkers.

Hosmer-Lemeshow Test

The Hosmer-Lemeshow goodness-of-fit test evaluates whether predicted probabilities match observed outcomes across decile subgroups. A non-significant result (p > .05) indicates adequate model fit. A significant result (p < .05) suggests the model does not calibrate well.

The Hosmer-Lemeshow test indicated adequate model fit, chi-square(8) = 5.73, p = .678.

Caveats: The test is sensitive to sample size. Large samples (N > 500) may flag trivial deviations as significant. Small samples lack power to detect genuine lack of fit. Some methodologists recommend supplementing with calibration plots, especially for prediction-focused studies.
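For intuition, the test itself is short to implement. A sketch of the standard decile-of-risk version (binning conventions vary across software, so exact statistics can differ slightly between packages):

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y, p_hat, groups=10):
    """Hosmer-Lemeshow chi-square: sort cases by predicted probability,
    split into `groups` bins, and compare observed vs. expected event
    counts in each bin. df = groups - 2."""
    order = np.argsort(p_hat)
    y = np.asarray(y, dtype=float)[order]
    p_hat = np.asarray(p_hat, dtype=float)[order]
    stat = 0.0
    for idx in np.array_split(np.arange(len(y)), groups):
        observed = y[idx].sum()
        expected = p_hat[idx].sum()
        mean_p = expected / len(idx)
        stat += (observed - expected) ** 2 / (len(idx) * mean_p * (1 - mean_p))
    return stat, chi2.sf(stat, df=groups - 2)

# Well-calibrated simulated probabilities for illustration
rng = np.random.default_rng(1)
p = rng.uniform(0.05, 0.95, 500)
y = rng.binomial(1, p)
stat, p_value = hosmer_lemeshow(y, p)
print(f"chi-square(8) = {stat:.2f}, p = {p_value:.3f}")
```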

Classification Accuracy Table

The classification table (also called the confusion matrix) shows how well the model assigns cases to the correct outcome group at a default probability cutoff of .50.

| Observed | Predicted: Did Not Graduate | Predicted: Graduated | Percentage Correct |
|----------|-----------------------------|----------------------|--------------------|
| Did Not Graduate | 51 | 21 | 70.8% |
| Graduated | 20 | 108 | 84.4% |
| Overall | | | 79.5% |

Always compare your classification accuracy to the base rate. In this example, 64% of students graduated. A model that predicts "graduated" for every student would achieve 64% accuracy with zero predictors. The model's 79.5% accuracy represents a 15.5 percentage-point improvement over that naive baseline.

Report sensitivity (correctly predicted positives) and specificity (correctly predicted negatives) alongside overall accuracy. A model with 90% overall accuracy but 20% sensitivity for the minority class is not practically useful.
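All four quantities follow directly from the classification table. A quick Python check using the counts above:

```python
# Confusion matrix counts from the classification table
tn, fp = 51, 21     # did not graduate: correctly / incorrectly classified
fn, tp = 20, 108    # graduated: incorrectly / correctly classified

n = tn + fp + fn + tp
accuracy = (tp + tn) / n
sensitivity = tp / (tp + fn)     # correctly identified graduates
specificity = tn / (tn + fp)     # correctly identified non-graduates
base_rate = (tp + fn) / n        # accuracy of always predicting "graduated"

print(f"accuracy = {accuracy:.1%}, sensitivity = {sensitivity:.1%}, "
      f"specificity = {specificity:.1%}, base rate = {base_rate:.1%}")
# accuracy = 79.5%, sensitivity = 84.4%, specificity = 70.8%, base rate = 64.0%
```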

AUC-ROC

The Area Under the Receiver Operating Characteristic Curve (AUC-ROC) provides a threshold-independent measure of discrimination. Unlike classification accuracy, which depends on a single cutoff, the AUC evaluates the model across all possible cutoffs.

| AUC Value | Discrimination Level |
|-----------|----------------------|
| .50 | No better than chance |
| .60 - .69 | Poor |
| .70 - .79 | Acceptable |
| .80 - .89 | Excellent |
| .90+ | Outstanding |

The area under the ROC curve was .83, 95% CI [.77, .89], indicating excellent discrimination between graduates and non-graduates.
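The AUC has a useful equivalent definition: the probability that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative case. A sketch of this Mann-Whitney formulation (fine for small data; real analyses would use a library routine):

```python
import numpy as np

def auc_roc(y, scores):
    """AUC as P(score of random positive > score of random negative),
    with ties counted as 0.5 (the Mann-Whitney U formulation)."""
    y, scores = np.asarray(y), np.asarray(scores)
    pos, neg = scores[y == 1], scores[y == 0]
    # Compare every positive score against every negative score
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy check: perfectly separated scores give AUC = 1.0
print(auc_roc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))   # 1.0
```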

Common Mistakes to Avoid

Reporting Unstandardized B Without the Odds Ratio

The B coefficient in logistic regression is a log-odds value. Unlike the B in linear regression, it has no direct intuitive interpretation. A reader cannot grasp what "B = 1.74" means without converting it to the odds ratio (OR = e^1.74 = 5.70). Always report both B and OR. The odds ratio is the effect size that reviewers and readers expect.

Incomplete: "GPA significantly predicted graduation, B = 1.74, p < .001."

Complete: "GPA significantly predicted graduation, B = 1.74, SE = 0.48, Wald chi-square(1) = 13.14, p < .001, OR = 5.70, 95% CI [2.23, 14.57]."

Not Including Confidence Intervals for Odds Ratios

An odds ratio without a confidence interval is a point estimate with no indication of precision. An OR of 2.50 with a 95% CI of [0.85, 7.35] crosses 1.00 and is not statistically significant. An OR of 2.50 with a CI of [1.40, 4.46] is significant and has a much narrower range of uncertainty. APA 7th edition explicitly requires confidence intervals for effect size measures, and in logistic regression, the 95% CI for each odds ratio fills this role.

Using R-Squared From Linear Regression

Logistic regression does not produce a traditional R-squared. Writing "R² = .28" without specifying that it is Nagelkerke R² (or Cox & Snell R²) misleads readers into thinking you are reporting explained variance in the ordinary least squares sense. Nagelkerke R² values tend to be lower than OLS R-squared values for comparable data, so the distinction matters for interpretation.

Always label the specific pseudo R-squared:

The model explained 28.4% of the variance in graduation status (Nagelkerke R² = .284).

Ignoring Multicollinearity

When two or more predictors are highly correlated, standard errors inflate, individual Wald tests lose power, and coefficient signs can flip. Run a linear regression with the same predictors and inspect the Variance Inflation Factor (VIF). VIF values below 5 are generally acceptable. Values above 10 indicate severe multicollinearity requiring remedial action: remove one of the correlated predictors, combine them into a composite, or use ridge regression.

Multicollinearity was assessed using variance inflation factors. All VIF values were below 2.5 (range: 1.08 to 2.34), indicating no problematic multicollinearity among the predictors.

Interpreting Odds Ratios as Probability Ratios

This is the single most common interpretive error in logistic regression reporting. Odds and probabilities are mathematically different quantities.

Incorrect: "Students with a one-point higher GPA were 5.70 times more likely to graduate."

Correct: "For each one-point increase in GPA, the odds of graduating were 5.70 times greater."

When the outcome is rare (prevalence below 10%), the odds ratio closely approximates the relative risk. When the outcome is common (as in the graduation example, where 64% graduated), the odds ratio substantially overestimates the relative risk.

Not Specifying the Reference Category

For categorical predictors, the odds ratio compares a coded group to its reference category. Failing to state the reference makes the results uninterpretable.

Compared to students with irregular attendance (reference), students with regular attendance had 2.35 times the odds of graduating, OR = 2.35, 95% CI [1.12, 4.93].

Logistic Regression APA Checklist

Before submitting your manuscript, verify that your logistic regression results include:

  • The type of logistic regression (binary, multinomial, ordinal)
  • Sample size and outcome group frequencies
  • Omnibus chi-square with degrees of freedom and p value
  • Nagelkerke R² clearly labeled as pseudo R-squared
  • Classification accuracy with sensitivity and specificity
  • Comparison to the base rate
  • A coefficients table with B, SE, Wald chi-square, p, OR, and 95% CI for OR
  • The intercept (constant) row in the table
  • Odds ratios interpreted as odds, not probabilities
  • Reference categories specified for categorical predictors
  • Confidence intervals for every odds ratio
  • All statistical symbols italicized (B, SE, p, chi-square, R²)
  • Assumption checks mentioned (linearity of logit, multicollinearity, sample size)

Frequently Asked Questions

What is the difference between binary, multinomial, and ordinal logistic regression?

Binary logistic regression predicts a dichotomous outcome (two categories). Multinomial logistic regression predicts an outcome with three or more unordered categories (e.g., preferred transportation: car, bus, bicycle). Ordinal logistic regression predicts an outcome with ordered categories (e.g., disease severity: mild, moderate, severe). Each type produces odds ratios, but the interpretation and model structure differ. Binary logistic regression is the most common and is the foundation for the other variants.

How do I interpret an odds ratio less than 1?

An OR less than 1.00 indicates decreased odds of the outcome. Calculate (1 - OR) x 100 for the percentage decrease. For example, OR = 0.65 means a 35% decrease in odds per one-unit increase in the predictor. Some researchers report the reciprocal (1/OR = 1.54) to frame the finding as increased odds in the opposite direction.

What is the minimum sample size for logistic regression?

The classic guideline is at least 10 events per predictor variable (EPV), where events are cases in the less frequent outcome category. For five predictors with a 30% event rate, you need at least 50 events, requiring a total sample of approximately 167. More recent simulation research recommends EPV of 20 or higher for stable coefficient estimates.
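The EPV arithmetic above is a one-liner to verify. A small helper, with the 10-EPV default from the classic guideline:

```python
import math

def min_sample_size(predictors, event_rate, epv=10):
    """Minimum sample size from the events-per-variable rule of thumb.
    Returns (events needed, total N), rounding N up to a whole case."""
    events_needed = epv * predictors
    total_n = math.ceil(events_needed / event_rate)
    return events_needed, total_n

events, n = min_sample_size(predictors=5, event_rate=0.30)
print(events, n)   # 50 167
```

Raising `epv` to 20, per the more recent simulation work, doubles both numbers for the same design.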

What is the Hosmer-Lemeshow test and when should I report it?

The Hosmer-Lemeshow test evaluates whether predicted probabilities match observed outcomes across subgroups. A non-significant result (p > .05) indicates adequate model fit. Report it to strengthen your write-up, but note it is sensitive to sample size: large samples may flag trivial deviations, while small samples may miss genuine lack of fit.

What is the difference between the Wald test and the likelihood ratio test?

The Wald test evaluates individual predictor coefficients using the coefficient-to-standard-error ratio. The likelihood ratio test compares nested models by examining the change in -2 log likelihood. The likelihood ratio test is generally more reliable, especially for large coefficients where the Wald test can be conservative due to the Hauck-Donner effect. For overall model evaluation, the omnibus likelihood ratio chi-square is preferred.

Calculation Accuracy

Assembling model fit statistics, odds ratios, confidence intervals, and classification tables from SPSS or R output and formatting them into APA 7th edition style is tedious and error-prone. StatMate's logistic regression calculator handles the entire process automatically.

Enter your binary outcome and predictor variables, and StatMate computes the omnibus model test, Nagelkerke R², classification table with sensitivity and specificity, and individual predictor statistics including B, SE, Wald chi-square, p, odds ratios, and 95% confidence intervals. The results are formatted in APA 7th edition style, ready to copy directly into your manuscript.

The calculator also generates an odds ratio chart that visually displays each predictor's OR with its confidence interval, making it easy to identify which predictors have effects that exclude 1.00 and which do not. Export the complete results to Word or PDF with one click.
