Frank Popham: Effect measures, having my cake and eating it?

Frank Popham

There is a never-ending debate in epidemiology about the appropriate effect measure to report for binary outcomes. Should it be the odds ratio, the relative risk or the absolute difference? This raises the question: why not report them all?

Before diving into some modelling, let’s define terms. I want an average effect. In my example data with a binary outcome (Y), a binary exposure (X) and a binary confounder (C), I want the effect of everyone being exposed versus everyone being unexposed in a population where the prevalence of C equals its mean in the data (39.6%). My effect could be a “marginal” average or a “conditional” average. For the marginal effect I average probabilities, while for the conditional effect I average (log) odds before the final transformation to the scale of my effect. Table 1 shows the probability and the odds of Y over X and C. I now average within each level of X, using the percentage in each level of C as weights (so C equals 0 gets the most weight). In a causal framework these are potential outcomes, as we are imagining the whole population being exposed versus being unexposed.

Table 1: Summary of the data
X	C	PrY¹	oddsY	%C
0	0	0.130	0.150	60.4
0	1	0.333	0.500	39.6
1	0	0.220	0.283	60.4
1	1	0.411	0.699	39.6
¹ Pr=Probability

Table 2 shows the odds and probability of Y for each value of X, weighted by C, along with the marginal (weighting stratum-specific probabilities) and conditional (weighting stratum-specific log odds) effects: the difference in probability, the relative “risk” and the odds ratio. To illustrate the non-equivalence of the marginal and conditional averages, the second column of each set converts probability to odds and odds to probability.

Table 2: Marginal and conditional point effects
X¹	Marginal		Conditional
X¹	PrY²	odds_from_PrY	oddsY	Pr_from_oddsY
0	0.211	0.267	0.241	0.194
1	0.296	0.420	0.405	0.288
PrD	0.085	0.085	0.094	0.094
RR	1.41	1.41	1.48	1.48
OR	1.58	1.58	1.68	1.68
¹ PrD=Probability Difference, RR = Relative risk, OR=Odds ratio
² Pr=Probability
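
As a rough check, the numbers in Table 2 can be rebuilt from Table 1 by hand. The sketch below is in Python rather than the R used for the post, and the function names are my own illustration, not the original code. It starts from the rounded probabilities in Table 1, so the results should only agree with Table 2 to about two decimal places.

```python
import math

# Table 1: (X, C) -> Pr(Y = 1), rounded as printed in the table.
pr_y = {(0, 0): 0.130, (0, 1): 0.333, (1, 0): 0.220, (1, 1): 0.411}
# Share of the data in each level of C, used as weights.
weight_c = {0: 0.604, 1: 0.396}


def odds(p):
    return p / (1 - p)


def prob(o):
    return o / (1 + o)


def marginal_pr(x):
    # Marginal: average the stratum-specific probabilities.
    return sum(weight_c[c] * pr_y[(x, c)] for c in weight_c)


def conditional_odds(x):
    # Conditional: average the stratum-specific log odds, then exponentiate.
    mean_log_odds = sum(
        weight_c[c] * math.log(odds(pr_y[(x, c)])) for c in weight_c
    )
    return math.exp(mean_log_odds)


def effects(p0, p1):
    # All three effect measures from the same pair of probabilities.
    return {"PrD": p1 - p0, "RR": p1 / p0, "OR": odds(p1) / odds(p0)}


marginal = effects(marginal_pr(0), marginal_pr(1))
conditional = effects(prob(conditional_odds(0)), prob(conditional_odds(1)))

for name, result in (("Marginal", marginal), ("Conditional", conditional)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

The only difference between the two sets is the scale on which the averaging over C happens, which is why the marginal and conditional effects diverge even though they start from the same four cells.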

Model time

Now let’s use a logistic regression with an interaction to model the effect of X on Y given C. To read the average effect directly from the model you can centre C, which gives a conditional odds ratio of 1.68 (95% CI 1.07 to 2.55), the same point estimate as in Table 2. One way to easily obtain all the effect measures with confidence intervals is to make predictions from this outcome model (on the probability scale for marginal effects and on the odds scale for conditional effects); centring is not needed for this. If you are a Stata user you can, I think, do this relatively easily using the margins command. In R there is a good range of equivalent commands. However, I use the excellent predictnl command with two copies of the data, one where everyone has X equal to 1 and the other where everyone has X equal to 0. The first two sections (labelled “Outcome Model”) of Table 3 contain the results.

An alternative method is to model the exposure as a function of the confounder and then derive inverse probability weights (IPW), which are used to weight a logistic model of the outcome on the exposure. I prefer this way, and I use the survey package’s contrast function in R to do the predictions. Results are shown in the last two sections of Table 3. While the CIs are similar across the IPW and outcome model approaches, they are not the same (both methods use the delta method, I think), so might there be an error in the code for someone to spot?
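
To make the two routes concrete, here is a minimal sketch for the marginal probabilities only. It is in Python rather than R, and the cell counts are hypothetical numbers I made up to roughly resemble Table 1; they are not the data behind this post. Because X and C are both binary and the model includes their interaction, the fitted probabilities equal the observed cell proportions, so no model fitting is needed here.

```python
# HYPOTHETICAL cell counts (not the post's data): (x, c) -> (n, events)
cells = {
    (0, 0): (400, 52),
    (0, 1): (240, 80),
    (1, 0): (100, 22),
    (1, 1): (90, 37),
}

total = sum(n for n, _ in cells.values())
# Number of people in each level of C, regardless of X.
n_c = {c: cells[(0, c)][0] + cells[(1, c)][0] for c in (0, 1)}


def standardised_pr(x):
    # Outcome-model route: predict Pr(Y) for everyone as if X = x
    # (the "two copies of the data" idea), then average over C.
    return sum(
        (n_c[c] / total) * (cells[(x, c)][1] / cells[(x, c)][0])
        for c in (0, 1)
    )


def ipw_pr(x):
    # IPW route: weight each person with X = x by 1 / Pr(X = x | C = c),
    # then take the weighted mean of Y in that exposure group.
    weighted_events = 0.0
    weighted_n = 0.0
    for c in (0, 1):
        n, events = cells[(x, c)]
        weight = n_c[c] / n  # inverse of Pr(X = x | C = c) = n / n_c[c]
        weighted_events += weight * events
        weighted_n += weight * n
    return weighted_events / weighted_n


for x in (0, 1):
    print(x, round(standardised_pr(x), 4), round(ipw_pr(x), 4))
```

In this saturated setting the two routes give identical point estimates, which matches the pattern in the point estimates reported in Table 3. The sketch says nothing about the confidence intervals.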

Another potential route (at least for marginal effects) would be to use teffect in R or Stata.

Table 3: Effects from outcome model and IPW with CIs
X	Effect	Estimate	95% CI - Low	95% CI - High
Outcome Model - Conditional
0	Pr	0.194	0.165	0.224
1	Pr	0.288	0.208	0.368
1 v 0	PrD	0.094	0.008	0.179
1 v 0	RR	1.48	1.08	2.03
1 v 0	OR	1.68	1.09	2.58
Outcome Model - Marginal
0	Pr	0.211	0.180	0.241
1	Pr	0.296	0.226	0.366
1 v 0	PrD	0.085	0.009	0.161
1 v 0	RR	1.41	1.07	1.85
1 v 0	OR	1.58	1.08	2.31
IPW - Conditional
0	Pr	0.194	0.165	0.224
1	Pr	0.288	0.208	0.368
1 v 0	PrD	0.094	0.008	0.179
1 v 0	RR	1.48	1.08	2.03
1 v 0	OR	1.68	1.09	2.58
IPW - Marginal
0	Pr	0.211	0.179	0.242
1	Pr	0.296	0.225	0.367
1 v 0	PrD	0.085	0.008	0.163
1 v 0	RR	1.41	1.06	1.86
1 v 0	OR	1.58	1.07	2.33
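
As a rough illustration of how delta-method intervals for the three contrasts can be built, the sketch below works only from the printed “Outcome Model - Marginal” rows of Table 3. It is in Python, not the R code behind the table, and it rests on assumptions of my own: the standard errors are backed out from the printed 95% CIs, the two predicted probabilities are treated as independent (any covariance is ignored), and a normal approximation is used throughout.

```python
import math
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # about 1.96


def se_from_ci(low, high):
    # Assumes a symmetric Wald interval on the probability scale.
    return (high - low) / (2 * Z)


def ci(estimate, se, transform=lambda v: v):
    # Wald interval, optionally back-transformed (e.g. with exp).
    return transform(estimate - Z * se), transform(estimate + Z * se)


def logit(p):
    return math.log(p / (1 - p))


# Printed estimates and CIs for X = 0 and X = 1 (Outcome Model - Marginal).
p0, se0 = 0.211, se_from_ci(0.180, 0.241)
p1, se1 = 0.296, se_from_ci(0.226, 0.366)

# Probability difference: the variances add (independence is ASSUMED).
prd_ci = ci(p1 - p0, math.hypot(se0, se1))

# Relative risk: delta method on the log scale, se(log p) is se / p.
rr_ci = ci(math.log(p1 / p0), math.hypot(se0 / p0, se1 / p1), math.exp)

# Odds ratio: delta method on the logit scale, se(logit p) is se / (p(1 - p)).
or_ci = ci(
    logit(p1) - logit(p0),
    math.hypot(se0 / (p0 * (1 - p0)), se1 / (p1 * (1 - p1))),
    math.exp,
)

for name, interval in (("PrD", prd_ci), ("RR", rr_ci), ("OR", or_ci)):
    print(name, tuple(round(v, 3) for v in interval))
```

Under these assumptions the intervals come out close to the corresponding rows of Table 3, which is at least consistent with the delta method being what produces them.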

It would be relatively simple to extend this to more complex models and make it a proper function (see code below).

Click the link for the data and code in R. As usual, many thanks to R, RStudio and distill for making blogging easy. Also thanks to the knitr, tidyverse, broom, rstpm2, survey, simstudy and gt package authors.
