---
title: "Average Marginal Effects in a 2-Arm Parallel Randomized Controlled Trial with Heterogeneity of Effects by Strata"
author: Wade K. Copeland
date: "`r format(Sys.time(), '%d %B, %Y')`"
header-includes:
- \usepackage{graphicx}
- \usepackage{booktabs}
- \usepackage{mathrsfs}
- \usepackage{pdflscape}
- \newcommand{\blandscape}{\begin{landscape}}
- \newcommand{\elandscape}{\end{landscape}}
- \usepackage{caption}
- \captionsetup{width=7in}
- \usepackage{color, soul, amsmath}
- \usepackage{pdfpages}
- \usepackage{xcolor}
- \usepackage{longtable}
- \usepackage{tikz}
- \usetikzlibrary{calc, shapes, positioning}
- \tikzset{mynode/.style={draw,text width=1in,align=center}}
- \usepackage{amsthm}
- \usepackage[utf8]{inputenc}
- \usepackage[english]{babel}
output:
beamer_presentation:
theme: "Madrid"
slide_level: 2
toc: TRUE
---
# Introduction
## Motivation
The general(ized) linear model is very flexible. However, its reliance on effect estimates that require fixed covariate values can, in some cases, be a severe limitation. This has led statisticians to develop the method of Average Marginal Effects, so named because it averages over the marginal effects in a linear model to get a population-averaged marginal effect, which we will call the AME, that no longer depends on fixed covariate values \cite{wooldridge2000econometric, greene2003econometric}.
## Motivation
The theory behind estimating the AME and calculating the standard error of the estimate is complex. This leads to two general problems in the literature. The first abstracts the issue to the point that the average user cannot decipher what to do in a real application \cite{stata2019manual}. The second avoids the theory entirely, speaking only in general terms before quickly switching to black-box solutions, such as the \textit{margins} command in Stata \cite{williams2012using}.
## Motivation
The utility of the method of Average Marginal Effects in many contexts of statistical modeling makes the lack of accessible resources in the literature surrounding them a tragedy for both statisticians and those who consume statistics.
In this presentation, I attempt to solve this problem by deriving the estimate of the AME and its standard error in the context of a common experimental design: a multisite 2-arm parallel randomized controlled trial (RCT). I follow each section with straightforward programming techniques that apply the method to real data.
# Study Design
## 2-Arm Parallel Randomized Controlled Trial
A 2-arm parallel randomized controlled trial (RCT) is a study design where subjects are assigned randomly to either treatment or control groups (collectively called \textit{condition}). An intervention is applied to the treatment group, and a placebo or standard of care is applied to the control group. At the follow-up, the results between the two groups are compared \cite{parab2010study}.
## 2-Arm Parallel Randomized Controlled Trial
\begin{figure}[h!]
\begin{center}
\begin{tikzpicture}
\node[mynode, circle] (p){Population};
\node[mynode, rectangle, above right=of p](control) {Control};
\node[mynode, rectangle, below right=of p](treatment) {Treatment};
\node[mynode, rectangle, right=of control](followup1) {Follow-up};
\node[mynode, rectangle, right=of treatment](followup2) {Follow-up};
\node[mynode, rectangle] (compare) at ($(followup1)!0.5!(followup2)$) {Compare Results};
\draw[->] (p) -- node[auto,font=\footnotesize] {Random Assignment} (control);
\draw[->] (p) -- node[auto,font=\footnotesize] {Random Assignment} (treatment);
\draw[->] (control) -- node[auto,font=\footnotesize] {} (followup1);
\draw[->] (treatment) -- node[auto,font=\footnotesize] {} (followup2);
\draw[->] (followup1) -- node[auto,font=\footnotesize] {} (compare);
\draw[->] (followup2) -- node[auto,font=\footnotesize] {} (compare);
\end{tikzpicture}
\end{center}
\caption{2-Arm Parallel Randomized Controlled Trial Study Design}
\label{fig:rct}
\end{figure}
## Randomization and Causal Effect Estimation
Much of the virtue attributed to the RCT class of experimental designs comes from the process of randomizing participants into either treatment or control groups.
Under ideal circumstances, randomization allows us to estimate causal effects because unmeasured confounders are balanced across the conditions. This is equivalent to saying that we could have switched subjects from either condition at the start of the study and still ended up with the same results (a property called \textit{exchangeability}\textsuperscript{1}) \cite{hernan2006estimating}.
\footnotetext[1]{Formally, exchangeability says that the counterfactual outcomes, one outcome observed (the factual outcome) and one that is unobserved (the outcome that would have been observed in a situation that didn't actually happen), are independent of condition.}
## Multisite RCTs
A multisite RCT is similar, except that the same experiment is run at multiple sites, with each observation randomized within site (as opposed to randomizing whole sites, as in a group randomized trial). There are many reasons to do this. Sometimes a single-site RCT becomes a multisite RCT because of problems at one location, such as low recruitment. A multisite RCT also helps move a trial from assessing only efficacy to assessing effectiveness by sampling from different populations with different baseline risk factors \cite{kraemer2000pitfalls}.
## Marginal Treatment Effect
Statistically, we can account for multiple sites by adjusting for the condition by site effect. The most straightforward way to model the expectation at the follow-up is as a linear function of condition, site, and condition by site.
## Marginal Treatment Effect
Consider the following data generating process. Let $y_i = \beta_{0} + \beta_{1}c_i + \beta_{2}s_i + \beta_{3}c_is_i + \epsilon_i$ for $i \in \{1, \ldots, n\}$ such that $y_i \sim N(\beta_{0} + \beta_{1}c_i + \beta_{2}s_i + \beta_{3}c_is_i, \sigma^2_{\epsilon})$\textsuperscript{1}.
Let $c_{i} = 0$ for an observation in the control condition and $c_{i} = 1$ for an observation in the treatment condition. Let $s_{i} = 0$ for an observation in the first site and $s_{i} = 1$ for an observation in the second site.
Under these assumptions, the conditional expectation of $y_i$ for a fixed value of $C$ and $S$ is as follows:
$$
E[y_i] = E[y|C = c_i, S = s_i] = \beta_{0} + \beta_{1}c_i + \beta_{2}s_i + \beta_{3}c_i s_i
$$
\footnotetext[1]{To keep things simple, I assume the conditional distribution of the response is normal. In general this isn't necessary.}
## Marginal Treatment Effect
To continue we need to calculate the marginal treatment effect for the $i^{th}$ observation.
The expected outcome under the treatment condition is $E[y|C = 1, S = s_i] = \beta_{0} + \beta_{1}(1) + \beta_{2}s_i + \beta_{3}(1)s_i = \beta_{0} + \beta_1 + \beta_{2}s_i + \beta_{3}s_i$
The expected outcome under the control condition is $E[y|C = 0, S = s_i] = \beta_{0} + \beta_{1}(0) + \beta_{2}s_i + \beta_{3}(0)s_i = \beta_{0} + \beta_{2}s_i$
Therefore the marginal treatment effect\textsuperscript{2} is $E[y|C = 1, S = s_i] - E[y|C = 0, S = s_i] = \beta_{0} + \beta_1 + \beta_{2}s_i + \beta_{3}s_i - (\beta_{0} + \beta_{2}s_i) = \beta_1 + \beta_{3}s_i$
\footnotetext[2]{An important subtlety here is that the marginal treatment effect is calculated for each observation. Let $E[Y_{c = 1}|S = s_{i}]$ be the counterfactual outcome for each observation under the treatment condition and let $E[Y_{c = 0}|S = s_{i}]$ be the counterfactual outcome for each observation under the control condition. Under exchangeability, we can show that $E[Y_{C = c_i}|S = s_{i}] = E[Y|C = c_i, S = s_{i}]$ for all $C \in \{0, 1\}$\cite{hernan2004definition}.}
# Average Marginal Treatment Effect Estimation
## Average Marginal Effect Estimation -- The Problem
The marginal treatment effect, call it $y_{d_i}$, still depends on $S$: $E[y_{d}|S = s_i]$. While the treatment effect by site may be of interest, generally we are interested in an effect averaged over site so that we can make a claim about the population. The basic mechanics of the general(ized) linear model do not allow for this, since we can only calculate the marginal expectation for fixed values of site.
## Average Marginal Effect Estimation -- The Problem
A naive approach to the problem would be to test if the interaction of site by condition is significant and then remove it if it isn't. However, this approach falls into the classic trap of saying that failure to reject the null hypothesis is the same as saying there is no effect.
Another potential solution would be to use generalized estimating equations and cluster by site to get the population-averaged treatment effect. However, those hoping to extend these methods to additional applications\textsuperscript{3} will find that this solution does not provide the flexibility in estimation they will want.
\footnotetext[3]{An example is using the AME to estimate the risk ratio or risk difference for the treatment effect in a logistic regression model. Generalized estimating equations on their own can only estimate the odds ratio for the population-averaged effect. Another example is if we were to use longitudinal models. In this case, we would want to average over the subject-specific effect to account for the correlation and keep site fixed in the expectation, which we can later average over by estimating the AME.}
## Average Marginal Effect Estimation
We can solve this problem in a principled way that will provide for the flexibility we will want later. The first step is to apply the Law of Total Expectation to the marginal effect\textsuperscript{4} \cite{bain2000introduction}.
$$
E_S[E[y_{d}|S]] = E[y_{d}]
$$
\footnotetext[4]{Note that we have made the not-so-subtle transition from treating $S = s_i$ as fixed to $S$ as a random variable. Above, when we calculated the expectation, we could have also treated S as a random variable. The math works out the same regardless.}
## Average Marginal Effect Estimation
Since the values of $S$ partition the sample space (i.e., $Pr(S=0) + Pr(S=1) = 1$), we can simplify further.
\begin{equation}
\begin{split} \nonumber
E[Y_d] &= \sum_{j=1}^{2}E[Y_d|S_j]Pr(S_j) \\
&= E[Y_d|S = 0]Pr(S = 0) + E[Y_d|S = 1]Pr(S = 1) \\
&= \beta_{1}Pr(S = 0) + (\beta_1 + \beta_3)Pr(S=1) \\
&= \beta_1(Pr(S = 0) + Pr(S=1)) + \beta_3Pr(S=1) \\
&= \beta_{1} + \beta_3Pr(S = 1)
\end{split}
\end{equation}
## Average Marginal Effect Estimation
We can see that calculating a marginal treatment effect that is unconditional on site is little more than averaging over the marginal treatment effect for each observation since,
\begin{equation}
\begin{split} \nonumber
\frac{1}{n}\sum_{i=1}^{n}y_{d_{i}} &= \frac{1}{n}\sum_{i=1}^{n} (\beta_1 + \beta_{3}s_i) \\
&= \frac{1}{n}\left(n\beta_1 + \beta_3\sum_{i=1}^n s_i\right) \\
&= \beta_1 + \beta_3 \frac{\sum_{i=1}^n s_i}{n} \\
&= \beta_1 + \beta_{3}Pr(S = 1)
\end{split}
\end{equation}
## Average Marginal Effect Estimation
This suggests that we can estimate the AME as follows:
$$
\hat{AME} = \frac{1}{n}\sum_{i=1}^{n}\hat{y}_{d_{i}}
$$
## Average Marginal Effect Estimation Example
As an example, consider the data generating process with the following values: Let $\beta_{0} = 5$, $\beta_1 = 5$, $\beta_2 = 2$, $\beta_3 = 4$ and $\sigma_{\epsilon} = 1$. Let the probability of being in the treatment condition be $1/2$, and the probability of being in site 1 be $2/3$.
With these values, the population marginal treatment effect in site 1 is $\beta_1 + \beta_3(0) = \beta_1 = 5$.
The population marginal treatment effect in site 2 is $\beta_1 + \beta_3 = 5 + 4 = 9$.
The population AME is $\beta_1 + \beta_3Pr(S = 1) = 5 + 4(1-2/3) = `r round(5 + 4*(1/3), 2)`$.
## Average Marginal Effect Estimation Example
To see how close $\hat{AME}$ is to the $AME$ we can simulate some data.
```{r normalSimulation, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
set.seed(123)
alpha = 5; beta1 = 5; beta2 = 2; beta3 = 4
C = rbinom(500, 1, prob = 1/2)
S = rbinom(500, 1, prob = 1/3)
mu = alpha + beta1*C + beta2*S + beta3*C*S
Y = rnorm(n = 500, mean = mu, sd = 1)
d <- as.data.frame(cbind(Y, C, S))
d[1:3, ]
```
## Average Marginal Effect Estimation Example
The coefficients show that the estimation is reasonably close to the truth.
```{r normalEstimates, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
fit <- lm(Y ~ C + S + C*S, data = d)
coef(fit)
```
The estimated marginal treatment effect at site 1 is $\hat{\beta}_1 + \hat{\beta}_3(0) = `r round(coef(fit)[2], 2)`$. The estimated marginal treatment effect at site 2 is $\hat{\beta}_1 + \hat{\beta}_3 = `r round(coef(fit)[2], 2)` + `r round(coef(fit)[4], 2)` = `r round(coef(fit)[2] + coef(fit)[4], 2)`$. Finally, $\hat{AME} = \hat{\beta}_1 + \hat{\beta}_3\bar{S} = `r round(coef(fit)[2], 2)` + `r round(coef(fit)[4], 2)`(`r round(mean(S), 2)`) = `r round(coef(fit)[2] + coef(fit)[4]*mean(S), 2)`$.
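## Average Marginal Effect Estimation Example
An equivalent way to compute $\hat{AME}$ is to average the difference between each observation's predicted outcome under treatment and under control. The sketch below (re-simulating the same data so the chunk is self-contained) shows this prediction-based average; in a linear model it matches the coefficient-based formula exactly.
```{r normalPredictCheck, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
set.seed(123)
C <- rbinom(500, 1, prob = 1/2)
S <- rbinom(500, 1, prob = 1/3)
Y <- rnorm(500, mean = 5 + 5*C + 2*S + 4*C*S, sd = 1)
d <- data.frame(Y, C, S)
fit <- lm(Y ~ C + S + C*S, data = d)
# Predicted outcome for every subject under each counterfactual condition
yhat_treat <- predict(fit, transform(d, C = 1))
yhat_ctrl <- predict(fit, transform(d, C = 0))
ame_hat <- mean(yhat_treat - yhat_ctrl)
ame_hat
```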
# Standard Error of the Average Marginal Effect
## Standard Error of the Average Marginal Effect
Estimation is fun and all, but without a method to calculate uncertainty about our estimate of the AME, we are up a creek. Fortunately, we are not without many paddles to choose from \cite{dowd2014computation}.
We could try to derive variance for the AME directly, but that might be painful in some cases\textsuperscript{5}.
A computational approach might be to use bootstrap resampling \cite{efron1979bootstrap}, but depending on the end user's computer hardware, this can be slow.
A compromise between accuracy and computation is the Delta Method \cite{doob1935deltamethod}, which uses the first two terms of a Taylor series expansion of the transformation, evaluated at the estimated coefficients, to approximate the variance of the AME.
\footnotetext[5]{The case in mind is if the expectation is non-linear in the predictors.}
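## Standard Error of the Average Marginal Effect
Before turning to the Delta Method, the bootstrap alternative mentioned above can be sketched in a few lines (re-simulating the data so the chunk stands alone). Resampling rows, refitting, and recomputing the AME each time gives a reference standard error; the 1000 replicates here are an arbitrary choice.
```{r normalBootstrap, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
set.seed(123)
C <- rbinom(500, 1, prob = 1/2)
S <- rbinom(500, 1, prob = 1/3)
Y <- rnorm(500, mean = 5 + 5*C + 2*S + 4*C*S, sd = 1)
d <- data.frame(Y, C, S)
# Resample rows with replacement, refit, and recompute the AME each time
boot_ame <- replicate(1000, {
  idx <- sample(nrow(d), replace = TRUE)
  b <- coef(lm(Y ~ C + S + C*S, data = d[idx, ]))
  unname(b["C"] + b["C:S"]*mean(d$S[idx]))
})
sd(boot_ame)  # bootstrap standard error of the AME estimate
```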
## Standard Error of the Average Marginal Effect
Recall that the conditional expectation of $y_i$ is $E[y_i] = E[y|C = c_i, S = s_i] = \beta_{0} + \beta_{1}c_i + \beta_{2}s_i + \beta_{3}c_i s_i$. This can be rewritten in matrix form as:
$$
E[\boldsymbol{Y}_i] =
\begin{bmatrix}
1 & c_1 & s_1 & c_1s_1 \\
1 & c_2 & s_2 & c_2s_2 \\
\vdots & \vdots & \vdots & \vdots \\
1 & c_n & s_n & c_ns_n \\
\end{bmatrix}
\begin{bmatrix}
\beta_0 \\
\beta_1 \\
\beta_2 \\
\beta_3
\end{bmatrix}
= \boldsymbol{X}_i\boldsymbol{\beta}
$$
We calculated the marginal effects for fixed values of $S$ as $E[y_{d_i}] = E[y_{d}|S = s_i] = \beta_1 + \beta_{3}s_i$. This can also be rewritten in matrix form.
$$
E[\boldsymbol{Y}_{d_i}] =
\begin{bmatrix}
1 & s_1 \\
1 & s_2 \\
\vdots & \vdots \\
1 & s_n
\end{bmatrix}
\begin{bmatrix}
\beta_1 \\
\beta_3
\end{bmatrix}
= \boldsymbol{X}_{d_i}\boldsymbol{\beta}_d
$$
## Standard Error of the Average Marginal Effect
In the red corner, we have the predicted values of $y_i$, $E[\boldsymbol{Y}_i] = \boldsymbol{X}_i\boldsymbol{\beta}$.
In the blue corner, we have the marginal treatment effects, $E[\boldsymbol{Y}_{d_i}] = \boldsymbol{X}_{d_i}\boldsymbol{\beta}_{d}$.
However, the fighters in our red and blue corners are more similar than they might want to admit. They meet in the middle by recognizing that the marginal treatment effects are a transformation of $E[\boldsymbol{Y}_i]$.
$$
E[\boldsymbol{Y}_{d_i}] = g(E[\boldsymbol{Y}_i])
$$
## Standard Error of the Average Marginal Effect
The Delta Method states that under these circumstances, we can calculate the approximate variance of the marginal treatment effects as follows:
$$
Var(ME) \approx \boldsymbol{J}Var(\boldsymbol{\beta})\boldsymbol{J}^T
$$
Where $Var(\boldsymbol{\beta})$ is the variance-covariance matrix of $\boldsymbol{\beta}$, and $\boldsymbol{J}$ is the so-called Jacobian matrix, a matrix of partial derivatives of the marginal effects with respect to the regression coefficients. Each row (one for each of the $n$ observations) is interpreted as the expected change in the marginal effect for an instantaneous change in $\boldsymbol{\beta}$.
$$
J =
\begin{bmatrix}
\frac{\partial g(E[\boldsymbol{Y}_i])}{\partial \beta_{0}} & \frac{\partial g(E[\boldsymbol{Y}_i])}{\partial \beta_1} & \frac{\partial g(E[\boldsymbol{Y}_i])}{\partial \beta_2} & \frac{\partial g(E[\boldsymbol{Y}_i])}{\partial \beta_3}
\end{bmatrix} =
\begin{bmatrix}
0 & 1 & 0 & s_1 \\
0 & 1 & 0 & s_2 \\
\vdots & \vdots & \vdots & \vdots \\
0 & 1 & 0 & s_n
\end{bmatrix}
$$
## Standard Error of the Average Marginal Effect
This still does not quite get us all of the way. $\boldsymbol{J}$ is an $n \times 4$ matrix, $Var(\boldsymbol{\beta})$ is a $4 \times 4$ matrix, and $\boldsymbol{J}^T$ is a $4 \times n$ matrix. Together they yield an $n \times n$ matrix whose entries are the approximate variances and covariances of the marginal effects. However, what we want is the approximate variance of the AME. In other words, we need something we might call an ``average Jacobian''.
## Standard Error of the Average Marginal Effect
We can justify the replacement of $g(E[\boldsymbol{Y}_i])$ with the $AME = \frac{1}{n} \sum_{i=1}^{n} g(E[\boldsymbol{Y}_i])$ by noting that both $1/n$ and the summation can be pulled out of the partial derivative with respect to $\boldsymbol{\beta}$, so that our new and improved Jacobian is as follows \cite{dowd2014computation}:
\begin{equation}
\begin{split} \nonumber
J_{AME} &= \frac{1}{n} \sum_{i=1}^{n} \begin{bmatrix}
\frac{\partial g(E[\boldsymbol{Y}_i])}{\partial \beta_0} & \frac{\partial g(E[\boldsymbol{Y}_i])}{\partial \beta_1} & \frac{\partial g(E[\boldsymbol{Y}_i])}{\partial \beta_2} & \frac{\partial g(E[\boldsymbol{Y}_i])}{\partial \beta_3}
\end{bmatrix} \\
&= \begin{bmatrix}
0 & 1 & 0 & \bar{S}
\end{bmatrix}
\end{split}
\end{equation}
## Standard Error of the Average Marginal Effect
Putting it all together, we have the Jacobian of the AME, which is a $1 \times 4$ row vector, $Var(\boldsymbol{\beta})$, which is a $4 \times 4$ matrix, and the transpose of the Jacobian, which is a $4 \times 1$ column vector. This leaves us with a tidy scalar value for the approximate variance of the AME.
$$
Var(AME) \approx \boldsymbol{J}_{AME}Var(\boldsymbol{\beta})\boldsymbol{J}_{AME}^T
$$
Therefore, the standard error of the population AME is $SE(AME) = \sqrt{Var(AME)}$. Replacing the AME with its estimate gives the standard error of $\hat{AME}$:
$$
SE(\hat{AME}) = \sqrt{Var(\hat{AME})}
$$
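## Standard Error of the Average Marginal Effect
Because $\boldsymbol{J}_{AME} = [0, 1, 0, \bar{S}]$, the matrix product above collapses to the closed form $Var(\hat{\beta}_1) + 2\bar{S}Cov(\hat{\beta}_1, \hat{\beta}_3) + \bar{S}^2Var(\hat{\beta}_3)$. The sketch below (re-simulating the data so it stands alone) verifies that the two computations agree.
```{r normalClosedForm, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
set.seed(123)
C <- rbinom(500, 1, prob = 1/2)
S <- rbinom(500, 1, prob = 1/3)
Y <- rnorm(500, mean = 5 + 5*C + 2*S + 4*C*S, sd = 1)
fit <- lm(Y ~ C + S + C*S)
V <- vcov(fit)  # rows/columns ordered (Intercept), C, S, C:S
Sbar <- mean(S)
J_ame <- matrix(c(0, 1, 0, Sbar), nrow = 1)
v_matrix <- drop(J_ame %*% V %*% t(J_ame))
v_closed <- V["C", "C"] + 2*Sbar*V["C", "C:S"] + Sbar^2*V["C:S", "C:S"]
c(matrix = v_matrix, closed = v_closed)
```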
## Standard Error of the Average Marginal Effect -- Example
To show how this works in application, we continue where we left off previously. Getting the variance-covariance matrix for $\boldsymbol{\hat{\beta}}$ is straightforward.
```{r normalVcov, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
round(vcov(fit), 4)
```
## Standard Error of the Average Marginal Effect -- Example
Creating the Jacobian matrix is a little more involved. Our expression for the marginal effect, $\hat{\beta}_1 + \hat{\beta}_3S$, can be written in R as follows:
```{r normalExpr, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
meExpr <- parse(text = "beta_hat_1 + beta_hat_3*S")
meExpr
```
## Standard Error of the Average Marginal Effect -- Example
R can calculate the symbolic derivatives for our expression as follows:
```{r normalGradient, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
meGrad <- deriv(meExpr, c("beta_hat_0", "beta_hat_1",
"beta_hat_2", "beta_hat_3"))
meGrad
```
## Standard Error of the Average Marginal Effect -- Example
We can set the values at which to evaluate the expression to calculate the Jacobian for the $i^{th}$ marginal effect.
```{r normalMEJ, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
beta_hat_1 = coef(fit)["C"]
beta_hat_3 = coef(fit)["C:S"]
S = model.matrix(fit)[, "S"]
J_me <- attr(eval(meGrad), "gradient")
J_me[1:3, ]
```
## Standard Error of the Average Marginal Effect -- Example
Finally, we can take the column-wise mean to get the Jacobian for $\hat{AME}$.
```{r normalAMEJ, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
J_ame <- t(matrix(apply(J_me, 2, mean)))
J_ame
```
## Standard Error of the Average Marginal Effect -- Example
Putting it all together the standard error of $\hat{AME}$ is as follows:
```{r normalAMESE, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
ame_se <- sqrt(J_ame %*% vcov(fit) %*% t(J_ame))
ame_se
```
# Statistical Inference
## Statistical Inference
The final piece of the puzzle is to apply inferential statistics to test the null hypothesis of no average marginal treatment effect and calculate confidence intervals.
We can test the null hypothesis, $H_0: AME = 0$, using a simple Z-test.
$$
Z_{obs} = \frac{\hat{AME} - \hat{AME}_{H_0}}{SE(\hat{AME})} = \frac{\hat{AME}}{SE(\hat{AME})}
$$
Under a true null, $Z \sim N(0, 1)$. Therefore, under the assumption that the null hypothesis of no average marginal treatment effect is true, the P-value is two times the upper-tail probability of observing $|Z_{obs}|$ or greater. The $95\%$ confidence interval for $\hat{AME}$ is then:
$$
\hat{AME} \pm Z_{.975}SE(\hat{AME}) = \hat{AME} \pm 1.96SE(\hat{AME})
$$
## Statistical Inference -- Example
To show statistical inference in application we continue where we left off.
The p-value is calculated below and shows that we can reject the null hypothesis of no average marginal treatment effect with gusto!
```{r normalAMEpval, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
z <- (coef(fit)[2] + coef(fit)[4]*mean(d$S))/ame_se
p <- 2*pnorm(abs(z), lower.tail = FALSE)
p
```
## Statistical Inference -- Example
The lower and upper 95% confidence limits are calculated below. The results show that there is good evidence that the population AME lies between the bounds `r round((coef(fit)[2] + coef(fit)[4]*mean(d$S)) - 1.96*ame_se, 2)` and `r round((coef(fit)[2] + coef(fit)[4]*mean(d$S)) + 1.96*ame_se, 2)`\textsuperscript{6}.
```{r normalAMEci, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
ci_lb <- (coef(fit)[2] + coef(fit)[4]*mean(d$S)) -
1.96*ame_se
ci_ub <- (coef(fit)[2] + coef(fit)[4]*mean(d$S)) +
1.96*ame_se
paste("[", round(ci_lb, 2), ", ", round(ci_ub, 2), "]",
sep = "")
```
\footnotetext[6]{Specifically, if we repeat the experiment many times, calculating the confidence interval each time, we would expect them to contain the population AME $95\%$ of the time.}
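## Statistical Inference -- Example
The whole pipeline can be collected into a single helper. The function below is a sketch of my own (the name \texttt{ame\_inference} is not from any package) and assumes a model with terms named \texttt{C} and \texttt{C:S}, fit in the coefficient order (Intercept), C, S, C:S.
```{r normalAMEWrapper, eval = TRUE, echo = TRUE, results = TRUE, warning = FALSE, message = FALSE}
# Point estimate, delta-method SE, z-test, and 95% CI for the AME of C
ame_inference <- function(fit, data) {
  Sbar <- mean(data$S)
  b <- coef(fit)
  ame <- unname(b["C"] + b["C:S"]*Sbar)
  J <- matrix(c(0, 1, 0, Sbar), nrow = 1)  # (b0, b1, b2, b3) order
  se <- sqrt(drop(J %*% vcov(fit) %*% t(J)))
  z <- ame/se
  list(ame = ame, se = se,
       p = 2*pnorm(abs(z), lower.tail = FALSE),
       ci = c(ame - 1.96*se, ame + 1.96*se))
}
set.seed(123)
C <- rbinom(500, 1, prob = 1/2)
S <- rbinom(500, 1, prob = 1/3)
Y <- rnorm(500, mean = 5 + 5*C + 2*S + 4*C*S, sd = 1)
d <- data.frame(Y, C, S)
out <- ame_inference(lm(Y ~ C + S + C*S, data = d), d)
unlist(out)
```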
# Questions
## References
\footnotesize{
\begin{thebibliography}{99}
\bibitem[Bain \& Engelhardt, 2000]{bain2000introduction} Bain, Lee J and Engelhardt, Max, 2000
\newblock Introduction to probability and mathematical statistics
\newblock \emph{Brooks/Cole}.
\bibitem[Doob, 1935]{doob1935deltamethod} Doob, J. L., 1935
\newblock The Limiting Distributions of Certain Statistics
\newblock \emph{Annals of Mathematical Statistics}, 6, 160--169.
\bibitem[Dowd, Greene, \& Norton, 2014]{dowd2014computation} Dowd, Bryan E and Greene, William H and Norton, Edward C, 2014
\newblock Computation of standard errors
\newblock \emph{Health services research}, 49(2), 731-750.
\bibitem[Efron, 1979]{efron1979bootstrap} Efron, Bradley, 1979
\newblock Bootstrap methods: another look at the jackknife
\newblock \emph{Annals of Statistics}, 7, 1--26.
\end{thebibliography}
}
## References
\footnotesize{
\begin{thebibliography}{99}
\bibitem[Greene, 2003]{greene2003econometric} Greene, William H, 2003
\newblock Econometric analysis
\newblock \emph{Pearson Education India}.
\bibitem[Hern{\'a}n, 2004]{hernan2004definition} Hern{\'a}n, Miguel Angel, 2004
\newblock A definition of causal effect for epidemiological research
\newblock \emph{Journal of Epidemiology \& Community Health} 58(4), 265-271.
\bibitem[Hern{\'a}n \& Robins, 2006]{hernan2006estimating} Hern{\'a}n, Miguel A and Robins, James M, 2006
\newblock Estimating causal effects from epidemiological data
\newblock \emph{Journal of Epidemiology \& Community Health} 60(7), 578--586.
\bibitem[Kraemer, 2000]{kraemer2000pitfalls} Kraemer, Helena Chmura, 2000
\newblock Pitfalls of multisite randomized clinical trials of efficacy and effectiveness
\newblock \emph{Schizophrenia Bulletin} 26(3), 533--541.
\end{thebibliography}
}
## References
\footnotesize{
\begin{thebibliography}{99}
\bibitem[Parab \& Bhalerao, 2010]{parab2010study} Parab, Shraddha and Bhalerao, Supriya, 2010
\newblock Study designs
\newblock \emph{International journal of Ayurveda research} 1(2), 128.
\bibitem[R Core Team 2019]{rprogramming2019} R Core Team, 2019
\newblock R: A language and environment for statistical computing
\newblock \emph{R Foundation for Statistical Computing, Vienna,
Austria}.
\newblock \emph{URL https://www.R-project.org/}.
\bibitem[StataCorp. 2019]{stata2019manual} StataCorp, 2019
\newblock Stata 16 Base Reference Manual
\newblock \emph{College Station, TX: Stata Press}.
\bibitem[Williams, 2012]{williams2012using} Williams, Richard, 2012
\newblock Using the margins command to estimate and interpret adjusted predictions and marginal effects
\newblock \emph{The Stata Journal} 12(2), 308--331.
\bibitem[Wooldridge, 2000]{wooldridge2000econometric} Wooldridge, Jeffrey M, 2000
\newblock Econometric analysis of cross section and panel data
\newblock \emph{MIT press}.
\end{thebibliography}
}