fullflu-english

English version blog of fullflu

Bias of the Hajek estimator (potential errors in Technical Point 12.1 of Causal inference: what if)

Summary

This article claims that the Hajek (modified Horvitz-Thompson) estimator is not an unbiased estimator of the expectation of the potential outcome.

Specifically, I point out that Technical Point 12.1 of Causal inference: what if may contain an error regarding this bias.

A PDF version of the book is published on this page

The link of Goodreads is here:


I also propose mentioning the name of the Hajek estimator when introducing the modified Horvitz-Thompson estimator.

I highlight my propositions in code blocks as follows:

<Proposition 1>
I propose ...

I would appreciate it if experts in causal inference could discuss this issue with me and reflect the discussion in the book, Causal inference: what if.

Notation

  •  Y \in \mathbb{R}: Random variable of outcome
  •  A \in \mathcal{A}: Random variable of treatment. This article focuses on discrete treatments.
  •  L \in \mathbb{R}^d: Random variable of covariates
  •  Y^{a}: Potential outcome under treatment level  a
  •  I(\cdot, \cdot): Indicator function, equal to 1 if its two arguments are equal and 0 otherwise
  •  f(A \mid L): Conditional probability of treatment given  L.

Introducing Technical Point 12.1

This section introduces the relevant statements in Technical Point 12.1.

In Technical Point 12.1, the IP weighted (IPW) mean for treatment level  a is defined as follows:

\begin{eqnarray} \mathbb{E} \Bigl[ \frac{I(A, a)Y}{f(A \mid L)} \Bigr] \tag{1} \end{eqnarray}

In Technical Points 2.3 and 3.1, this expectation is shown to be equal to the counterfactual mean (the expectation of the potential outcome),  \mathbb{E}[Y^{a}].

The estimator (empirical approximation) of this weighted mean, obtained by replacing the expectation with the sample average  \hat{\mathbb{E}}, is known as the Horvitz-Thompson estimator,

\begin{eqnarray} \hat{\mathbb{E}} \Bigl[ \frac{I(A, a)Y}{f(A \mid L)} \Bigr] \tag{2} \end{eqnarray}

Next, the modified Horvitz-Thompson estimator is defined as follows:

\begin{eqnarray} \frac{\hat{\mathbb{E}} \Bigl[ \frac{I(A, a)Y}{f(A \mid L)} \Bigr]} {\hat{\mathbb{E}} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr]} \tag{3} \end{eqnarray}
<Proposition 1>
The estimator in equation (3) is known as the Hajek estimator.
Therefore, I propose mentioning the name of the Hajek estimator when introducing the modified Horvitz-Thompson estimator.
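
To make equations (2) and (3) concrete, here is a minimal sketch in plain Python. The function name `ip_weighted_estimates`, the record layout (A_i, Y_i, f(A_i | L_i)), and the toy data are assumptions introduced for illustration only; the propensities are treated as known rather than estimated.

```python
def ip_weighted_estimates(records, a):
    """Return (Horvitz-Thompson, Hajek) estimates for treatment level a.

    records: iterable of (A_i, Y_i, f_i), where f_i = f(A_i | L_i) is the
    probability of the treatment actually received (assumed known here).
    """
    records = list(records)
    n = len(records)
    # Sample averages of I(A, a) Y / f(A | L) and I(A, a) / f(A | L).
    num = sum(y / f for (a_i, y, f) in records if a_i == a) / n
    den = sum(1.0 / f for (a_i, y, f) in records if a_i == a) / n
    if den == 0.0:
        raise ValueError("no unit received treatment level a; Hajek is undefined")
    # Equation (2) is num; equation (3) is num / den.
    return num, num / den


# Hypothetical toy sample: (treatment, outcome, probability of received treatment).
sample = [(1, 3.0, 0.8), (0, 1.0, 0.8), (1, 1.0, 0.2), (1, 3.0, 0.8)]
ht, hajek = ip_weighted_estimates(sample, a=1)
print(ht, hajek)
```

On this toy sample the Horvitz-Thompson estimate is about 3.125 and the Hajek estimate is about 1.667. The Hajek estimate is a weighted average of the outcomes observed under treatment level a, so it always stays within their range, whereas the Horvitz-Thompson estimate need not.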

In Technical Point 12.1, the Hajek estimator is introduced as an unbiased estimator of the following ratio of expectations (this statement might be incorrect):

\begin{eqnarray} \frac{\mathbb{E} \Bigl[ \frac{I(A, a)Y}{f(A \mid L)} \Bigr]} {\mathbb{E} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr]} \tag{4} \end{eqnarray}

Under positivity, this ratio of expectations equals the IP weighted mean of equation (1) because its denominator is 1:

\begin{eqnarray} \mathbb{E} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr] = 1 \tag{5} \end{eqnarray}
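
For completeness, equation (5) can be checked with the law of iterated expectations. The following is a sketch, assuming that  f(a \mid L) denotes  \Pr[A = a \mid L] and that positivity means  f(a \mid L) > 0 for every value of  L with positive probability:

\begin{eqnarray} \mathbb{E} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr] = \mathbb{E} \Bigl[ \frac{\mathbb{E}[ I(A, a) \mid L ]}{f(a \mid L)} \Bigr] = \mathbb{E} \Bigl[ \frac{f(a \mid L)}{f(a \mid L)} \Bigr] = 1 \end{eqnarray}

The first equality uses the fact that  f(A \mid L) = f(a \mid L) whenever  I(A, a) = 1.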

Bias of Hajek estimator

In this section, I point out that the Hajek estimator is not an unbiased estimator of the expectation of the potential outcome.

Ratio estimator

A ratio estimator is defined as the ratio of the sample means of two random variables (https://en.wikipedia.org/wiki/Ratio_estimator).

In general, ratio estimators are known to be biased.

The bias is shown as equation (8.5) of the link: https://jkim.public.iastate.edu/teaching/book8.pdf.
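
A short derivation shows where the bias comes from. This is a sketch in notation introduced here for illustration, and its form may differ from equation (8.5) of the linked notes:  \bar{y} and  \bar{x} denote the two sample means and  \hat{R} := \bar{y} / \bar{x} the ratio estimator, all assumed to have finite moments.

\begin{eqnarray} \mathrm{Cov}(\hat{R}, \bar{x}) &=& \mathbb{E}[\hat{R} \bar{x}] - \mathbb{E}[\hat{R}] \mathbb{E}[\bar{x}] = \mathbb{E}[\bar{y}] - \mathbb{E}[\hat{R}] \mathbb{E}[\bar{x}] \\ \mathbb{E}[\hat{R}] - \frac{\mathbb{E}[\bar{y}]}{\mathbb{E}[\bar{x}]} &=& -\frac{\mathrm{Cov}(\hat{R}, \bar{x})}{\mathbb{E}[\bar{x}]} \end{eqnarray}

The second line follows by dividing the first by  \mathbb{E}[\bar{x}] and rearranging. The ratio estimator is therefore unbiased for  \mathbb{E}[\bar{y}] / \mathbb{E}[\bar{x}] only when  \hat{R} and  \bar{x} are uncorrelated.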

Hajek estimator and Ratio estimator

As shown in equation (8.14) of the lecture notes linked above, the Hajek estimator is a special case of the ratio estimator, where  x_i = 1 in the ratio estimator.

<Potential Error 1>
Therefore, the Hajek estimator should be a biased estimator, and the bias would be as follows (the last equality uses equation (5) and the unbiasedness of the sample average in the denominator):
\begin{eqnarray}
C &:=& \mathrm{Cov} \biggl( \frac{\hat{\mathbb{E}} \Bigl[ \frac{I(A, a)Y}{f(A \mid L)} \Bigr]} {\hat{\mathbb{E}} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr]}, \hat{\mathbb{E}} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr] \biggr) \\
\mathrm{Bias}(\mathrm{Hajek}) &:=& \mathbb{E}\biggl[ \frac{\hat{\mathbb{E}} \Bigl[ \frac{I(A, a)Y}{f(A \mid L)} \Bigr]} {\hat{\mathbb{E}} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr]} \biggr] - \frac{\mathbb{E} \Bigl[ \frac{I(A, a)Y}{f(A \mid L)} \Bigr]} {\mathbb{E} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr]} \\
&=& -\frac{C}{ \mathbb{E}\biggl[ \hat{\mathbb{E}} \Bigl[ \frac{I(A, a)}{f(A \mid L)} \Bigr] \biggr] } \\
&=& -C \tag{6}
\end{eqnarray}
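
As a numerical sanity check of equation (6), the following is a minimal Monte Carlo sketch in plain Python. Everything about the design is an assumption made for illustration, not something taken from the book: a binary covariate L, known propensities f(1 | L), the outcome Y^{a=1} = 1 + 2L (so the counterfactual mean is 2), and a small sample size n = 10. Samples in which no unit receives a = 1 are discarded, because the Hajek estimator is undefined there.

```python
import random


def simulate_hajek_bias(n=10, reps=20000, seed=0):
    """Monte Carlo comparison of the Hajek bias with -Cov / E[denominator]."""
    rng = random.Random(seed)
    # Hypothetical design: L ~ Bernoulli(0.5), f(A=1 | L) = 0.2 or 0.8,
    # and Y^{a=1} = 1 + 2L, so that E[Y^{a=1}] = 2.
    propensity = {0: 0.2, 1: 0.8}
    nums, dens, ratios = [], [], []
    while len(ratios) < reps:
        num = den = 0.0
        for _ in range(n):
            l = 1 if rng.random() < 0.5 else 0
            a = 1 if rng.random() < propensity[l] else 0
            if a == 1:  # only units with A = a contribute to either sum
                y = 1.0 + 2.0 * l
                num += y / propensity[l]
                den += 1.0 / propensity[l]
        if den == 0.0:
            continue  # no treated unit: Hajek undefined, discard this sample
        nums.append(num / n)      # Horvitz-Thompson estimate, equation (2)
        dens.append(den / n)      # denominator of equation (3)
        ratios.append(num / den)  # Hajek estimate, equation (3)

    def mean(xs):
        return sum(xs) / len(xs)

    m_num, m_den, m_ratio = mean(nums), mean(dens), mean(ratios)
    cov = mean([r * d for r, d in zip(ratios, dens)]) - m_ratio * m_den
    return {
        "ht_mean": m_num,
        "hajek_mean": m_ratio,
        "bias": m_ratio - m_num / m_den,      # left-hand side of equation (6)
        "minus_cov_over_den": -cov / m_den,   # right-hand side before using (5)
    }


if __name__ == "__main__":
    for key, value in simulate_hajek_bias().items():
        print(key, round(value, 4))
```

With the default arguments the simulated Hajek mean should come out above the Horvitz-Thompson mean, and `bias` agrees with `minus_cov_over_den` up to floating-point error, because the covariance identity behind equation (6) also holds exactly for empirical moments. Under this design the bias is expected to shrink roughly like 1/n as the sample size grows.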