QUANTIFYING THE HOPE FOR REDUCING BIAS IN THE SOCIAL SCIENCES
1.1. Motivation. The sixth-leading killer of Americans is Alzheimer's disease. This fatal
neurological disease has stumped researchers in wealthy countries for over a century. A
recent discovery, however, offers hope to the one-in-three senior citizens who will develop
the disease. Surprisingly, that hope doesn't come from the white walls of a medical center
at a well-endowed research university in the States; it comes from the remote hills of
Antioquia, the Colombian region surrounding Medellín. After going through historical
priests' records at local churches, Dr. Francisco Lopera of Medellín's University of
Antioquia discovered that members of a particular family had recorded early onset of
Alzheimer's for over 300 years. More research revealed that in this family, a person has
a 50 percent chance of inheriting a mutation in the gene PSEN1 that guarantees early onset
of Alzheimer's. After the National Institutes of Health picked up word of this
once-in-a-lifetime experimental opportunity, resources were quickly gathered to run
clinical trials on an antibody that targets amyloid, the protein associated with the disease.
The unearthing of this family offers a researcher's dream: the perfect conditions to run
a natural experiment. Because neither the subjects nor the researchers wanted to know which
members of the family carried the fatal gene, treatments could be randomly applied in a
double-blind setting. A double-blind study is one in which neither the subjects nor the
researchers know which groups subjects are assigned to. In addition, the fact that members
of the family had mostly stayed in the same location with similar living situations meant
that variables that differ based on location of residence could be controlled for.
Most researchers are not lucky enough to find the perfect conditions to infer that a
treatment caused the prevention or cure of a disease. In the real world, treatments tested
in the social sciences are conditionally dependent on covariates. For example, in medicine,
drugs are only tested on sick patients. In economics, researchers test a welfare benefit on
the disadvantaged. Without randomization, selection bias arises: are differences in the
outcome caused by the selection of who is in the treatment group rather than by the
treatment itself? This inability to infer causality has plagued social scientists for decades.
NIKHITA LUTHRA
1.2. Overview. Luckily, statistics offers some solutions to overcome the inability to
randomly apply treatments. When random assignment is missing, matching samples based on
particular variables attempts to reduce the bias of estimates of treatment effects. Recent
literature has focused on matching methods that attempt to reduce the bias from confounding
variables that systematically differ between control and treatment populations. Rather than
focusing on various matching methods, we will derive two values, Π and Πmax, which are used
to evaluate the success of a matching method, both for the situation when we want to reduce
the bias of a single variable (Section 2) and when we want to reduce the bias of many
covariates (Section 3). To do this, we will first describe how to estimate the treatment
effect on a particular outcome variable. Then, we will construct Π, which captures the
reduction in bias of an estimator due to matching. Finally, we will derive Πmax, the maximum
percent reduction in bias. This value is important because it intuitively acts as an upper
bound on how much hope we can have in matching's ability to reduce bias, ultimately
saving the social sciences from their inability to infer causality.
2. MATCHING ON ONE VARIABLE
When is it the case that adjusting for some variable X (or vector of variables X) gives
unbiased estimates of the treatment effect? This happens whenever the treatment assignment
is strongly ignorable. If r1 is the outcome after receiving the treatment and r0 is the
outcome after not receiving the treatment, treatment assignment is strongly ignorable when
(i) the responses (r1, r0) are conditionally independent of the treatment z given X, and
(ii) at each value of X, there is a positive probability of receiving each treatment. These
conditions are represented mathematically:

Pr(r1, r0, z | X) = Pr(r1, r0 | X) Pr(z | X)   and   0 < Pr(z = 1 | X) < 1 for all possible X.

Thus, the goal of matching is to construct samples that make treatment assignment as
ignorable as possible. Figure 1 gives an overview of the process of matching.
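To make the danger of non-ignorable assignment concrete, here is a small simulation sketch (the numbers and setup are our own illustration, not from the paper): treatment assignment depends on a binary covariate, so the naive difference in means is badly biased, while comparing within levels of X recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: X = 1 for "old" patients, who have higher cholesterol Y
# and are more likely to be treated. The true treatment effect is -20.
n = 200_000
x = rng.binomial(1, 0.5, n)
p_treat = np.where(x == 1, 0.8, 0.2)      # assignment depends on X: not ignorable
z = rng.binomial(1, p_treat)
y = 200 + 30 * x - 20 * z + rng.normal(0, 5, n)

# Naive difference in means is confounded by X.
naive = y[z == 1].mean() - y[z == 0].mean()

# Conditioning on X (assignment is strongly ignorable given X) recovers the effect.
adjusted = np.mean([y[(z == 1) & (x == v)].mean() - y[(z == 0) & (x == v)].mean()
                    for v in (0, 1)])

print(round(naive, 1), round(adjusted, 1))
```

With these made-up parameters, the naive estimate is pulled far away from −20 because treated subjects are disproportionately old, while the within-X comparison is approximately unbiased.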
2.1. Estimating the treatment effect. Imagine the situation where we are interested in
testing the effect of a particular drug on reducing cholesterol levels, represented by Y, a
continuous dependent variable. We begin by assuming that we only have one variable, X, to
match on. For example, let X be the age of a patient. We assign X = 1 for patients over 50
years of age and X = 0 for patients under 50. We want to remove the effect of X on Y.
Suppose we have two populations, P1 and P2, where P1 is the population of patients that
will receive the treatment (because they have high cholesterol), and P2 is the population
of patients that will not receive the treatment (also known as the control population). The
distribution of the matching variable X differs in P1 and P2; patients in the treatment group
Figure 1. Summary of stages in matched sampling 
tend to be older than those in the control group. Then f1(X, Y) ≠ f2(X, Y), where f1, f2
are the joint distributions of X and Y in P1, P2, respectively.
The remainder of this section will follow the approach of Rubin [4]. Let G1 be a random
sample from population P1 of size N. Let G2 be a random sample from population P2 of
size rN, r > 1. All subjects in G1 and G2 have recorded values for the random variable X.
Let us choose a subsample of G2 of size N using a specified matching method. Call this
subsample G2*. Now, we want to estimate the effect of the treatment using G1 and G2*, both
of size N. Note that if r = 1, G2* would simply be a random sample from P2, and matching
would not be able to remove bias due to X. If r = ∞, then infinitely many potential matches
would be available, and all bias due to X could be removed.
Definition 1. Define the response surface for Y in Pi at X = x, denoted Ri(x), as:
Ri(x) = Ei(Y | X = x).
In our example, R1(old) = E[cholesterol level | old] gives the expectation of the
cholesterol level for the treatment group given that the patient is old. R2(old) =
E[cholesterol level | old] gives the expectation of the cholesterol level for the control
group given that the patient is old. R1(young) = E[cholesterol level | young] gives the
expectation of the cholesterol level for the treated group given that the patient is young.
R2(young) = E[cholesterol level | young] gives the expectation of the cholesterol level for
the control group given that the patient is young.
Definition 2. The effect of the treatment at X = x is R1(x) − R2(x).
Following our example, R1(old) − R2(old) reveals the effect of the treatment among the old
patients. It is the expected cholesterol level for old people who received the treatment
minus the expected cholesterol level for old people who didn't receive the treatment.
R1(young) − R2(young) reveals the effect of the treatment among the young patients. It is
the expected cholesterol level for young people who received the treatment minus the
expected cholesterol level for young people who didn't receive the treatment.
There are two possible cases for the effect of the treatment: it can be constant, or it can
vary with x. These two cases, also referred to as parallel and non-parallel response
surfaces, are defined below.
Definition 3. If R1(x) − R2(x) is constant and independent of X, we call R1(x) and R2(x)
parallel response surfaces. In this case, the goal is to estimate this constant difference.
Parallel response surfaces are depicted in Figure 2.
Definition 4. If R1(x) − R2(x) is not constant across all values of X, we call R1(x) and
R2(x) non-parallel response surfaces. In this case, the goal is to estimate the average
difference between R1(x) and R2(x) across all x. Non-parallel response surfaces are
depicted in Figure 3.
Definition 5. In both cases, we are interested in estimating the treatment effect among
the control and treated populations, τ, which is equal to the expected difference in the
response surfaces:
τ = E1[R1(x) − R2(x)].
Allow y1j, x1j to represent the values of Y, X for the jth subject in G1, and y2j, x2j the
values of Y, X for the jth subject in G2*, where j = 1, ..., N. Then

yij = Ri(xij) + eij,   i = 1, 2; j = 1, ..., N,

where Ec(eij) = 0 and Ec denotes the conditional expectation given the xij.
Figure 2. Parallel univariate response surfaces
Figure 3. Nonparallel univariate response surfaces
We can now use this notation to express an estimator for the treatment effect that is based
on data we can actually collect from our subsamples.
Definition 6. The estimator τ̂0 for the treatment effect estimates the average difference
between the non-parallel response surfaces (or the constant difference, if the response
surfaces are parallel):
τ̂0 = (1/N) Σj y1j − (1/N) Σj y2j = ȳ1. − ȳ2.
This estimator takes in the data after running the study and outputs a numerical value that
estimates the effect of the drug on cholesterol levels. That numerical value is known as
the estimate. Note that the estimator is a function, while the estimate is a number.
2.2. Bias of estimator. Now that we have an estimator for the treatment effect, we need
a way to assess whether the estimator with matching is better than the estimator without
matching. Essentially, we want to estimate how much matching can reduce the bias of an
estimator of the treatment effect. Let E* be the expectation over the distribution of X in
the matched samples. Let E2* be the expectation over the distribution of X in the matched
G2* subsamples.
Theorem 1. Using the definition of bias, the expected bias of τ̂0 over matched sampling is
E*Ec(τ̂0 − τ) = E1[R2(x)] − E2*[R2(x)].
Proof. Although the proof was not in Rubin's original 1973 paper, it is quite easy to
derive. Using the above definitions of τ̂0 and τ,
E*Ec(τ̂0 − τ) = E*Ec(ȳ1. − ȳ2.) − E1[R1(x) − R2(x)].
Since expectations add,
E*Ec(τ̂0 − τ) = E*Ec(ȳ1.) − E*Ec(ȳ2.) − E1[R1(x)] + E1[R2(x)].
We know E*Ec(ȳ1.) = E1[R1(x)] and E*Ec(ȳ2.) = E2*[R2(x)], so we can rewrite and simplify:
E*Ec(τ̂0 − τ) = E1[R1(x)] − E2*[R2(x)] − E1[R1(x)] + E1[R2(x)]
= E1[R2(x)] − E2*[R2(x)]. ∎
If the distribution of X in G2* is the same as that in the random sample G1, then
E1[R2(x)] = E2*[R2(x)] and τ̂0 has zero expected bias. If r = 1 (in other words, if G2* is
a random sample from P2), then the expected bias of τ̂ is E1[R2(x)] − E2[R2(x)], where E2
denotes the expectation over the distribution of X in P2. Here τ̂ is the estimator of the
treatment effect for the unmatched samples, and E1[R2(x)] − E2[R2(x)] is the bias of that
estimator.
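Theorem 1 can be checked numerically. The sketch below assumes parallel linear response surfaces and one particular matching rule (keep the N largest control values of X); all parameter values are our own illustrative choices, not the paper's.

```python
import numpy as np

# Monte Carlo check of Theorem 1: parallel linear response surfaces
# R2(x) = beta*x and R1(x) = R2(x) + tau, with X ~ N(1.5, 1) in P1 and
# X ~ N(0, 1) in P2. Matching rule: keep the N largest control values of X.
rng = np.random.default_rng(1)
N, r, tau, beta, reps = 50, 3, 5.0, 2.0, 4000

bias_hat, predicted = [], []
for _ in range(reps):
    x1 = rng.normal(1.5, 1.0, N)             # G1, random sample from P1
    x2 = rng.normal(0.0, 1.0, r * N)         # G2, random sample from P2
    x2s = np.sort(x2)[-N:]                   # matched subsample G2*
    y1 = beta * x1 + tau + rng.normal(0, 1, N)
    y2 = beta * x2s + rng.normal(0, 1, N)
    bias_hat.append(y1.mean() - y2.mean() - tau)       # realized tau0_hat - tau
    predicted.append(beta * (x1.mean() - x2s.mean()))  # E1[R2(x)] - E2*[R2(x)]

print(round(float(np.mean(bias_hat)), 2), round(float(np.mean(predicted)), 2))
```

Averaged over replications, the realized bias of τ̂0 and the quantity E1[R2(x)] − E2*[R2(x)] agree, as the theorem predicts.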
2.3. Measuring reduction in bias due to matching. Now we wish to determine how much less
biased the τ̂0 based on matched sampling is compared to the τ̂ based on random sampling. We
will use the percent reduction in expected bias, Π, to measure this. It is essentially one
minus the ratio of the expected bias for matched sampling to the expected bias for random
sampling:
Π = 100[1 − (E1[R2(x)] − E2*[R2(x)]) / (E1[R2(x)] − E2[R2(x)])].
The numerator, E1[R2(x)] − E2*[R2(x)], represents the expected bias from matched sampling,
and the denominator, E1[R2(x)] − E2[R2(x)], represents the expected bias from random
sampling. The terms that differ are E2[R2(x)] and E2*[R2(x)]. Multiplying by a common
denominator and simplifying yields the expression:
Π = 100 (E2*[R2(x)] − E2[R2(x)]) / (E1[R2(x)] − E2[R2(x)]).
We can see from this equation that the percent reduction in bias depends only on the
distributions of X in P1, P2, and G2*, and on the response surface in P2. We assume that
the response surface in P2 is linear, or can be estimated by a linear regression:
R2(x) = μ2 + β2(x − η2), where μ2 is the mean of Y in P2, ηi is the mean of X in Pi, and β2
is the regression coefficient of Y on X in P2. We can use this to rewrite
E1[R2(x)] − E2[R2(x)] = β2(η1 − η2) and E2*[R2(x)] − E2[R2(x)] = β2(η2* − η2), where
η2* = E2*(X) in G2*. Substituting in these values now gives the following theorem:

Theorem 2. If G1 is a random sample and the response surface in P2 is linear, or can be
estimated by a linear approximation, the percent reduction in bias due to matched sampling is
Π = 100 (η2* − η2) / (η1 − η2).
This result allows us to measure the amount by which a matching method can reduce bias.
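As an illustration of Theorem 2, the sketch below estimates the percent reduction in bias by Monte Carlo for one simple matching method, greedy nearest-neighbor matching without replacement. The matching rule and all parameter values are our own illustrative assumptions.

```python
import numpy as np

# Monte Carlo estimate of Pi = 100*(eta2* - eta2)/(eta1 - eta2) for greedy
# nearest-neighbor matching on X without replacement (illustrative rule).
rng = np.random.default_rng(2)
eta1, eta2, N, r, reps = 1.0, 0.0, 50, 3, 300

matched_means = []
for _ in range(reps):
    x1 = rng.normal(eta1, 1.0, N)            # treatment sample G1
    x2 = rng.normal(eta2, 1.0, r * N)        # control sample G2
    g2_star = []
    for v in np.sort(x1)[::-1]:              # match the hardest (largest) cases first
        j = int(np.argmin(np.abs(x2 - v)))
        g2_star.append(x2[j])
        x2 = np.delete(x2, j)                # matching without replacement
    matched_means.append(np.mean(g2_star))   # estimates eta2*

pi = 100 * (np.mean(matched_means) - eta2) / (eta1 - eta2)
print(round(float(pi), 1))
```

With r = 3 candidate controls per treated subject, this simple rule already removes most of the bias in the mean of X, though not all of it.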
2.4. Finding the maximum possible bias reduction. Various matching methods will yield
different Π's. In addition to being able to compare the Π's of different matching methods
to each other, we also want an idea of how good a matching method is on its own. In other
words, it can be costly to apply many different matching methods to see which one has the
greatest percent reduction in bias. In real life, a researcher might just pick a single
matching method but, without trying other methods, still want to see how successful or
unsuccessful that method was.
This is why it is crucial to be able to calculate the maximum possible percent reduction
in bias due to matched sampling. If we can find an upper bound on how much we can
decrease the bias, then it is much easier to compare a single matching method to that
upper bound rather than repeating the study many times with different matching methods.
To get an expression for the maximum percent reduction in bias, we first state a lemma
that is not proved here but can be found in Rubin's work [4].
Lemma 3. We assume that in population Pi, X has mean ηi and variance σi², with
X ~ fi in Pi, i = 1, 2. The initial bias in X is
B = (η1 − η2) / √((σ1² + σ2²)/2).
This makes sense intuitively; in our example, the bias in age is the difference in the mean
age between the treatment and control populations over the spread of age in both
populations. From this we can see that if σ1² = σ2², then the bias is just the number of
standard deviations between the means of X in the two populations. We are now ready to
present the maximum percent reduction in bias and its proof for the case when we are
matching on one X variable.
Theorem 4. The maximum percent reduction in bias when matching on one variable is
Πmax = 100 σ2 θ2(r, N) / (η1 − η2),
where θ2(r, N) is the expected value of the average of the N largest observations from a
sample of size rN from f2, standardized to zero mean and unit variance. This sample could
be the G2 sample we selected before constructing G2*.
Proof. We have commented on and added to the following proof, which is adapted from Rubin's
version [4]. Earlier, we assumed that η1 > η2. This happens to be consistent with our
example: the average age in the treated population is higher than the average age in the
control population, since cholesterol is positively correlated with age. Then Π is largest
whenever the average age of the control subsample, η2* = E(x̄2.), is greatest, which happens
when we pick the oldest N subjects from G2 as the matched subsample G2*. Intuitively, this
means that matching reduces the bias from age differences between populations the most when
the control subsample has patients who are as close as possible in age to the sample of
treated patients.
The expected value of the N largest values from the G2 sample of size rN is η2 + σ2 θ2(r, N).
Since the maximum reduction in bias depends on how large η2* is, and η2*'s maximum is
η2 + σ2 θ2(r, N), the maximum percent reduction in bias is the ratio of this value's excess
over η2 to the true difference in the X variable between the populations. The maximum
value of Π is
Πmax = 100 σ2 θ2(r, N) / (η1 − η2). ∎
Using the lemma from above, we can algebraically manipulate this result to get Πmax in
terms of B:
Πmax = 100 θ2(r, N) / (B √((1 + σ1²/σ2²)/2)).
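The order-statistic quantity θ2(r, N) is easy to estimate by Monte Carlo. The sketch below does so for a standard normal f2 and then evaluates the maximum percent reduction in bias; all parameter values are illustrative assumptions.

```python
import numpy as np

# Monte Carlo estimate of theta2(r, N), the expected average of the N largest of
# rN draws from the standardized control distribution (standard normal here),
# followed by the maximum percent reduction in bias in terms of B.
rng = np.random.default_rng(3)
N, r, reps = 50, 2, 4000
eta1, eta2, sigma1, sigma2 = 0.5, 0.0, 1.0, 1.0

theta2 = float(np.mean([np.sort(rng.standard_normal(r * N))[-N:].mean()
                        for _ in range(reps)]))

B = (eta1 - eta2) / np.sqrt((sigma1**2 + sigma2**2) / 2)
pi_max = 100 * theta2 / (B * np.sqrt((1 + sigma1**2 / sigma2**2) / 2))
# equivalent to 100 * sigma2 * theta2 / (eta1 - eta2)
print(round(theta2, 2), round(float(pi_max), 1))
```

With these particular parameters Πmax exceeds 100, so min(100, Πmax) = 100: in principle, some matching method could remove all of the expected bias here.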
This result is important because, for a particular matching method, we can now compare Π
to min(100, Πmax). That tells us how well a matching method obtains a G2* that has an
expected average of X close to that average in G1. If Πmax is small, there is no matching
method that does this. If Πmax is large, most matching methods should perform well. The
special case where we can find parameters for Πmax such that Πmax ≥ 100 implies the
existence of a matching method that obtains a 100 percent reduction in expected bias.
It is worth noting that Πmax is positively related to r and N, and negatively related to B
and σ1²/σ2², holding other variables constant. If a researcher wants to increase Πmax, then
he or she can adjust r and N.
Now that we have derived this important metric by which to measure the effectiveness of a
matching method on a single X, it is natural to apply the same process to multiple
covariates. Following our example, there might indeed be bias in the estimator for the
treatment effect not just due to differences in age between the control and treatment
populations, but also due to systematic differences in other variables, including weight,
genetic history, lifestyle choices, etc.
3. MATCHING ON MULTIPLE COVARIATES
Now, the objective is to estimate the effect of a binary treatment variable on many
dependent variables. The population can still be split into those who receive the treatment
and those who do not. We will refer to P1 as the population of those given the treatment,
and P2 as the population of those not given the treatment. The challenge is the same as it
was with one X variable: the treatment assignment is not random. We will solve this in the
same way as before: by finding samples from P1 and P2 in which the distributions of X are
almost the same. X is a vector that includes p matching variables (before, p = 1). For
example, if we are estimating the effect of a drug on reducing cholesterol levels, X might
be a vector consisting of age, weight, and average number of hours spent exercising in a
week. We will assume for simplicity's sake that all elements of X are not categorical. (So
now, age is no longer a 1 for old and a 0 for young, but is a number.)
The process for constructing subsamples is similar to before. The approach of this section
will follow Rubin's 1976 papers [5, 6]. First, choose random samples G1 and G2 of sizes N1
and N2 from P1 and P2 respectively, where N1 ≤ N2. Then record the p matching variables
for all individuals in G1 and G2. Using some matching method, find matched subsamples
G1* and G2* of sizes N1* and N2*, where G1* is chosen from G1 and G2* is chosen from G2.
One difference that now arises in constructing the matched subsamples is that we want to
make sure that, by matching samples to minimize the differences in, say, age between the
treated and control groups, we don't increase the differences in some other variable, such
as hours spent exercising. Whatever matching method we use to construct the subsamples
must thus have a very special property: it should be equal percent bias reducing (EPBR).
The meaning of EPBR and the conditions under which a matching method is EPBR are presented
in the theorem below, summarizing Rubin's discussion [5]:
Theorem 5. If X is the vector of covariates, then let u1 be the finite mean vector for P1,
and u2 the finite mean vector for P2. For example, u1 consists of the mean age, weight,
and average weekly exercise for the treatment population, and u2 consists of the mean age,
weight, and average weekly exercise for the control population. The true values for these
means are unknown.
Let ui* be the expected mean vector of X in the subsample Gi* for i = 1, 2. These vectors
can be obtained by matching: given (i) fixed sample sizes N1, N2, (ii) fixed distributions
of X in both P1 and P2, and (iii) a fixed matching method for obtaining subsamples,
repeating the process of randomly sampling and matching will result in the averages of the
mean vectors of the matched subsamples converging to u1* and u2*.
We call a matching method EPBR for X if (u1* − u2*) = γ(u1 − u2), where γ is a scalar
constant. The interpretation of this is that the percent reduction in the biases of each
of the p matching variables is the same. If a matching method is not EPBR, then certain
linear functions of X have increased bias [5].
Why do we care about selecting a matching method that is EPBR? Looking at the equation
(u1* − u2*) = γ(u1 − u2), the left-hand side represents the mean imbalance of the
covariates in the subsamples, and the right-hand side represents the mean imbalance of the
covariates in the populations. Directly stated, the EPBR property implies that improving
balance in the difference in means on one variable also improves it on all others (and
their linear combinations) by a proportional amount [1]. These matching rules are the
easiest to evaluate when the dependent variables can be any linear combinations of the
covariates, since there is then only one percent reduction in bias of interest. Rosenbaum
and Rubin overviewed some main EPBR methods, and their technicalities can be found in
their paper [2].
3.1. Percent reduction in bias with multiple covariates. Now that we have defined what it
means for a matching method to be EPBR, we are naturally interested in evaluating how much
matching has reduced the bias due to covariates when estimating a treatment effect.
This section follows the approach of Rubin [6]. We will now define the percent reduction
in bias, which is how we evaluate different EPBR matching methods:

Definition 7. The percent reduction in bias for matching on multiple covariates is
Π = 100[1 − (u1* − u2*)′α / (u1 − u2)′α]
for any vector α.
Π will differ based on the matching method, the distributions of X in the control and
treatment populations, and the sizes of the random samples and of the subsamples. This
naturally leads us to the final result of this paper: the maximum percent reduction in bias
when matching on multiple covariates using an EPBR method. Similar to the case with only
one X variable, the best-case scenario for a given EPBR matching method is min(100, Πmax).
The following theorem defines Πmax. The proof has been omitted because, while the algebra
is untidy, the intuition is the same as in the case of matching on one variable presented
in Section 2. Essentially, the maximum percent reduction in bias is attained when (i) the
members of the randomly selected treatment sample G1 with the smallest expected values of
the covariates are chosen for the treatment subsample G1*, and (ii) the members of the
randomly selected control sample G2 with the largest expected values of the covariates are
chosen for the control subsample G2*. This minimizes the differences between the two
subsamples. Similar to the situation when matching on one X, the proof also ends with a
substitution of B, the bias formula. If the reader wishes to see a formal proof, it can be
found in [6].
Theorem 6 (Maximum percent reduction in bias). Given (a) fixed distributions of X in P1
and P2 with mean vectors u1 and u2 and covariance matrices Σ1 and Σ2, (b) fixed sample
sizes of G1 and G2, N1 = r1N1* and N2 = r2N2* with r1 ≥ 1, r2 > 1, and (c) fixed sizes
N1* and N2* of G1* and G2*, the maximum percent reduction in bias for any matching method
that is EPBR for X is

Πmax = 100 [θ2+(r2, N2*) − θ1−(r1, N1*)] / (B √((1 + σ1²/σ2²)/2)),

where
σi² = δ′Σiδ, the variance in Pi of the best linear discriminant with respect to the P2
inner product, δ = Σ2⁻¹(u1 − u2),
B = (ν1 − ν2)/√((σ1² + σ2²)/2), the number of standard deviations between the means of
X′δ in P1 and P2, where νi = ui′δ,
θ2+(r2, N2*) = the expectation of the sample average of the N2* largest of the r2N2*
randomly chosen observations from F2, where F2 is the distribution of X′δ in P2
normed to have zero mean and unit variance, i.e., the distribution of (X − u2)′δ/σ2
in P2, and
θ1−(r1, N1*) = the expectation of the sample average of the N1* smallest of the r1N1*
randomly chosen observations from F1, F1 being the distribution of (X − u1)′δ/σ1 in P1.
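To make the ingredients of the theorem concrete, the following sketch computes δ, the σi², B, and a Monte Carlo estimate of the θ terms for assumed normal populations with two covariates; all numbers are made up for the illustration.

```python
import numpy as np

# Illustrative computation of the quantities in Theorem 6 for assumed normal
# populations (two covariates); the mean vectors and covariances are invented.
rng = np.random.default_rng(4)
u1, u2 = np.array([1.5, 1.0]), np.array([0.0, 0.0])
S1 = np.array([[1.0, 0.3], [0.3, 1.0]])     # covariance of X in P1
S2 = np.array([[1.0, 0.2], [0.2, 1.0]])     # covariance of X in P2

delta = np.linalg.solve(S2, u1 - u2)        # best linear discriminant
s1 = np.sqrt(delta @ S1 @ delta)            # sd of X'delta in P1
s2 = np.sqrt(delta @ S2 @ delta)            # sd of X'delta in P2
nu1, nu2 = u1 @ delta, u2 @ delta
B = (nu1 - nu2) / np.sqrt((s1**2 + s2**2) / 2)

# Monte Carlo theta terms for normal X'delta; taking r1 = 1 makes theta1- zero,
# since G1* is then all of G1.
N2, r2, reps = 50, 3, 2000
theta2_plus = float(np.mean([np.sort(rng.standard_normal(r2 * N2))[-N2:].mean()
                             for _ in range(reps)]))
theta1_minus = 0.0

pi_max = 100 * (theta2_plus - theta1_minus) / (B * np.sqrt((1 + s1**2 / s2**2) / 2))
print(round(float(B), 2), round(float(pi_max), 1))
```

With these invented populations the bias along the discriminant is large (B ≈ 1.6), and even picking the N2* largest of r2N2* controls cannot remove all of it: Πmax comes out well below 100.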
Πmax for a given EPBR matching method gives the same kind of information as described in
Section 2.4. First, we can observe that Πmax and B are inversely related. B represents the
systematic differences between the populations due to the covariates. As this bias
increases, it becomes harder to make the subsamples similar and reduce the effects of
confounding variables. It is worth noting that B and σ1²/σ2² rely on parameters unknown to
the researcher, but they are easily estimated from the data.
Figure 4. Approximate ratio of sample sizes r2, needed to obtain a maximum
percent reduction in bias close to 100 percent 
Secondly, we can see that for fixed N, as r increases, θ2(r, N) increases, which increases
Πmax. Simultaneously, as r increases, θ1(r, N) decreases, which also increases Πmax. This
is useful for the researcher because he or she can increase the pool from which the
subsamples are selected. As the pool increases in size, the researcher is more likely to
come across values that make the samples better matched. Figure 4 shows what the ratio of
G2 to G1 would have to be in order to attain a maximum percent reduction in bias close to
100 percent for different values of the total bias B and σ1²/σ2². As we can see, for the
largest values of B and σ1²/σ2², the pool from which the control subsample is chosen has to
be 35 times the size of the pool from which the treatment subsample is chosen, while for
the smallest values of B and σ1²/σ2², it would only have to be 1.1 times the size.
3.2. Choosing a matching method. Whether we have a single X or multiple X's to match on,
knowing the maximum percent reduction in bias allows us to evaluate how successful a
matching method is at achieving the goal: reducing bias from systematic differences between
the control and treatment populations. It gives the researcher an anchor for understanding
how successful they were at limiting the confounding effects of covariates on estimating a
treatment effect. To see a concrete example, the results of a Monte Carlo simulation of the
Mahalanobis-metric matching method's percent reduction in bias of the covariates X are
shown in Figure 5. Consistent with what we would have expected from the theory derived in
this paper, it is clear from the table that the percent reduction in bias is highest for
low values of the bias B and σ1²/σ2² and high values of r.
Figure 5. Percent reduction in bias of X, Mahalanobis-metric matching,
N = 50, X normal, Monte Carlo values 
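A minimal version of such a Monte Carlo can be sketched as follows (our own illustrative setup, not the exact design behind Figure 5): each treated subject is matched to the nearest control in the Mahalanobis metric, without replacement, and we report the percent reduction in bias of each covariate mean.

```python
import numpy as np

# Sketch of Mahalanobis-metric matching under assumed normal populations.
rng = np.random.default_rng(5)
N, r, reps = 50, 3, 200
u1, u2 = np.array([0.5, 0.5]), np.array([0.0, 0.0])

before, after = [], []
for _ in range(reps):
    x1 = rng.multivariate_normal(u1, np.eye(2), N)        # treated sample G1
    x2 = rng.multivariate_normal(u2, np.eye(2), r * N)    # control sample G2
    S_inv = np.linalg.inv(np.cov(x2.T))                   # metric from the controls
    available = np.ones(len(x2), dtype=bool)
    chosen = []
    for t in x1:
        idx = np.flatnonzero(available)
        diff = x2[idx] - t
        d = np.einsum('ij,jk,ik->i', diff, S_inv, diff)   # squared Mahalanobis distances
        j = idx[np.argmin(d)]
        chosen.append(j)
        available[j] = False                              # matching without replacement
    g2_star = x2[chosen]
    before.append(x1.mean(axis=0) - x2.mean(axis=0))
    after.append(x1.mean(axis=0) - g2_star.mean(axis=0))

pct = 100 * (1 - np.abs(np.mean(after, axis=0)) / np.abs(np.mean(before, axis=0)))
print(np.round(pct, 1))   # percent reduction in bias, per covariate
```

In this symmetric setup the reduction comes out roughly equal across the two covariates, which is the behavior the EPBR property describes.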
Monte Carlo results also help compare different matching methods to each other. An example
of the results of a simulation that compared two matching methods, discriminant matching
and metric matching, and the percent reduction in bias for three different estimators is
shown, with varying ratios of the sample sizes, in Figure 6. From this table, we can see an
example of a situation in which metric matching seems clearly superior to discriminant
matching because it does a better job of reducing bias.
It is worth noting that while the percent reduction in bias is certainly a prime
consideration when selecting a matching method, it is not the only one. In practice,
different matching methods have different trade-offs. A common matching method is mean
matching, where each subsample is constructed so that the means of the subsamples are as
similar as possible. While this will have a high percent reduction in bias, it can be hard
to carry out in practice. Researchers usually have one shot at choosing the members of
their subsamples, and the means of
Figure 6. Percentage reduction in expected squared bias, averaging over distributional conditions
the subsamples are only known after individuals have been chosen. In real life, researchers
find it easier to choose pairs of subjects with similar covariates.
That leads to pair-wise matching, another common method. Here, members of the treatment
group are ordered from low to high on some covariate, and so are the members of the control
group. A pair is constructed by matching a member of the treatment group with a member of
the control group, taking the individuals with the lowest covariate values in each group.
Then another pair is constructed, with each member having the second-lowest value in their
respective group, and the process is repeated. The downside to this method is that if a
researcher orders from low to high, for example, then the members with the highest values
of the covariates may be left out of the subsamples.
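The rank-order pairing just described can be sketched in a few lines; the function name and parameters are our own illustration.

```python
import numpy as np

def pairwise_match(treated, control):
    """Pair subjects by rank on one covariate, lowest with lowest, and so on."""
    t_sorted, c_sorted = np.sort(treated), np.sort(control)
    n = min(len(t_sorted), len(c_sorted))
    # With unequal group sizes, the subjects with the highest covariate values
    # in the larger group are left out -- the downside noted above.
    return list(zip(t_sorted[:n], c_sorted[:n]))

rng = np.random.default_rng(6)
pairs = pairwise_match(rng.normal(1.0, 1.0, 50), rng.normal(0.0, 1.0, 60))
gaps = [t - c for t, c in pairs]
print(len(pairs), round(float(np.mean(gaps)), 2))
```

Note that rank pairing guarantees each treated subject a partner, but unlike metric matching it does not directly minimize the within-pair covariate gap.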
As we can see from this high-level discussion, there are many practical concerns for
researchers when selecting a matching method. There is an abundance of recent literature,
to which Rubin has contributed, concerned with various classes of matching methods.
Propensity score analysis in particular has recently received a great deal of attention.
All of this being said, the percent reduction in bias and the maximum percent reduction in
bias remain the most prominent concerns, since at the end of the day, the goal of any
matching method is to reduce bias.
In this paper, we have tackled how researchers in the social sciences produce estimators
for the effect of treatments on some outcome variable between two populations, the treated
and the control, which are assumed to be systematically different. These systematic
differences, whether in one X variable or many, bias the estimates of the treatment effect.
As a result, it is impossible to infer whether the observed differences in the outcome
variable are due to the treatment applied or to these systematic differences. Essentially,
inferring causality becomes very challenging.
This paper summarized the process of matching, a tactic used by researchers to construct
subsamples for the control and treatment groups that are as similar as possible with
respect to the covariates. We walked through how to infer the treatment effect after
matching, and also produced a metric that evaluates the success of matching: the percent
reduction in bias, Π. Finally, for any particular matching method, this paper derived the
maximum possible percent reduction in bias, Πmax.
Πmax is important because it represents the scope social scientists have to use matching
to reduce bias. In a sense, it gives a level of hope for causal inference.
Since many social sciences have struggled with identifying causality in the real world due
to research limitations, Πmax has a deep meaning attached to it: it quantifies the hope
that matching offers a chance, in a way, to save the social sciences. Perhaps now, patients
with other fatal diseases can feel as hopeful as those with Alzheimer's that a treatment
can be found within our lifetime.
1. Iacus, Stefano M.; King, Gary; Porro, Giuseppe. (2011). Multivariate Matching Methods That Are Monotonic Imbalance Bounding. Journal of the American Statistical Association, 106(493).
2. Rosenbaum, Paul R.; Rubin, Donald B. (1985). Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score. The American Statistician, 39(1).
3. Rosenbaum, Paul R.; Rubin, Donald B. (1985). The Bias Due to Incomplete Matching. Biometrics, 41.
4. Rubin, Donald B. (1973). Matching to Remove Bias in Observational Studies. Biometrics, 29, 159-183.
5. Rubin, Donald B. (1976). Multivariate Matching Methods That Are Equal Percent Bias Reducing, I: Some Examples. Biometrics, 32, 109-120.
6. Rubin, Donald B. (1976). Multivariate Matching Methods That Are Equal Percent Bias Reducing, II:
Maximums on Bias Reduction for Fixed Sample Sizes. Biometrics, 32, 121-132.
7. Rubin, Donald B. (1979). Using Multivariate Matched Sampling and Regression Adjustment to Control Bias in Observational Studies. The Journal of the American Statistical Association, 74(366), 318-328.
8. Rubin, Donald B. (1980). Bias Reduction Using Mahalanobis-Metric Matching. Biometrics, 36, 293-298.