Date: Fri, 23 Sep 2005 11:43:48 +0100
Reply-To: Alice Sullivan <A.Sullivan@ioe.ac.uk>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Alice Sullivan <A.Sullivan@ioe.ac.uk>
Subject: Propensity score matching
Hi,
I am doing an analysis of academic outcomes for children in private and
state schools, and am trying to use the propensity score approach.
I want to match on the exact propensity score, dropping unmatched cases
from the sample. I did try the binning approach, but since my dataset is
large (more than 10,000 cases), it was impossible to balance the bins.
I have calculated the propensity score using 'save predicted values -
probabilities' in binary logistic regression, with the 'treatment'
(state/private school) as the dependent variable, and a set of
predictors (social class, etc.), as follows:
LOGISTIC REGRESSION private
  /METHOD = ENTER region3s faclas7m educatio famtrad kidno mobooks moint
    Zabilit11 teacha_1 teachmiss abilmiss
  /CONTRAST (region3s)=Indicator
  /CONTRAST (faclas7m)=Indicator
  /CONTRAST (educatio)=Indicator
  /CONTRAST (famtrad)=Indicator
  /CONTRAST (kidno)=Indicator
  /CONTRAST (mobooks)=Indicator
  /CONTRAST (moint)=Indicator
  /CONTRAST (abilmiss)=Indicator
  /CONTRAST (teachmiss)=Indicator
  /SAVE = PRED
  /CRITERIA = PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .
My problem is that the number of distinct propensity score values I get
from this is huge (more than 1000), so I can't even run a crosstabs of
score by treatment. I can run a frequencies table, but it is too long to
print out.
My questions are:
1. Am I doing something wrong?
2. Is it acceptable to group the propensity scores together - e.g.
into percentiles or deciles, before dropping unmatched cases, or would
this defeat the object?
3. Has anyone written syntax to identify/drop unmatched cases?
(Doing it by hand is a daunting task with so many values!).
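To make question 2 concrete, the kind of grouping I have in mind could
be sketched roughly as below. This is only an illustration, not tested
syntax for this dataset: it assumes the saved probabilities are in a
variable called PRE_1 (the name SPSS normally gives the first variable
created by /SAVE = PRED), and psdecile is a hypothetical name for the
new grouping variable.

```spss
* Assumption: PRE_1 holds the probabilities saved by /SAVE = PRED.
* psdecile is a hypothetical name for the new grouping variable.
RANK VARIABLES=PRE_1
  /NTILES(10) INTO psdecile
  /PRINT=NO.
* Cross-tabulate the ten groups against the treatment indicator.
CROSSTABS /TABLES=psdecile BY private.
```

With only ten groups the crosstabs is small enough to read, but cases
within a group would no longer share exactly the same score.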
Many Thanks,
Alice
Dr. Alice Sullivan
Centre for Longitudinal Studies
Institute of Education
20 Bedford Way
LONDON
WC1H 0AL