Date:         Fri, 23 Sep 2005 11:43:48 +0100
Reply-To:     Alice Sullivan <>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Alice Sullivan <>
Subject:      Propensity score matching
I am doing an analysis of academic outcomes for children in private and state schools, and am trying to use the propensity score approach.

I want to match on the exact propensity score, dropping unmatched cases from the sample. I did try the binning approach, but since my dataset is large (more than 10,000 cases), it was impossible to balance the bins.

I have calculated the propensity score using 'save predicted values - probabilities' in binary logistic regression, with the 'treatment' (state/private school) as the dependent variable and a set of predictors (social class, etc.), as follows:


/METHOD = ENTER region3s faclas7m educatio famtrad kidno mobooks moint Zabilit11 teacha_1 teachmiss abilmiss
/CONTRAST (region3s)=Indicator
/CONTRAST (faclas7m)=Indicator
/CONTRAST (educatio)=Indicator
/CONTRAST (famtrad)=Indicator
/CONTRAST (kidno)=Indicator
/CONTRAST (mobooks)=Indicator
/CONTRAST (moint)=Indicator
/CONTRAST (abilmiss)=Indicator
/CONTRAST (teachmiss)=Indicator
/CRITERIA = PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .

My problem is that the number of distinct propensity score values this produces is huge: it exceeds 1,000, so I can't even run a crosstabs. I can run a frequency table, but it is too long to print out.

My questions are:

1. Am I doing something wrong?
2. Is it acceptable to group the propensity scores together (e.g. into percentiles or deciles) before dropping unmatched cases, or would this defeat the object?
3. Has anyone written syntax to identify and drop unmatched cases? (Doing it by hand is a daunting task with so many values!)
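
To make questions 2 and 3 concrete, here is a rough sketch of the kind of syntax I have in mind. It is only an illustration under several assumptions: the saved probabilities are in a variable called pscore, the treatment is a 0/1 variable called private with no missing values (both names are placeholders, not my real variable names), and rounding to two decimal places is an arbitrary choice of grouping. It also assumes a release of SPSS whose AGGREGATE command supports MODE=ADDVARIABLES.

```spss
* Sketch only. pscore and private are placeholder variable names.
* Group the propensity score by rounding to two decimal places.
COMPUTE psgroup = RND(pscore * 100) / 100.
* Count private-school cases and all cases within each score group.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=psgroup
  /n_private = SUM(private)
  /n_total = N.
* A group is matched only if it contains both private and state cases.
COMPUTE matched = (n_private > 0 AND n_private < n_total).
* Drop cases in groups that contain only one school type.
SELECT IF (matched = 1).
EXECUTE.
```

This would keep every case in a score group that contains at least one private and one state pupil; it does not pair cases one-to-one, and whether the grouping is defensible is exactly what question 2 asks.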

Many Thanks,


Dr. Alice Sullivan
Centre for Longitudinal Studies
Institute of Education
20 Bedford Way