Date: Thu, 20 Oct 2005 12:16:37 -0700
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: 1:4 matching for Proc SurveySelect
Content-Type: text/plain; format=flowed
>Below reference is the following paper: Simplified Matched Case-Control
>Sampling PROC SURVEYSELECT
>Now, I want to create a matched dataset just as they do in this paper
>based on a number of matching criteria but I want the match to be 1
>case to 4 controls rather than the 1 to 1 match that they did in this
>paper. I assume this is fairly easy to do but I am not sure where to
First of all, I'm not sure I'm happy with the way the paper authors built
the index for doing PROC SURVEYSELECT. They took all the variables they
wanted to stratify on (yes, they're just doing stratified sampling) and
concatenated them into one index. This is *NOT* needed with PROC
SURVEYSELECT, and could cause problems if one or more of the variables
is numeric, or if one or more variables needed to be binned first, or if
the index was long enough that it got truncated so key features were
lost. I would just list the (binned) variables themselves in the STRATA
statement.
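For instance, stratifying on the variables directly might look like the sketch below. The data set POOL and the variables AGE, SEX, and AGEGRP are made-up names for illustration, not anything from the paper, and the bin cutoffs are arbitrary. It draws one unit per stratum just to keep the example small.

```sas
/* Hypothetical names throughout: POOL, AGE, SEX, AGEGRP. */
data pool_binned;
   set pool;
   /* Bin the numeric variable first; missing ages stay missing. */
   if missing(age) then agegrp = .;
   else if age < 40 then agegrp = 1;
   else if age < 60 then agegrp = 2;
   else agegrp = 3;
run;

/* PROC SURVEYSELECT expects the input sorted by the STRATA variables. */
proc sort data=pool_binned;
   by agegrp sex;
run;

/* List the variables in STRATA; no concatenated index is needed. */
proc surveyselect data=pool_binned method=srs sampsize=1
                  seed=20051020 out=picked;
   strata agegrp sex;
run;
```

Numeric and character variables can sit side by side in the STRATA statement, so nothing has to be converted or padded to a fixed width.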
Second, just start as they do, with a data set of values of the
variables, and the counts you want for the cases in each category. These
counts you want are your _NSIZE_ variable. Now use PROC SURVEYSELECT
to draw the sample for the cases, as they do.
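If the counts you want are simply the number of cases observed in each stratum, one way to build that data set is with PROC FREQ. This is only a sketch: CASES, AGEGRP, and SEX are hypothetical names, and the paper may well build its counts differently.

```sas
/* Hypothetical names: CASES, AGEGRP, SEX.                        */
/* OUT= writes one row per observed combination of the variables; */
/* keep the strata variables and rename COUNT to _NSIZE_.         */
proc freq data=cases noprint;
   tables agegrp*sex / out=casecounts(keep=agegrp sex count
                                      rename=(count=_nsize_));
run;
```

The result has one row per stratum, already in sorted order, with the variable name PROC SURVEYSELECT looks for in a SAMPSIZE= data set.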
Then take that counts-for-each-stratum data set (let's call it CASECOUNTS)
and, in a DATA step, multiply each count by 4:
   _nsize_ = 4*_nsize_;
That's it. Now you have a data set which holds the sample sizes for your
selection of controls.
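Put together, the control selection might look something like this. It is a sketch under assumptions: CONTROLS, CONTROLCOUNTS, AGEGRP, and SEX are names I made up, and CASECOUNTS is assumed to hold the strata variables plus _NSIZE_, one row per stratum.

```sas
/* Hypothetical names: CONTROLS, CONTROLCOUNTS, AGEGRP, SEX. */
data controlcounts;
   set casecounts;
   _nsize_ = 4*_nsize_;   /* four controls for every case */
run;

/* Both data sets need the strata in the same order. */
proc sort data=controlcounts;
   by agegrp sex;
run;

proc sort data=controls;
   by agegrp sex;
run;

/* SAMPSIZE= points at the data set holding _NSIZE_ per stratum. */
proc surveyselect data=controls method=srs sampsize=controlcounts
                  seed=97330 out=control_sample;
   strata agegrp sex;
run;
```

The strata in CONTROLCOUNTS have to line up with the strata in CONTROLS. If some stratum holds fewer controls than requested, the step will complain; I believe the SELECTALL option on the PROC statement tells it to take everything available in that case, but check the documentation for your release.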
David L. Cassell
3115 NW Norwood Pl.
Corvallis OR 97330