LISTSERV at the University of Georgia
Date:         Thu, 20 Oct 2005 12:16:37 -0700
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: 1:4 matching for Proc SurveySelect
In-Reply-To:  <200510192010.j9JJqepm018352@malibu.cc.uga.edu>
Content-Type: text/plain; format=flowed

apavluck@GMAIL.COM wrote:
>Below reference is the following paper: Simplified Matched Case-Control
>Sampling PROC SURVEYSELECT
>(http://www2.sas.com/proceedings/sugi29/209-29.pdf)
>
>Now, I want to create a matched dataset just as they do in this paper
>based on a number of matching criteria but I want the match to be 1
>case to 4 controls rather than the 1 to 1 match that they did in this
>paper. I assume this is fairly easy to do but I am not sure where to
>start.

First of all, I'm not sure I'm happy with the way the paper's authors built their index for PROC SURVEYSELECT. They took all the variables they wanted to stratify on (yes, they're just doing stratified sampling) and concatenated them into one index. This is *NOT* needed with PROC SURVEYSELECT, and could cause problems if one or more of the variables is numeric, or if one or more variables needs to be binned first, or if the index is long enough that it gets truncated and key features are lost. I would just list the (binned) variables themselves in the STRATA statement.
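A minimal sketch of stratifying on the variables directly, with no concatenated key. The data set names (POOL, POOL_BINNED, PICKED), the variables (AGE, AGEGRP, SEX), the cut points, the per-stratum size of 2 and the seed are all made up for illustration; none of them come from the paper.

```sas
/* Hypothetical example: bin a numeric variable, then stratify on the */
/* binned variable plus a second variable, without building an index. */
data pool_binned;
   set pool;
   /* SAS treats a missing value as smaller than any number, */
   /* so handle missing AGE before the range checks.         */
   if missing(age) then agegrp = .;
   else if age < 40 then agegrp = 1;
   else if age < 60 then agegrp = 2;
   else agegrp = 3;
run;

/* PROC SURVEYSELECT expects the input sorted by the STRATA variables */
proc sort data=pool_binned;
   by agegrp sex;
run;

proc surveyselect data=pool_binned out=picked
                  method=srs n=2 seed=12345;
   strata agegrp sex;   /* just list the variables */
run;
```

A fixed N=2 is used here only to keep the sketch short; any stratum with fewer than two records would need separate handling.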

Second, just start as they do, with a data set holding the values of the stratification variables and the count of cases you want in each category. Those counts are your _NSIZE_ variable. Now use PROC SURVEYSELECT to draw the sample for the cases, as they do.
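One way such a counts data set could be built is with PROC FREQ. This is a sketch under assumptions, not necessarily how the paper does it: CASES, AGEGRP and SEX are hypothetical names, and only CASECOUNTS and _NSIZE_ come from the discussion here.

```sas
/* Hypothetical example: one row per combination of the stratification */
/* variables, with the number of cases stored in _NSIZE_.              */
proc freq data=cases noprint;
   tables agegrp*sex / out=casecounts(drop=percent
                                      rename=(count=_nsize_));
run;
```

The OUT= data set comes back ordered by the TABLES variables, which is the order PROC SURVEYSELECT will want for the matching STRATA statement.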

Then take that counts-for-each-stratum data set (let's call it CASECOUNTS) and do this:

data controlcounts;
   set casecounts;
   _nsize_ = 4*_nsize_;   /* four controls for every case in the stratum */
run;

That's it. Now you have a data set which holds the sample sizes for your selection of controls; pass it to PROC SURVEYSELECT with SAMPSIZE=CONTROLCOUNTS and the same STRATA variables.
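A minimal sketch of the control draw, assuming a hypothetical pool of potential controls called CONTROLS_POOL and the same illustrative stratification variables AGEGRP and SEX. The output name and the seed are arbitrary.

```sas
/* Hypothetical example: draw 4 controls per case within each stratum. */
proc sort data=controls_pool;
   by agegrp sex;
run;

proc surveyselect data=controls_pool out=matched_controls
                  method=srs
                  sampsize=controlcounts  /* reads _NSIZE_ per stratum */
                  selectall               /* take the whole stratum if */
                                          /* it has too few controls   */
                  seed=20051020;
   strata agegrp sex;
run;
```

Two things to watch, as far as I understand the procedure: without SELECTALL a stratum with fewer available controls than its _NSIZE_ stops the selection with an error, and strata that appear in the pool but not in CONTROLCOUNTS probably need to be filtered out of the pool first.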

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
