Date:         Mon, 22 May 2006 23:13:35 -0700
Reply-To:     David L Cassell <davidlcassell@MSN.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         David L Cassell <davidlcassell@MSN.COM>
Subject:      Re: random delet from subgroups
In-Reply-To:  <>
Content-Type: text/plain; format=flowed

ei-wen_chang@CTB.COM wrote:
>I have to random delete some observations in subgroups from a master data
>(STD). The rules are like following.
>
>If form='A1', grade=2, ethnicity=H or I then random delete 5 cases from
>this subgroup.
>If form='A1', grade=3, ethnicity=B then random delete 9 cases from this
>subgroup.
>If form='A3', grade=2, then random delete 8 cases from this subgroup. (no
>target ethnicity)
>If form='A3', grade=7, and ethnicity='A' then random delete 6 cases from
>this subgroup.
>
>There are more form, grade, and ethnicity combinations that needs to be
>excluded, so I don't want to hard code. If anyone can give me some
>advices, I will appreciate your help. Following are my sample data.
>
>data excluse;
>infile cards missover;
>input @1 form $2. @4 grade @6 ethnicity $2. @9 del_n;
>cards;
>A1 2 HI 5
>A1 3 B  9
>A3 2    8
>A3 7 A  6
>;
>
>
>Data std;
>do i=1 to 20;
> form='A1';
> grade=2;
> ethnicity='H';
> output;
> form='A1';
> grade=2;
> ethnicity='I';
> output;
> form='A1';
> grade=3;
> ethnicity='B';
> output;
> form='A1';
> grade=3;
> ethnicity='I';
> output;
> form='A3';
> grade=2;
> ethnicity='A';
> output;
> form='A3';
> grade=2;
> ethnicity='I';
> output;
> form='A3';
> grade=7;
> ethnicity='A';
> output;
> form='A3';
> grade=7;
> ethnicity='A';
> output;
> form='A3';
> grade=7;
> ethnicity='B';
> output;
>end;
>run;
>
>
>Ei-Wen


Why do you need to do 'random deletions' of certain sizes from particular cross-classifications of the data?

It seems to me that if you know the size of each cross-classification and the number of records to delete in that category, then you have a sampling problem. You could solve your headache by using PROC SURVEYSELECT instead of trying to do the randomization and reduction on your own.

If you write back to SAS-L and explain in more detail what you are trying to do, and why, then someone ought to be able to help you better, perhaps by showing you how to do this in PROC SURVEYSELECT using the information you already have.
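
For what it's worth, here is a rough, untested sketch of how that might look with the EXCLUSE and STD data sets from the original post. The data set and variable names RULES, RULE_ID, STD_KEYED, MATCHED, UNMATCHED, FLAGGED and STD_REDUCED are made up for illustration, and so is the seed. It assumes that a blank ethnicity in EXCLUSE means "any ethnicity", that no record matches more than one rule, and that every rule matches at least as many records as it asks to delete.

```sas
/* One stratum per rule: number the rules and copy del_n into _NSIZE_,  */
/* the variable PROC SURVEYSELECT reads from a SAMPSIZE= data set.      */
data rules;
   set excluse;
   rule_id = _n_;
   _NSIZE_ = del_n;
run;

/* Tag each STD record with the rule it falls under (if any).           */
/* 'HI' matches H or I; a blank rule ethnicity matches everything.      */
proc sql;
   create table std_keyed as
   select s.*, r.rule_id
   from std as s left join rules as r
     on  s.form = r.form
     and s.grade = r.grade
     and (missing(r.ethnicity)
          or index(r.ethnicity, strip(s.ethnicity)) > 0)
   order by rule_id;
quit;

/* Records that fall under no rule are never candidates for deletion.   */
data matched unmatched;
   set std_keyed;
   if missing(rule_id) then output unmatched;
   else output matched;
run;

/* Draw a simple random sample of _NSIZE_ records per rule. OUTALL      */
/* keeps every record and adds Selected = 1 for the sampled ones.       */
proc surveyselect data=matched method=srs
                  sampsize=rules(keep=rule_id _NSIZE_)
                  seed=20060522 outall out=flagged;
   strata rule_id;
run;

/* The sampled records are the ones to delete; keep everything else.    */
data std_reduced;
   set flagged(where=(Selected = 0)) unmatched;
   keep i form grade ethnicity;
run;
```

The point of the sketch is that the "delete n at random" rule becomes "select n at random, then drop the selected ones", so the rule table drives everything and nothing is hard-coded per subgroup.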

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330

