```
Date:      Tue, 7 Feb 2006 13:46:44 -0500
Reply-To:  "Howard Schreier"
Sender:    "SAS(r) Discussion"
From:      "Howard Schreier"
Subject:   Re: dividing data into quintiles

PROC RANK is designed to handle this task, and it is part of Base SAS.

On Tue, 7 Feb 2006 12:16:58 -0600, Jiann-Shiun Huang wrote:

>Kateri:
>
>   With Toby's reminder about possible ties, the following code takes
>ties into consideration. The dataset is first sorted in descending order
>of YourVar. All ties are assigned the higher quintile. If you want
>to assign ties to the lower quintile, then sort in ascending order and
>modify the code accordingly. Write back if you have questions on
>this.
>
>proc sort data=YourFile out=old;
>  by descending YourVar;
>run;
>data New;
>  set old nobs=ObsCount;
>  array bound(2:5) _temporary_;
>  retain bound(2:5);
>  select;
>    when (_N_ le 0.2*ObsCount)
>      do;
>        Quintile=5;
>        if 0.2*ObsCount le _N_ le 0.2*ObsCount+1 then bound(5)=YourVar;
>      end;
>    when (0.2*ObsCount lt _N_ le 0.4*ObsCount)
>      do;
>        if YourVar eq bound(5)
>          then Quintile=5;
>          else Quintile=4;
>        if 0.4*ObsCount le _N_ le 0.4*ObsCount+1 then bound(4)=YourVar;
>      end;
>    when (0.4*ObsCount lt _N_ le 0.6*ObsCount)
>      do;
>        if YourVar eq bound(4)
>          then Quintile=4;
>          else Quintile=3;
>        if 0.6*ObsCount le _N_ le 0.6*ObsCount+1 then bound(3)=YourVar;
>      end;
>    when (0.6*ObsCount lt _N_ le 0.8*ObsCount)
>      do;
>        if YourVar eq bound(3)
>          then Quintile=3;
>          else Quintile=2;
>        if 0.8*ObsCount le _N_ le 0.8*ObsCount+1 then bound(2)=YourVar;
>      end;
>    when (_N_ le ObsCount)
>      do;
>        if YourVar eq bound(2)
>          then Quintile=2;
>          else Quintile=1;
>      end;
>  end;
>run;
>proc print data=New;
>  var YourVar Quintile;
>run;
>
>J S Huang
>1-515-557-3987
>fax 1-515-557-2422
>
>>>> "toby dunn" 2/7/2006 11:32:34 AM >>>
>Jiann,
>
>What happens in your code when there is a tie, or when you have some
>value that spans quantiles? You have to shove it in one or the other
>quantile, as they can't be split across quantiles.
>
>Toby Dunn
>
>>>> "Kateri Heydon" 2/7/2006 11:57:49 AM >>>
>awesome. this worked. thank you!!
>Kateri
>
>>>> "Jiann-Shiun Huang" 02/07/06 12:27 PM >>>
>   The following code should work. Use ObsCount to count the total
>observations in YourFile.
>
>data New;
>  set YourFile nobs=ObsCount;
>  select;
>    when (0.8*ObsCount lt _N_ le ObsCount) Quintile=5;
>    when (0.6*ObsCount lt _N_ le 0.8*ObsCount) Quintile=4;
>    when (0.4*ObsCount lt _N_ le 0.6*ObsCount) Quintile=3;
>    when (0.2*ObsCount lt _N_ le 0.4*ObsCount) Quintile=2;
>    otherwise Quintile=1;
>  end;
>run;
>proc print data=New;
>run;
>
>J S Huang
>1-515-557-3987
>fax 1-515-557-2422
>
>>>> "Kateri H. Heydon" 2/7/2006 10:57:54 AM >>>>
>Hi--
>I'm trying to divide my data into quintiles. I have sorted the
>variable of interest (a probability score for each observation), now
>how do I create an indicator variable 1-5 for each quintile???
>
>Any input would be great!!
```
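
The PROC RANK suggestion at the top of the message comes without code. A minimal sketch of how it might look is below, assuming the dataset `YourFile` and variable `YourVar` used in the quoted thread; the names `Ranked` and `Quintile0` are hypothetical and chosen only for illustration. `GROUPS=5` numbers the groups 0 through 4, so a short DATA step shifts them to 1-5, and `TIES=HIGH` is one possible way to keep tied values together in the higher group.

```sas
/* Sketch only: YourFile and YourVar come from the thread;        */
/* Ranked and Quintile0 are illustrative names, not the author's. */
proc rank data=YourFile out=Ranked groups=5 ties=high;
  var YourVar;       /* variable to split into quintiles       */
  ranks Quintile0;   /* receives the group index, 0 through 4  */
run;

data New;
  set Ranked;
  /* Shift 0-4 to 1-5; a missing YourVar leaves Quintile missing */
  if not missing(Quintile0) then Quintile = Quintile0 + 1;
  drop Quintile0;
run;

proc print data=New;
  var YourVar Quintile;
run;
```

No prior PROC SORT is needed with this approach, and the input order of `YourFile` is preserved in the output. Adding the `DESCENDING` option to the PROC RANK statement would reverse the numbering so that the largest values fall in the lowest group.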
