Date: Wed, 25 Apr 2007 22:50:39 -0400
Reply-To: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject: Re: OT: Chance to Make SAS-L History: Did You Know That...
Content-Type: text/plain; charset=ISO-8859-1
On Wed, 25 Apr 2007 14:52:10 -0400, Paul St Louis <pstloui@DOT.STATE.TX.US>
wrote:
>Did you know that you can subset obs using Proc Format? Who woulda thunk.
>
>Data set SMALL
>Obs ID
>1 2
>2 4
>3 5
>
>Data set LARGE
>Obs ID N
>1 1 1
>2 2 1
>3 2 2
>4 3 1
>5 3 2
>6 3 3
>7 4 1
>8 4 2
>9 4 3
>
>Use PROC FORMAT to create a temporary format that assigns all ID values in
>data set SMALL the same label. Then use a WHERE statement to subset
>observations from data set LARGE where the formatted value of ID equals
>that label. The format serves, in effect, as a “lookup table.” For example:
>
>proc format;
>value $key '2','4','5' = 'Y'
>other = 'N' ;
>run;
Even better, use the CNTLIN= option to create the format from SMALL, instead
of manually transcribing values into a VALUE statement.
>
>data subset;
>set large;
>where put(id,$key.) = 'Y' ;
>run;
>
>Use PROC FORMAT to create a temporary character format named $KEY. The
>values ‘2’, ‘4’, and ‘5’ will be formatted as ‘Y’. All other values will be
>formatted as ‘N’.
>
>Create data set SUBSET with the DATA statement.
>
>Read observations from data set LARGE with the SET statement.
>Load only those observations where the value of ID, formatted with $KEY.
>via the PUT function, equals ‘Y’ with the WHERE statement. Observations
>with ID values ‘2’, ‘4’, and ‘5’ are written to data set SUBSET.
>
>Excerpt from
>Subsetting Observations from Large SAS® Data Sets
>Christopher J. Bost, MDRC, New York, NY
>NESUG 2006
|