LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (July 2000, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Thu, 13 Jul 2000 17:16:25 -0700
Reply-To:   Cassell.David@EPAMAIL.EPA.GOV
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "David L. Cassell" <Cassell.David@EPAMAIL.EPA.GOV>
Subject:   Re: Summarizing and choosing most populous ...
Comments:   To: DN7423@momail.sbc.com
Content-type:   text/plain; charset=us-ascii

David, you wrote: > I need to: > > 1.) Sum all rev for each id (the easy part). > 2.) Identify the most populous value of wc for each id. > 3.) Identify the most populous value of sic for each id. > 4.) Output one obs per id value.

You tried this with a DATA step. I'm sure that you'll see DATA step solutions and PROC SQL solutions, but there is a canned PROC which will do all this for you. PROC MEANS has a couple little-known options in its output statement: MAXID and MINID . PROC SUMMARY will do this too. I'll use the CLASS statement in my example so you can see that it can handle a dataset your size even if the input is unsorted. Given your DATA step:

proc summary data=a; var rev; class id; output out=out1 sum=sumrev maxid( rev(wc) rev(sic) ) = bigwc bigsic; run;

proc print noobs data=out1; var id sumrev bigwc bigsic; run;

That's all there is to it. Using a BY statement instead of a CLASS statement when the data are sorted on the by-variable will let you do this with a much larger number of distinct by-variables, since then SAS won't have to hold everything in memory.

David -- David Cassell, OAO Corp. Cassell.David@epa.gov Senior computing specialist mathematical statistician


Back to: Top of message | Previous page | Main SAS-L page