LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (May 2010, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Sun, 16 May 2010 14:40:07 -0400
Reply-To:   OR Stats <stats112@GMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   OR Stats <stats112@GMAIL.COM>
Subject:   Re: Average the Dups
Comments:   To: "Zack, Matthew M. (CDC/ONDIEH/NCCDPHP)" <mmz1@cdc.gov>
In-Reply-To:   <2529A281D3499446A3C931EE75C2942501D6F580@LTA3VS012.ees.hhs.gov>
Content-Type:   text/plain; charset=ISO-8859-1

Hello:

I do like how this is more parsimonious and will try it for averaging the dups. In the future, if we need to change how we calculate the X & Y for our dup rows.... we may not be using PROC Summary.... What would be the easiest (safest) way of defining our own function for when a dup occurs in the original database?

Thank you!

On Sun, May 16, 2010 at 11:27 AM, Zack, Matthew M. (CDC/ONDIEH/NCCDPHP) < mmz1@cdc.gov> wrote:

> Why not use a procedure like PROC SUMMARY instead of DATA step solutions > that may be > error-prone? > > ======================================================================== > ================= > proc sort; > by id; > run; > > proc summary nway; > by id; > var x y; > output out=avvars(keep=id xmean ymean) > mean=xmean ymean; > run; > ======================================================================== > ================== > > Matthew Zack > -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of OR > Stats > Sent: Sunday, May 16, 2010 8:58 AM > To: SAS-L@LISTSERV.UGA.EDU > Subject: Average the Dups > > Hello: > > I have a dataset that is > > id x y > > Every once in a while in the dataset there would be two consecutive rows > with the same unique id. In cases of these duplicates, I would like to > take > the average of the two rows. How can I do this in SAS? For example the > data would like this > > 1 x1 y1 > 2 x2 y2 > 3 x3a y3a > 3 x3b y3b > 4 x4 y4 > . > . > . > 74 x74a y74a > 74 x74b y74b > > . > . > . > The final table would look like > 1 x1 y1 > 2 x2 y2 > 3 Average(x3a,x3b) Average(y3a,y3b) > 4 x4 y4 > . > . > . > 74 Average(x74a,x74b) Average(y74a,y74b) > > . > . > . > Can we do this in a single datastep or PROC SQL? Which is faster? > > >


Back to: Top of message | Previous page | Main SAS-L page