Date: Sun, 16 May 2010 09:56:30 0400
ReplyTo: OR Stats <stats112@GMAIL.COM>
Sender: "SAS(r) Discussion" <SASL@LISTSERV.UGA.EDU>
From: OR Stats <stats112@GMAIL.COM>
Subject: Re: Average the Dups
InReplyTo: <AANLkTin4DOJgMRPQhqR4yaZ8S3zWW4c1yXqlWg1tekNj@mail.gmail.com>
ContentType: text/plain; charset=ISO88591
The challenge is it is a very large dataset in which it would be painful to
go thru every row to see where the dups occur. We would like the code
itself to detect everytime there is a dup row in the data and then average
the two rows; otherwise it just leaves the unique row on its own.
pseudo code;
/*my original dataset from outside of SAS called origtable*/
data mynewtable;
set origtable;
/*if there is a dup by id, take the average of the two rows*/
run;
What is the syntax for the condition above in the datastep? And/or what
would be the syntax for the above in PROC SQL. Again, I don't know the row
#'s of where the dups are occuring in the large origtable.
Thanks
On Sun, May 16, 2010 at 9:03 AM, Murphy Choy <goladin@gmail.com> wrote:
> Hi,
>
> I think PROC SQL is the preferred approach.
>
> Proc sql;
>
> create table result as select x as x, y as y,mean(x1) as x1,mean(y1) as y1
> from infodata group by x,y;
>
> quit;
>
>
>
> On Sun, May 16, 2010 at 8:57 PM, OR Stats <stats112@gmail.com> wrote:
>
>> Hello:
>>
>> I have a dataset that is
>>
>> id x y
>>
>> Every once in a while in the dataset there would be two consecutive rows
>> with the same unique id. In cases of these duplicates, I would like to
>> take
>> the average of the two rows. How can I do this in SAS? For example the
>> data would like this
>>
>> 1 x1 y1
>> 2 x2 y2
>> 3 x3a y3a
>> 3 x3b y3b
>> 4 x4 y4
>> .
>> .
>> .
>> 74 x74a y74a
>> 74 x74b y74b
>>
>> .
>> .
>> .
>> The final table would look like
>> 1 x1 y1
>> 2 x2 y2
>> 3 Average(x3a,x3b) Average(y3a,y3b)
>> 4 x4 y4
>> .
>> .
>> .
>> 74 Average(x74a,x74b) Average(y74a,y74b)
>>
>> .
>> .
>> .
>> Can we do this in a single datastep or PROC SQL? Which is faster?
>>
>
>
>
> 
> Regards,
> Murphy Choy
>
> Certified Advanced Programmer for SAS V9
> Certified Basic Programmer for SAS V9
> DataShaping Certified SAS Professional
>
