Date: Thu, 4 Dec 2003 12:40:16 -0600
Reply-To: pudding man <pudding_man@MAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: pudding man <pudding_man@MAIL.COM>
Subject: Re: Aggregate
Content-Type: text/plain; charset="iso-8859-1"
----- Original Message -----
From: William Dudley <william.dudley@NURS.UTAH.EDU>
Date: Wed, 3 Dec 2003 14:42:03 -0700
To: SAS-L@LISTSERV.UGA.EDU
Subject: Aggregate
> I have a data set in which each person can have a variable
>number of records (number of records ranges from 100 to
>150). Records within an individual are ordered by two
>variables: day and time. Some of the fields are constant
>across records within individuals (e.g. age, birth weight
>etc). How can I create an aggregated data set in which each
>person is represented by one record that would contain data
>for the fields that are constant within the individual?
So you want one copy of records with duplicate values
in specified variables (id, age, birth, weight)
for a given individual? I won't ask how you wound up
with 100-150 recs/individual ...
Quentin will no doubt forgive me for commandeering pieces
of his code/data. I doubt the SQL is the least bit elegant,
but it might suggest an approach:
data a;
input id day time age birth mmddyy10. weight;
format birth mmddyy10. time time.;
cards;
1 1 1 20 01/01/1990 118
1 1 2 20 01/01/1990 120
1 1 3 20 01/01/1990 120
1 1 2 20 01/01/1990 124
1 1 3 20 01/01/1990 124
1 2 1 20 01/01/1990 122
2 2 1 15 06/06/1995 60
2 3 1 15 06/06/1995 60
2 4 2 15 06/06/1995 61
2 4 2 15 06/06/1995 62
;
run;
proc sql;
create view av as
select *, count(*) as ct
from a
group by id, age, birth, weight
;
create table b as
select distinct id, age, birth, weight
from av
group by id
having ct = max(ct)
; quit;
proc print data = b; run;
What if there were 2 or more groups of recs with constant
values for an individual? If the frequency of recs in the
groups varies, the above code should get you a rec from the
most frequent group. If the frequency is the same (say, 4
recs for each of 2 sets of constants), you may want to post-
process the data, possibly sorting by ID, AGE, BIRTH, WEIGHT
and conceivably using FIRST.WEIGHT or LAST.WEIGHT or
somesuch (to get only 1 rec / individual). This to be done
according to criteria known *only* to you . <g>
Skoal,
Puddin'
*******************************************************
*** Puddin' Man *** Pudding_Man@mail.com ********
*******************************************************;
You Better Make It To The Woods,
If You Can ..."
- from "Bootleggers Blues", Mississippi Sheiks, maybe 1930
--
___________________________________________________________
Sign-up for Ads Free at Mail.com
http://promo.mail.com/adsfreejump.htm