LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (March 2009, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 20 Mar 2009 07:34:52 -0700
Reply-To:     Richard <rdevenezia@WILDBLUE.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Richard <rdevenezia@WILDBLUE.NET>
Organization: http://groups.google.com
Subject:      Re: proc summary help
Comments: To: sas-l@uga.edu
Content-Type: text/plain; charset=ISO-8859-1

On Mar 19, 4:10 pm, joh...@GMAIL.COM (Jake) wrote: > Dear SAS-L: > > I have the following data and would like > to select the max for measurement 1 (mea_1) > per subject per measurement date (mea_date), > and I would like to know if there is a > way to do this with proc summary, vs. the > datasets which follow the proc summary that > I currently have? I want to retain the id, > mea_1, mea_2 and mea_3 values in the final > dataset from proc summary. I seem to remember > doing something similar using idgroup, but > I now sure it that will work in this case. > > Any suggestions? > > Thanks! Jake > > data > measurements; > > format mea_1 mea_2 mea_3 3. mea_date > date9.; > length patient > $4; > > patient='0001'; > > mea_date='10Oct08'd; > > id=0; mea_1=40; mea_2=90; mea_3=60; > output; > id=1; mea_1=34; mea_2=98; mea_3=72; > output; > id=2; mea_1=68; mea_2=70; mea_3=50; > output; > > patient='0002'; > > mea_date='03Oct08'd; > > id=0; mea_1=50; mea_2=90; mea_3=88; > output; > id=1; mea_1=30; mea_2=98; mea_3=70; > output; > id=2; mea_1=52; mea_2=70; mea_3=72; > output; > id=3; mea_1=72; mea_2=90; mea_3=68; > output; > id=4; mea_1=30; mea_2=60; mea_3=74; > output; > > run; > > proc sort > data=measurements; > > by patient mea_date > id; > > run; > > proc summary data=measurements nway > missing; > where id ^= > 0; > > by patient > mea_date; > > class > id; > > var mea_1 mea_2 > mea_3; > > output out=mea_sum(where=(_type_=1) drop=_freq_) min= max=; > run; > > proc sort > data=mea_sum; > > by patient DESCENDING > mea_1; > run; > > data > need; > > set > mea_sum; > > by patient DESCENDING > mea_1; > if > first.patient; > > run; > > title 'This is what I > need'; > > proc print > data=need; > > run;

Jake:

Proc SUMMARY is used to compute statistics of a variable over more than one row.

Your SUMMARY code has statements by patient mea_date; class id;

Which will restrict the processing to a single row!

The code you show for NEED selects a patients row having the highest mea_1 (regardless of date).

A simple SQL query can perform this type of data slicing:

proc sql; create table NEED2 as select * from measurements group by patient having mea_1 = max(mea_1) ; quit;

The resultant can have multiple rows per patients if there are more than one row having the max mea_1 value.

-- Richard A. DeVenezia http://www.devenezia.com


Back to: Top of message | Previous page | Main SAS-L page