LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (May 2002, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 2 May 2002 16:13:12 -0700
Reply-To:     Sigurd Wilson Hermansen <hermans1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Sigurd Wilson Hermansen <hermans1@WESTAT.COM>
Organization: http://groups.google.com/
Subject:      Re: An SQL question.
Content-Type: text/plain; charset=ISO-8859-1

Others have pointed you in the right direction. Let me continue a bit farther.

Each distinct value of familyid becomes a group. As a first step, the program counts the number of rows containing workhrs *values* within each group. Then it expands each row of that result to create a different row for each instance of a workhrs value.

If you have familyid workhrs 1 3 1 2 2 10 2 1

the SQL compiler first generates

familyid nworkhrs 1 2 2 2

in the background, and then '..remerges summary statistics back to the original data ..' to produce this result:

familyid workhrs nworkhrs 1 3 2 1 2 2 2 10 2 2 1 2

This procedure often produces mistaken solutions, so the note serves as a warning. In this case you probably want to sum rather than count workhrs for each familyid. In that case, you should select familyid, sum(workhrs) as nworkhrs. That removes the dilemma of what to do with multiple values of workhrs per group.

If you really do need a workhr value for each group, you would probably want to limit it to one. In earlier posts I have illustrated how to select the minimum or maximum values (assuming no multiple minima or maxima). It gets more complicated if you do have multiples. Sig

nnair@acf.dhhs.gov (Neal K Nair) wrote in message news:<vines.SGH8+LOJowA@c11wdc-0020b.acf.dhhs.gov>... > Hi SAS-Lers > > I have an SQL program as follows: > > proc sql; > crate table three as > select familyid, workhrs,count(*) as nworkhrs > from two > group by familyid; > quit; > > [data set named 'two' contains the variables 'familyid' and 'workhrs', sorted > by familyid and by descending order of workhrs] > > I get a NOTE in the SAS log: "The querry requires remerging summary > statistics back with the original data". > I do not understand the NOTE. Could someone explain it to me, please? > > Thanks. > > Neal


Back to: Top of message | Previous page | Main SAS-L page