LISTSERV at the University of Georgia
Date:         Fri, 6 Feb 1998 21:16:08 -0000
Reply-To:     Jeff Tomlinson <Jeff@KITTSOFT.DEMON.CO.UK>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         Jeff Tomlinson <Jeff@KITTSOFT.DEMON.CO.UK>
Subject:      Re: Mean of Variable across all Observations
Content-Type: text/plain; charset="us-ascii"

And the winner is ? So far ...

Using a non-trivial number of records (760000) - SAS 6.12/Win 95.

1. James - Size & elegance (1min 25sec)
2. Ron - Speed (1min 23sec)
3. Steve - One warning message and an extra variable (1min 59sec)

A slightly quicker version of Ron's (at least on my system - 1min 10sec) is

>> proc sort data=your.data;
>>   by year;
>> run;
>>
>> proc summary data=your.data nway;
>>   BY year;
>>   var variable;
>>   output out=summary(drop=_type_ _freq_) mean=mean;
>> run;
>>
>> data your.newdata(drop=mean);
>>   merge your.data summary;
>>   by year;
>>   newvar = variable - mean;
>> run;

The BY vs CLASS police action isn't quite as fierce as the DATA STEP vs SQL war, but there are differing views. SAS recommend using CLASS, but if the data is sorted by the variable of interest, then using BY seems reasonable.
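
For anyone wanting to try the BY vs CLASS comparison on their own system, here is a minimal sketch. The WORK data set names and the toy values are made up for illustration and are not from the thread; only the PROC SUMMARY options mirror the code above. CLASS accepts unsorted input, while BY needs the data sorted (or indexed) by year first.

```sas
/* Hypothetical toy data, not the 760000-record file timed above */
data work.info;
  input country $ year variable;
  datalines;
UK 1996 10
US 1996 20
UK 1997 30
US 1997 50
;
run;

/* CLASS: the input does not need to be sorted by year */
proc summary data=work.info nway;
  class year;
  var variable;
  output out=work.means_class(drop=_type_ _freq_) mean=mean;
run;

/* BY: the input must be in year order, so sort a copy first */
proc sort data=work.info out=work.info_sorted;
  by year;
run;

proc summary data=work.info_sorted;
  by year;
  var variable;
  output out=work.means_by(drop=_type_ _freq_) mean=mean;
run;

/* Both outputs should hold one row per year with the same mean */
proc compare base=work.means_class compare=work.means_by;
run;
```

Timings will depend on the system and on whether the data is already in year order, so treat any speed difference as something to measure rather than assume.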

---- Jeff

-----Original Message-----
From: James Yang <james.yang@HIGHMARK.COM>
Newsgroups: bit.listserv.sas-l
To: SAS-L@VM.MARIST.EDU <SAS-L@VM.MARIST.EDU>
Date: 06 February 1998 19:04
Subject: Re: Mean of Variable across all Observations

>Martin@ADDER.DEMON.COSPAMSPAMSPAM.UK on 02/03/98 05:21:39 PM
>
>Please respond to Martin@ADDER.DEMON.COSPAMSPAMSPAM.UK
>
>To: SAS-L@UGA.CC.UGA.EDU
>cc: (bcc: James Yang/SAM/CORP/Highmark)
>Subject: Re: Mean of Variable across all Observations
>
>Ron Coleman wrote:
>> David,
>>
>> This one will have to be a two step process (less the sorts). First find
>> the mean for each year and then merge it back to the original value and
>> compute the difference.
>>
>> proc summary data=your.data nway;
>>   class year;
>>   var variable;
>>   output out=summary mean=mean;
>> run;
>>
>> proc sort data=your.data;
>>   by year;
>> run;
>>
>> data your.newdata(drop=mean);
>>   merge your.data summary;
>>   by year;
>>   newvar = variable - mean;
>> run;
>>
>> proc sort data=your.newdata;
>>   by country year;
>> run;
>>
>> --
>> Ron Coleman A SAS Quality Partner.
>> Links Analytical, Inc. Linking your data to your business!
>> 3545-1 St. Johns Bluff Rd. Suite 300
>> Jacksonville FL 32224 mailto:rcoleman@worldnet.att.net
>> (904) 641-4766
>
>Seems like a lot of typing Ron!
>If the data is stored in a data set called info, how about:
>proc sql;
> create table diffs as select *,var-mean as diff from
> info,
> (select mean(var) as mean, year from info group by year) as mean
> where mean.year=info.year;
>
>There, much better! Only used two semi-colons!
>The thinking is the same though of course. Firstly you find the means
>for each year then merge back with the original data by year. Then take
>the difference. Easy. It does give you the extra variable called MEAN
>but I could live with that. The code could be changed to exclude it.
>If anybody want to mail me to say that it's too complicated and can't be
>maintained either get some training or get a different job.
>
>:)
>
>Steve Adderson
>SAS Consultant
>www.adder.demon.co.uk
>Currently unavailable but always open to offers!
>
>How about this:
>
>proc sql;
> create table diffs as
> select *, var-mean(var) as diff
> from info
> group by year;
>
>After the means are calculated by year, they will be remerged back with the
>original table info and calculate var-mean(var).
>
>James Yang
>Health Data Analyst
>Highmark BCBS
>www.cs.wvu.edu/~jyang
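
A minimal sketch of the two PROC SQL approaches quoted above, run side by side on a made-up table. The WORK table names, the alias names and the toy values are assumptions for illustration only. The first query is one possible way to act on Steve's remark that the extra MEAN column could be excluded: it selects only the columns of info plus the difference. The second is James's remerge form unchanged apart from the table names.

```sas
/* Hypothetical toy table; the column names var and year follow the quoted SQL */
data work.info;
  input country $ year var;
  datalines;
UK 1996 10
US 1996 20
UK 1997 30
US 1997 50
;
run;

proc sql;
  /* Join form: a.* keeps only the original columns, so the   */
  /* yearly mean is used in the subtraction but never stored  */
  create table work.diffs_join as
  select a.*, a.var - m.mean as diff
  from work.info as a,
       (select year, mean(var) as mean
        from work.info
        group by year) as m
  where m.year = a.year;

  /* Remerge form: PROC SQL computes mean(var) per year and   */
  /* merges it back onto every row before subtracting         */
  create table work.diffs_remerge as
  select *, var - mean(var) as diff
  from work.info
  group by year;
quit;
```

Both tables should end up with the original columns plus diff, though the row order is not guaranteed to match between them.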

