LISTSERV at the University of Georgia
Date:         Fri, 31 May 2002 08:48:01 -0700
Reply-To:     Paul choate <pchoate@GSOS.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Paul choate <pchoate@GSOS.NET>
Organization: http://groups.google.com/
Subject:      Re: Sums & Avgs of Variables in Dataset
Content-Type: text/plain; charset=ISO-8859-1

You may want to describe your situation a bit more clearly. I hope I'm not over-simplifying, but:

I'm not clear why you need proc transpose; I'd use it as a last resort. Usually whatever you need to do with your data can be done without transforming the dataset. In SAS you _usually_ use a data step to operate on individual observations and procedures to operate across observations.

When working with a large number of variables in a row I'd use an array statement; it lets you operate on all of them easily. There are implicit and explicit methods of referencing the variables. Generally, you define the array and then use a do-loop and subscripted array elements. The results are within one observation at a time.
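A minimal sketch of the explicit-array approach, assuming a dataset named have with numeric variables x1-x55 and a made-up "no answer" code of -9 (both are placeholders, not anything from your data):

```sas
/* HAVE, WANT, and the -9 code are hypothetical names/values for illustration */
data want;
    set have;
    array xs{*} x1-x55;                    /* explicit array over x1, x2, ..., x55 */
    do i = 1 to dim(xs);                   /* dim() returns the number of elements */
        if xs{i} = -9 then xs{i} = .;      /* recode the placeholder code to missing */
    end;
    drop i;                                /* the loop counter is not needed in the output */
run;
```

The loop runs once per observation, so every row gets the same treatment without naming the 55 variables individually.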

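The sum and mean functions both accept variable lists as well: sumvars=sum(of x1-x55) or avevars=mean(of x1--a23). Note the two styles of lists. The single dash is a numbered range (x1, x2, ... x55); the double dash is a name range that takes every variable from x1 through a23 in the order they are stored in the dataset. Again this is on a single observation (row).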
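Here is a small sketch of the two list styles. The dataset and its variables (x1-x3, q, a23) are invented just to keep the example short:

```sas
/* Hypothetical five-variable dataset; variable order matters for the -- list */
data have;
    input x1 x2 x3 q a23;
    datalines;
1 2 3 10 20
4 . 6 30 40
;
run;

data want;
    set have;
    sumvars = sum(of x1-x3);      /* numbered range: x1, x2, x3 */
    avevars = mean(of x1--a23);   /* name range: x1, x2, x3, q, a23 in dataset order */
run;
```

If character variables sit between the two endpoints of a name range, x1-numeric-a23 restricts the list to the numeric ones.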

You need to be careful with how you handle missing values. The sum and mean functions ignore missing values and operate only on the non-missing ones (the result is missing only if every argument is missing). Arithmetic operators return a missing value whenever any operand is missing.
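A quick way to see the difference for yourself, using three throwaway variables (a, b, c are just illustrative names):

```sas
data _null_;
    a = 5; b = .; c = 7;
    total_fn = sum(a, b, c);    /* 12: the missing b is ignored */
    total_op = a + b + c;       /* missing: the + operator propagates the missing value */
    avg_fn   = mean(a, b, c);   /* 6: divides by the 2 non-missing values, not by 3 */
    n_used   = n(a, b, c);      /* 2: count of non-missing arguments */
    put total_fn= total_op= avg_fn= n_used=;   /* writes the values to the log */
run;
```

The n function is a handy check on how many values actually went into a sum or mean.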

Working across many observations I'd use proc means or proc summary (essentially the same procedure; means prints its results by default, summary does not) to compute sums and averages. Again, be sure you know how SAS is handling your missing values. There is a "missing" option, which keeps missing values of class variables as a group of their own instead of dropping those observations. You can either sum and average across the whole dataset, or use "class" variables or sorted "by" variables to sum and average on groups within your data. Note that you can define an output dataset in the procedure, so you can run a dataset through several proc summary steps to collapse it on one set of dimensions and then another. This is very useful.
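As a rough sketch of that, assuming an invented dataset with one class variable (region), two variables to sum (sales, cost) and one to average (rating); all names here are hypothetical:

```sas
/* Hypothetical data: REGION is the grouping variable */
data have;
    input region $ sales cost rating;
    datalines;
East 100 40 3
East 200 60 5
West 150 50 4
West .   70 2
;
run;

/* First pass: one output row per region, sums for some variables, a mean for another */
proc summary data=have nway missing;
    class region;
    var sales cost rating;
    output out=totals (drop=_type_ _freq_)
           sum(sales cost)=sales_sum cost_sum
           mean(rating)=rating_avg;
run;

/* Second pass: collapse the regional totals into one grand-total row */
proc summary data=totals;
    var sales_sum cost_sum;
    output out=grand (drop=_type_ _freq_) sum=;
run;
```

The var statement also takes variable lists such as x1-x55, so you should not have to type out hundreds of names.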

I would guess your problem can be solved with some operations in a data step and then a subsequent proc summary. By using data steps and proc summary steps in combination you can "crunch" data in almost any way you need. Proc transpose is more commonly used for restructuring a dataset in preparation for joining it to other data that is stored in a different structure.

Hope that helps.

pchoate@gsos.net

wpr <wpr@midsouth.rr.nospam.com> wrote in message news:<3CF6C2BB.29EF520E@midsouth.rr.nospam.com>...
> I have a large dataset, about 90,000 rows and 200 variables. I need to
> get the summation of some of the variables and the average of the
> remaining variables, for each variable.
>
> Here's what I tried to do:
> 1. Proc Transpose; to get the variable names into the field named _name_
>
> 2. from the resulting dataset, create two more: one for fields to be
> summed, the other for fields to be averaged, using the value in _name_
> for screening
>
> This didn't work! The log says something about columns and lines, but I
> haven't a clue what this means.
>
> I have four datasets with this information that I need to do this for
> and I do not want to manually enter the variable names (about 1,000).
>
> Does anyone have any ideas either how to get my Proc Transpose idea to
> work or something else?
>
> Thanks very much.

