LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 2006, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Fri, 6 Oct 2006 13:22:58 -0700
Reply-To:   David L Cassell <davidlcassell@MSN.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   David L Cassell <davidlcassell@MSN.COM>
Subject:   Re: Dataset transform
In-Reply-To:   <200610061632.k96FvKPG012530@mailgw.cc.uga.edu>
Content-Type:   text/plain; format=flowed

rajasekhargo@YAHOO.COM wrote: > >Dear SAS Users: > >I am trying to transform a dataset in the following way. > >My input table is like this: > >ID date num1 num2 num3 >A 1 3 5 4 >B 1 8 3 2 >C 1 2 6 7 >A 2 4 8 9 >B 2 9 0 1 >C 2 0 2 3 >A 3 2 3 4 >B 3 8 6 7 >C 3 6 4 5 > >The output I want is like this. The variable names become >oldvarname_datevalue > >ID num1_1 num1_2 num1_3 num2_1 num2_2 num2_3 num3_1 num3_2 num3_3 >A 3 4 2 5 8 3 4 9 4 >B 8 9 8 3 0 6 2 1 7 >C 2 0 6 6 2 4 7 3 5 > >One solution might be to use Proc Transpose. But my input table is very >large (about 1/4 million records and hundreds of variables). So I am >concerned that any procedures that work from disk access might be slower. >Moreover, using proc transpose, can we create variable names like I >described above? > >I considered using hash tables (storing the data from the input dataset in >memory and writing it out to an output dataset). But I couldn't figure out >how to transform the shape of the o/p dataset. > >If there is any other solution, that is also welcome (like >multidimensional arrays?). > >I appreciate your assistance. > >Thanks, >Raj

First, please re-consider doing this. What you are asking for is a really bad database design that will make later analyses much harder and make data steps much more complex. If you have these data already in temproal order, then you ought to be leaving the numbers as is in order to use them properly in later SAS time series analyses. If you only want the numbers like this for *display* purposes, look into PROC TABULATE and PROC REPORT to string variable levels across the top of your table.

Second, since the data have to be read once from disk no matter what, I think that PROC TRANSPOSE would be as fast as anything else. Since it can take advantage of your by-variables, it will also be smart enough not to need all your RAM either.

Third, if you want complete control over your variable names, then a data step may be the way to go. I don't see that you need a hash solution, but you could try one.

And finally, please don't do this. If you have a really good reason, then you can use some of the code that has already been given to you. But most of the time that people ask this question, they do NOT have a really good reason to do it.

HTH, David -- David L. Cassell mathematical statistician Design Pathways 3115 NW Norwood Pl. Corvallis OR 97330

_________________________________________________________________ Share your special moments by uploading 500 photos per month to Windows Live Spaces http://clk.atdmt.com/MSN/go/msnnkwsp0070000001msn/direct/01/?href=http://www.get.live.com/spaces/features


Back to: Top of message | Previous page | Main SAS-L page