Date: Mon, 18 Dec 2006 21:37:27 -0800
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: storing numbers of observations in splitted files
In-Reply-To: <s586c6bd.036@wbs.warwick.ac.uk>
Wing.Tham03@PHD.WBS.AC.UK wrote back:
>
>Hi Toby,
>
>
>I am sorry for not making this clear. I have three different files,
>test1, test2, and test3. I want to split them by file name and date. I
>also want to store their numbers of observations by type and date.
>I have hard-coded them below. Since there are 252 days in my
>data, it will take a while for me to hard-code all of them. I am asking if
>there is a smarter way to store the number of observations while I am
>splitting the data according to their dates/dayid. Thanks.
>
>
>Wing Wah

The *smartest* thing to do is to NOT do this at all.
Convert the data sets to a single long-and-thin SAS data set
(not a text file that has to be read in over and over). Do not
split them by days.
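
As a rough sketch of what that could look like, assuming the three data
sets are named test1, test2, and test3 and share a day variable called
dayid (the names ALL, TYPE, and NOBS_BY_TYPE_DAY are made up for
illustration and do not come from the original post):

```sas
/* Stack the three data sets into one long data set.               */
/* The IN= flags record which input each observation came from.    */
/* ALL and TYPE are hypothetical names chosen for this sketch.     */
data all;
   set test1(in=in1) test2(in=in2) test3(in=in3);
   if in1 then type = 1;
   else if in2 then type = 2;
   else type = 3;
run;

/* One pass counts the observations for every type-by-day          */
/* combination, so none of the 252 days has to be hard-coded.      */
/* The output data set holds TYPE, DAYID, and COUNT.               */
proc freq data=all noprint;
   tables type*dayid / out=nobs_by_type_day(drop=percent);
run;
```

The counts then live in an ordinary data set that can be merged or
printed like any other, instead of being scattered across hundreds of
split files.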

Now, if there are particular subsets that you want to pull out,
index the data set on the variables that define those subsets.
Then you can get very fast subsetting by putting a WHERE= clause
in the data set options as you read in the data set for a DATA
step or any PROC step.
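
A minimal sketch of the indexing idea, assuming a combined data set
called ALL in the WORK library with variables TYPE and DAYID (all
hypothetical names), and with the values 2 and 17 picked arbitrarily:

```sas
/* Build a simple index on DAYID and a composite index on          */
/* (TYPE, DAYID). The index name TYPE_DAY is an assumption.        */
proc datasets library=work nolist;
   modify all;
   index create dayid;
   index create type_day = (type dayid);
quit;

/* Ask SAS to note in the log when it chooses to use an index.     */
options msglevel=i;

/* Subset on the fly with the WHERE= data set option; no separate  */
/* per-day data set is ever created.                               */
proc means data=all(where=(type = 2 and dayid = 17)) n mean;
run;
```

Whether an index is actually used for a given WHERE clause is decided by
SAS at run time, so the log note is worth checking on real data.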

Also, sort your data in the order you will use most often.
If there is no such order, then don't worry about this. But
having the data in order means that an extra sort is not
needed, and you can use SAS by-processing to do your work
much faster, more simply, and more efficiently.
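
By-processing on sorted data might look something like the sketch
below. It again assumes a combined data set ALL with variables TYPE and
DAYID; ALL_SORTED, DAY_COUNTS, and NOBS are illustrative names only.

```sas
/* Sort once into the order used most often. OUT= writes a sorted  */
/* copy, so ALL itself is left as it was.                          */
proc sort data=all out=all_sorted;
   by type dayid;
run;

/* Count observations per TYPE and DAYID in a single pass.         */
/* FIRST.DAYID and LAST.DAYID mark the edges of each by-group.     */
data day_counts;
   set all_sorted;
   by type dayid;
   if first.dayid then nobs = 0;   /* reset at the start of a group */
   nobs + 1;                       /* sum statement retains NOBS    */
   if last.dayid then output;      /* one row per type and day      */
   keep type dayid nobs;
run;
```

The same BY statement works in most PROC steps too, so per-day results
come from one call rather than 252 separate ones.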

Otherwise, you will have a nightmare. You say that you are
new to SAS. So how are you going to write all the complex
code needed to re-combine the pieces, every time you need
them back together? It will be miserable, and it will take far
more of your time than you can afford. So please, keep the
data together in a meaningful way, instead of splitting it up
into agonizing splinters of data.

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330