Date: Mon, 17 Nov 2008 17:12:08 -0500
Reply-To: Ian Whitlock <iw1sas@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <iw1sas@GMAIL.COM>
Subject: Re: Re-Ordering the Variables
Content-Type: text/plain; charset=ISO-8859-1
SL,
Comments embedded.
Hello Ian,
Thank you so much for the help with the Macro, Yes I am looking for a
macro very similar to what you have suggested. You have mentioned
some disadvantage about using View rather sas dataset. I know you have
already explained enough but I still did not understand what you mean
1) the reordering is done on the fly each time the view is used
I did not understand what you mean ??
***If you create a data set with a new order, it is done once at the
time the data set is created. With a view it is done each time the
view is made. Using the following code, I got a 1.07 second advantage
for the view.
%macro order ( data = &syslast, out =, list= ) ;
%let data = &data ; /* force evaluation of &data */
%if %length(&out) = 0 %then %let out = &data._v ;
data &out / view = &out ;
retain &list ;
set &data ;
run ;
%mend order ;
%macro mklist ( root=v, from=1, to=1 ) ;
%local i ;
%do i = &from %to &to ;
%do ; &root&i %end ;
%end ;
%mend ;
%macro timeds ( n=5, data=, timev=tds) ;
%local i begtm endtm ;
%let begtm = %sysfunc ( time() ) ;
%do i = 1 %to &n ;
data q ;
set &data ;
run ;
%end ;
%let endtm = %sysfunc ( time() ) ;
%let &timev= %sysevalf(&endtm - &begtm) ;
%mend timeds ;
%macro getdiff ( n ) ;
%local DStm DVtm ;
%timeds ( n=&n, data=w_ord , timev= DStm)
%timeds ( n=&n, data=w_v, timev=DVtm )
%put DVtm=&DVtm DStm=&dstm ;
%put cost of view for &n accesses to data is
%sysfunc(putn(&DVtm-&DStm,6.3)) ;
%mend getdiff ;
options nomprint ;
data w ( drop = i ) ;
retain v200-v1 5 ;
do i = 1 to 1e4 ;
output ;
end ;
put _all_ ;
run ;
%order ( list = %mklist(to=200) )
data w_ord ;
retain v1-v200 ;
set w ;
run ;
%getdiff(10)
*** Results may differ with number of variables and number of records,
and possibly with the rapidity of accesses. I have not done any
serious study of views versus data sets. The technology today looks
much better than it idid when views were first introduced.
2) a view doesn't have knowledge of the number of obs
Is this good or bad for me ??
*** It depends on whether you write code that needs access to this
number and uses the NOBS option to get it. If you never want the
information, then the fact that you cannot have it, has little
consequence.
3) the change in name can mess up down stream programs
What exactly do you mean ??
*** If there are one hundred programs written by 5 different people
that access PROJECT.MEMBER and you ask everyone to use
PROJECT.MEMBER_V instead and to never use the NOBS option, you might
get a strong reaction from them. If someone now has to rewrite the
project documentation to take account of the change, you could run
into more reactions.
*** All of the above depend on the programming environment and how
tightly it is controlled. I am just trying to give you ideas to
consider. You have to decide whether any are appropriate to your
situation.
What Exactly I am doing is that after deriving and mapping variables
coming from Oracle database to SDTM variables then I need to put them
in a particular order the way it is in the Document for that purpose I
am using RETAIN statement at very end of the program and trying to END
the data step. Right now everything is fine but tomorrow if some body
accidently does something in that dataset for any of the variables in
the RETAIN list, then their values would be retained ( I am scared
about the missing values getting the Previous values)
*** Since data from a SAS data set is automatically retained, I do not
see how explicitly retaining those variables can lead to problems. On
the other hand if you use RETAIN on external data then it can depend
on the processing code whether there will be a problem.
So I am looking for something would get me order and still not prone
to any changes when some body accidently does something.
*** The great freedom of SAS data sets is that for the most part code
need not depend on order and that it is very easy to change the order
of variables. Personally, I never want to depend on the order of
variables. I would guarantee the order with an explicit list,
possibly saved as a macro variable, used wherever that order is
essential. However, some might consider this an extreme view which
should be used in moderation.
thanks
SL
--
Ian Whitlock