LISTSERV at the University of Georgia
Date:         Fri, 8 Mar 1996 14:20:08 -0600
Reply-To:     txplltw@UABCVSR.CVSR.UAB.EDU
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         Todd Weiss <txplltw@UABCVSR.CVSR.UAB.EDU>
Subject:      Re: Code Reduction
In-Reply-To:  <9603081845.AA48793@UABCVSR.cvsr.uab.edu>

Hello to all,

I would like to add a few things about the code that I previously submitted to the list. I am not in the least offended by the replies, but those who replied should know the following:

A) I apologize for supplying a proc print of the dataset instead of the actual data, and for not giving a more thorough explanation of the problem. I am attempting to keep those individuals who have no infection and those having a first fungal, viral, protozoal, or bacterial infection.

B) I was hoping to find a way to make one pass through the data.

C) The code is not my handiwork.

D) I am aware that the macro language can be used to make the code more concise.

E) When I have some more time, I will try to send some data.

Thanks to all who have responded.

Todd

On Fri, 8 Mar 1996, Ian Whitlock wrote:

> Subject: Code Reduction
> Summary: SAS code and later macro suggestions are made based on the
>          code given, rather than an understanding of underlying
>          problem which might point to better code than offered.
> Respondent: Ian Whitlock <whitloi1@westat.com>
>
> Todd Weiss <txplltw@UABCVSR.CVSR.UAB.EDU> writes:
>
> >There maybe giggling and laughing when showing the following but
> >this problem is a little trickier than it looks. I have 4 data
> >sets(i.e. patfung patprot patvirus patbact) containing
> >observations using the code below. Would anyone be willing to share
> >a more parsimonious code solution for obtaining the same observations
> >in these 4 data sets(kind and number) using either data step or sql.
>
> The original code is given without data at the bottom.
>
> There are several obvious ways to improve the code without trying to
> understand the data. The code blocks off into 4 blocks creating the 4
> data sets mentioned.
>
> The same variables are on all these files. I strongly suspect that you
> only want certain relevant variables in each file. Hence there ought
> to be a KEEP= option limiting the variables.
>
> The next suggestion is in efficiency. The second and third steps can be
> combined. The two steps accomplish getting either the first or second
> record from a PATNUM group. I will add a flag WANTED to indicate a
> record is wanted. When the record is chosen WANTED is set to 0 so that
> no more from that block will be taken. This eliminates an extra data
> pass in each block.
>
> I also changed to my style in order to understand what is
> happening. Here is the code for the first block.
>
>
>    DATA phtsf;
>       SET phts ( keep = institut patnum retx infdate infect
>                         int_fup fungus int_fung ) ;
>       by institut patnum retx infdate;
>       if infect = . then infect = 0;
>       if first.patnum then
>       do ;
>          if fungus = . then
>          do ;   /* update first record */
>             int_fung = int_fup;
>             fungus = 0;
>          end ;
>       end ;
>       else
>       if fungus = . then delete;
>    run ;
>
>    data patfung ( keep = institut patnum retx infdate infect
>                          fungus int_fung ) ;
>       retain wanted ;
>       set phtsf;
>       by institut patnum infdate;
>       if first.patnum = 1 then
>       do ;
>          wanted = 1 ;
>          if not last.patnum and fungus = 0 then delete;
>       end ;
>       if wanted ;
>       output ;
>       wanted = 0 ;
>    run ;
>
> In terms of pure SAS any further improvement would have to come from a
> deeper understanding of the problem. After one has sufficient
> understanding of SAS code, it is time to start learning macro. This is
> a good place to begin because the code is repetitious. Essentially the
> same thing is done four times. Let's put it in a macro so that we have
> only one copy of code to be executed 4 times. This won't make it any
> more efficient, but it will highlight the structure of what is being
> done and minimize the amount of code. This type of macro code is very
> simple because it is almost all pure SAS code. Only a few macro
> variable references are required.
>
>    %macro getdat ( out = patfung ,     /* output data set */
>                    var = fungus ,      /* test variable   */
>                    assign = int_fung   /* assign variable */
>                  ) ;
>       DATA temp ;
>          SET phts ( keep = institut patnum retx infdate infect
>                            int_fup &var &assign ) ;
>          by institut patnum retx infdate;
>          if infect = . then infect = 0;
>          if first.patnum then
>          do ;
>             if &var = . then
>             do ;
>                &assign = int_fup;
>                &var = 0;
>             end ;
>          end ;
>          else
>          if &var =. then delete;
>       run ;
>
>       data &out
>            ( keep = institut patnum retx infdate
>                     infect &var &assign )
>            ;
>          retain wanted ;
>          set temp ;
>          by institut patnum infdate;
>          if first.patnum = 1 then
>          do ;
>             wanted = 1 ;
>             if not last.patnum and &var = 0 then delete;
>          end ;
>          if wanted ;
>          output ;
>          wanted = 0 ;
>       run ;
>    %mend getdat ;
>
> Now your code reduces to
>
>    options pagesize=80 linesize=132 notes obs=100;
>    libname buildinf '/mydir';
>    filename maccode '......';
>
>    %inc maccode ;
>
>    proc sort data=buildinf.phts(keep=institut patnum retx infdate
>                                      infect int_fung int_prot
>                                      int_vir int_bact fungus protozoa
>                                      bacteria virus int_fup )
>              out=phts;
>       by institut patnum retx infdate;
>    run ;
>
>    proc print data = phts n;
>       var institut patnum retx infdate infect int_fup fungus int_fung
>           protozoa int_prot virus int_vir bacteria int_bact;
>    run;
>
>    %getdat ( out = patfung , var = fungus , assign = int_fung )
>    %getdat ( out = patprot , var = protozoa, assign = int_prot )
>    %getdat ( out = patvirus , var = virus , assign = int_vir )
>    %getdat ( out = patbact , var = bacteria , assign = int_bact )
>
> I used
>
>    data phts ;
>       input institut patnum retx infdate infect
>             int_fup fungus int_fung ;
>    cards ;
>    1 1 1 1 1 1 1 1
>    1 2 1 1 1 1 . .
>    1 3 1 1 1 1 . .
>    1 3 1 1 1 1 . .
>    run ;
>
>    %getdat ( out = patfung , var = fungus , assign = int_fung )
>
> to test the syntax of the macro. This is far from an exhaustive test.
>
> Ian Whitlock
> -------------------------------------------------------------------
> options pagesize=80 linesize=132 notes obs=100;
>
> libname buildinf '/mydir';
>
> proc sort data=buildinf.phts(keep=institut patnum retx infdate infect int_fung
>                              int_prot
>                              int_vir int_bact fungus protozoa bacteria
>                              virus int_fup ) out=phts;
>    by institut patnum retx infdate;
> proc print n;
>    var institut patnum retx infdate infect int_fup fungus int_fung
>        protozoa int_prot virus int_vir bacteria int_bact;
>
> run;
>
> DATA phtsf; SET phts;
>    by institut patnum retx infdate;
>    if infect = . then infect = 0;
>    if first.patnum ne 1 and fungus =. then delete;
>    if first.patnum = 1 and fungus =. then int_fung = int_fup;
>    if first.patnum = 1 and fungus =. then fungus = 0;
>
> data patient; set phtsf;
>    by institut patnum infdate;
>    if first.patnum = 1 and last.patnum=0 and fungus =0 then delete;
>
> data patfung; set patient;
>    by institut patnum infdate;
>    if first.patnum = 1;
>
>
> DATA phtsp; SET phts;
>    by institut patnum retx infdate;
>    if infect = . then infect = 0;
>    if first.patnum ne 1 and protozoa=. then delete;
>    if first.patnum = 1 and protozoa=. then int_prot = int_fup;
>    if first.patnum = 1 and protozoa=. then protozoa = 0;
>
>
> data patient; set phtsp;
>    by institut patnum infdate;
>    if first.patnum = 1 and last.patnum=0 and protozoa=0 then delete;
>
> data patprot; set patient;
>    by institut patnum infdate;
>    if first.patnum = 1;
>
>
> DATA phtsv; SET phts;
>    by institut patnum retx infdate;
>    if infect = . then infect = 0;
>    if first.patnum ne 1 and virus =. then delete;
>    if first.patnum = 1 and virus =. then int_vir = int_fup;
>    if first.patnum = 1 and virus =. then virus = 0;
>
>
> data patient; set phtsv;
>    by institut patnum infdate;
>    if first.patnum = 1 and last.patnum=0 and virus =0 then delete;
>
> data patvirus; set patient;
>    by institut patnum infdate;
>    if first.patnum = 1;
>
>
>
> DATA phtsb; SET phts;
>    by institut patnum retx infdate;
>    if infect = . then infect = 0;
>    if first.patnum ne 1 and bacteria=. then delete;
>    if first.patnum = 1 and bacteria=. then int_bact = int_fup;
>    if first.patnum = 1 and bacteria=. then bacteria = 0;
>
>
> data patient; set phtsb;
>    by institut patnum infdate;
>    if first.patnum = 1 and last.patnum=0 and bacteria=0 then delete;
>
> data patbact; set patient;
>    by institut patnum infdate;
>    if first.patnum = 1;
> run;

