Date: Fri, 4 Jul 2008 15:08:17 +0000
Reply-To: iw1junk@COMCAST.NET
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <iw1junk@COMCAST.NET>
Subject: Re: 'foreach' in SAS
Summary: A classic programming pattern and why it persists.
Trish wants count on all observations, but it is only on the first
observation in the merge. This is a standard pattern - retain the
value when you have it (I use sv<var> for the name) and assign it when
you don't.
I have shoved some of the lines around and fixed the program.
data v_gb ( drop = svcount ) ;
   retain svcount ;
   merge gb (in=t) t_gb ;
   by hospid ;
   if t ;
   if first.hospid then do ;
      flag = 0 ;
      HospCt = 1 ;
      svcount = count ;
   end ;
   else do ;
      HospCt = 0 ;
      count = svcount ;
   end ;
   flag + 1 ;
   gb_vol = . ;
   if count le 125 then gb_vol = 1 ;
   else if count gt 125 then gb_vol = 2 ;
   else if count = . then gb_vol = . ;
run ;
Note that I removed the double quotes around the dots in the last
executable line. They cause a conversion message, because "." is a
character value being compared with and assigned to numeric variables.
It is a good habit not to allow conversion messages in the log, since
they indicate sloppy programming.
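A related detail, offered as a sketch and not as part of the fix above:
SAS treats a numeric missing value as smaller than any number, so the
test COUNT LE 125 is already true when COUNT is missing and a later
test for COUNT = . is never reached. Testing for missing first avoids
that. The data set DEMO and its three values are made up for
illustration.

```sas
/* Hypothetical data: one missing, one small, one large count */
data demo ;
   input count ;
   /* Check for missing first; missing sorts below every number */
   if missing(count) then gb_vol = . ;
   else if count le 125 then gb_vol = 1 ;
   else gb_vol = 2 ;
   datalines ;
.
100
200
;
run ;

proc print data = demo ;
run ;
```

With this ordering the missing COUNT keeps GB_VOL missing, instead of
falling into the LE 125 group.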
Now why did COUNT get lost? One reason is that it was on the first
observation for each hospital in GB or T_GB but not both, and not on
the subsequent observations, i.e. the mistake is in earlier code that
created one of these data sets.
I think the more likely possibility, and the only other one, is a more
classic reason:
1) GB has multiple records
2) COUNT is missing on all observations of GB
3) T_GB also has COUNT, with a legitimate value
4) T_GB has only one record per hospital
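In this case the GB count is clobbered by the correct T_GB count on
the first record, but not on any subsequent records: T_GB has nothing
more to read for that hospital, while each new GB record brings in its
own missing COUNT. If this is the case the code can be simplified to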
merge gb (in=t drop=count) t_gb ;
by hospid;
in your original code.
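To see the effect in isolation, here is a minimal sketch under the four
assumptions listed above. The data values are invented; only the data
set and variable names come from the thread, and BAD and GOOD are
made-up output names.

```sas
/* Hypothetical GB: several records per hospital, COUNT always missing */
data gb ;
   input hospid count ;
   datalines ;
1 .
1 .
2 .
;
run ;

/* Hypothetical T_GB: one record per hospital with the real COUNT */
data t_gb ;
   input hospid count ;
   datalines ;
1 200
2 50
;
run ;

/* COUNT from T_GB lands only on the first record of each hospital */
data bad ;
   merge gb (in=t) t_gb ;
   by hospid ;
   if t ;
run ;

/* Dropping COUNT from GB lets the T_GB value carry through the group */
data good ;
   merge gb (in=t drop=count) t_gb ;
   by hospid ;
   if t ;
run ;
```

In BAD the second record for hospital 1 has a missing COUNT, because the
second GB record overwrites the value read from T_GB. In GOOD both
records for hospital 1 carry 200, since COUNT now comes only from T_GB
and is held for the whole BY group.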
Since it is a classic problem, although it might not be in this case,
it is good to understand why the problem persists.
1) It started early in SAS history as a badge of knowledge to know
that the data set on the right can clobber data from the left.
2) It has been perpetuated as a standard SAS programming competency
question.
3) It has been perpetuated as a standard SAS programming job
interview question.
Since it is so deeply buried in the culture of SAS programming, I
suspect the only way it will disappear is if the compiler refuses to
merge whenever the order of the data sets is relevant. Failing that,
the truly wise programmer will choose to enforce this rule in his own
programming and take every opportunity in an interview situation to
ask, "When doesn't the variable on the right overrule the value of the
variable on the left?"
Ian Whitlock
==============
Date: Thu, 3 Jul 2008 11:16:31 -0400
Reply-To: Trish Bous <tboussard@GMAIL.COM>
Sender: "SAS(r) Discussion"
From: Trish Bous <tboussard@GMAIL.COM>
Subject: Re: 'foreach' in SAS
sorry, i did cut the code short. I never initialize count, could that
be the problem?
data v_gb; merge gb (in=t) t_gb ; by hospid;
if t;
gb_vol = .; if count le 125 then gb_vol = 1; else if count gt 125 then
gb_vol = 2; else if count = "." then gb_vol = ".";
if first.hospid then do; flag=0; end; flag+1;
*get count of hospitals; if first.hospid then HospCt=1; else HospCt=0;
run;
Thanks for the help!