LISTSERV at the University of Georgia
Date:         Fri, 4 Jul 2008 15:08:17 +0000
Reply-To:     iw1junk@COMCAST.NET
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Ian Whitlock <iw1junk@COMCAST.NET>
Subject:      Re: 'foreach' in SAS
Comments: cc: Trish Bous <tboussard@GMAIL.COM>

Summary: A classic programming pattern and why it persists. #iw-value=1

Trish wants count on all observations, but it is only on the first observation in the merge. This is a standard pattern - retain the value when you have it (I use sv<var> for the name) and assign it when you don't.

I have shoved some of the lines around and fixed the program.

data v_gb ( drop = svcount ) ;
   retain svcount ;
   merge gb (in=t) t_gb ;
   by hospid ;

   if t ;

   if first.hospid then
   do ;
      flag = 0 ;
      HospCt = 1 ;
      svcount = count ;
   end ;
   else
   do ;
      HospCt = 0 ;
      count = svcount ;
   end ;

   flag + 1 ;

   gb_vol = . ;
   if count le 125 then gb_vol = 1 ;
   else if count gt 125 then gb_vol = 2 ;
   else if count = . then gb_vol = . ;
run ;

Note that I removed the double quotes around the dots in the last executable line. They cause a character-to-numeric conversion message. It is a good habit not to allow conversion messages in the log, since they indicate sloppy programming.
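A related point about that last line: SAS orders a numeric missing value below every number, so a missing COUNT already satisfies count le 125 and the final "else if count = ." branch is never reached. The following is only a minimal sketch, assuming the intent is for GB_VOL to stay missing when COUNT is missing; the test values in the DO list are made up for illustration.

```sas
/* Sketch only: test for missing first, then split on 125.        */
/* The values in the DO list are illustrative, not from the data. */
data _null_ ;
   do count = ., 100, 125, 126 ;
      if missing(count) then gb_vol = . ;
      else if count le 125 then gb_vol = 1 ;
      else gb_vol = 2 ;
      put count= gb_vol= ;   /* write each pair to the log */
   end ;
run ;
```

With the missing test first, the remaining branches only ever see real counts.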

Now why did COUNT get lost? One reason is that it was on the first observation for each hospital in GB or T_GB but not both, and not on the subsequent observations, i.e. the mistake is in earlier code that created one of these data sets.

I think the more likely, and only other, possibility is the more classic reason:

1) GB has multiple records
2) COUNT is missing on all observations of GB
3) T_GB also has COUNT with legitimate count
4) T_GB has only one record per hospital

In this case the GB count is clobbered by the correct T_GB count on the first record and not on any subsequent records. If this is the case the code can be simplified to

merge gb (in=t drop=count) t_gb ; by hospid;

in your original code.
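To see the effect in isolation, here is a small sketch under the four conditions listed above. The data values are invented for illustration; only the data set and variable names (GB, T_GB, HOSPID, COUNT) come from the thread.

```sas
/* Hypothetical toy data: GB has several records per hospital and  */
/* COUNT is always missing; T_GB has one record per hospital.      */
data gb ;
   input hospid count ;
   datalines ;
1 .
1 .
2 .
2 .
;

data t_gb ;
   input hospid count ;
   datalines ;
1 100
2 200
;

/* COUNT in both inputs: GB's missing value is re-read on every    */
/* record, so T_GB's value survives only on the first record.      */
data clobbered ;
   merge gb (in=t) t_gb ;
   by hospid ;
   if t ;
run ;

/* COUNT dropped from GB: the T_GB value is retained through the   */
/* whole BY group.                                                  */
data fixed ;
   merge gb (in=t drop=count) t_gb ;
   by hospid ;
   if t ;
run ;

proc print data=clobbered ; run ;
proc print data=fixed ; run ;
```

Comparing the two listings shows COUNT filled only on the first record per hospital in CLOBBERED and on every record in FIXED.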

Since it is a classic problem, although it might not be in this case, it is good to understand why the problem persists.

1) It started early in SAS history as a badge of knowledge to know that the data set on the right can clobber data from the left.
2) It has been perpetuated as a standard SAS programming competency question.
3) It has been perpetuated as a standard SAS programming job interview question.

Since it is so deeply buried in the culture of SAS programming, I suspect the only way it will disappear is if the compiler refuses to merge whenever the order of the data sets is relevant. Failing that, the truly wise programmer will choose to enforce this rule in his own programming and take every opportunity in an interview situation to ask, "When doesn't the variable on the right overrule the value of the variable on the left?"
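One way to enforce the rule yourself is to check for shared non-BY variables before merging. This is only a sketch, assuming both data sets live in WORK and HOSPID is the only BY variable. The MSGLEVEL=I option should also write an INFO line to the log when a merge overwrites a variable, but confirm that against the documentation for your release.

```sas
/* Ask the log to report variables overwritten during a MERGE. */
options msglevel=i ;

/* List variables common to GB and T_GB other than the BY variable. */
/* An empty result means the order of the data sets is irrelevant.  */
proc sql ;
   select upcase(name) as variable
   from dictionary.columns
   where libname = 'WORK'
     and memname in ('GB', 'T_GB')
     and upcase(name) ne 'HOSPID'
   group by upcase(name)
   having count(*) > 1 ;
quit ;
```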

Ian Whitlock
==============

Date: Thu, 3 Jul 2008 11:16:31 -0400
Reply-To: Trish Bous <tboussard@GMAIL.COM>
Sender: "SAS(r) Discussion"
From: Trish Bous <tboussard@GMAIL.COM>
Subject: Re: 'foreach' in SAS

sorry, i did cut the code short. I never initialize count, could that be the problem?

data v_gb;
   merge gb (in=t) t_gb ;
   by hospid;

   if t;

   gb_vol = .;
   if count le 125 then gb_vol = 1;
   else if count gt 125 then gb_vol = 2;
   else if count = "." then gb_vol = ".";

   if first.hospid then do;
      flag=0;
   end;
   flag+1;

   *get count of hospitals;
   if first.hospid then HospCt=1;
   else HospCt=0;
run;

Thanks for the help!

