Date: Fri, 7 May 2004 17:48:10 -0400
Reply-To: Quentin McMullen <quentin_mcmullen@BROWN.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Quentin McMullen <quentin_mcmullen@BROWN.EDU>
Subject: Re: proc compare gotcha?
I guess since variable names are metadata, yes, in this example she was
reading metadata. But she was not "intentionally" reading metadata.
By "intentionally" I mean she wasn't using dictionary tables, or the
varname() function somewhere.
In this case, she had used proc ANOVA (I think), creating an output
dataset. And the output dataset included a variable (maybe EFFECT?), and
the value of the variable EFFECT was the name of each predictor in the
model. So for record 1, EFFECT ="Age", and for record 2 EFFECT ="Sex".
[I actually didn't look at her code, and haven't used ANOVA in years, so
it probably doesn't spit out a variable named EFFECT, but you get the
idea.]

I guess she then read in the output dataset and had some code like:

   if EFFECT ="Age" then....

And when the dataset was rebuilt to replicate the analysis, someone had
changed the variable name from Age to AGE, so her test for effect="Age"
failed.
Of course any procs which turn variable names into data (e.g. transpose)
could cause similar problems. And in this case defensive use of upcase()
could have avoided the problem.
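
A minimal sketch of the defensive upcase() idea, assuming a made-up dataset
EFFECTS with a character variable EFFECT (both names are illustrative and not
taken from her code):

```sas
/* Hypothetical stand-in for a procedure's output dataset,    */
/* where each row holds the name of a model predictor.        */
data effects;
   length effect $32;
   effect = "AGE"; output;   /* name as stored after the rebuild */
   effect = "Sex"; output;
run;

data flagged;
   set effects;
   /* Case-sensitive test: does not match "AGE". */
   is_age_exact  = (effect = "Age");
   /* Case-insensitive test: normalize the value before comparing. */
   is_age_upcase = (upcase(effect) = "AGE");
run;
```

For the "AGE" row, is_age_exact is 0 and is_age_upcase is 1, which is the
difference that bit her.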
But it would have saved some time for her if PROC COMPARE had found the
difference. When I first heard the problem (proc compare 'confirmed' same
data in, she was running the same program, same version of SAS) my mind
went to thinking about hotfixes installed, OS patches, etc. Luckily she
solved it without asking me first. : )
And I appreciate Howard's point, it would be nice for COMPARE to reveal
differences in variable position as well. In fact, given the (too
frequent) use of x--y, I would put that as higher priority than
differences in capitalization of var names.
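
To illustrate why position matters for x--y, here is a small sketch with two
made-up datasets that hold the same variables in a different order. The
dataset and variable names are assumptions for illustration only:

```sas
/* Same three variables and values, but a different variable order. */
data a; x = 1; m = 2; y = 3; run;
data b; x = 1; y = 3; m = 2; run;

data _null_;
   set a;
   total = sum(of x--y);   /* x--y spans x, m, y here */
   put total=;             /* total=6 */
run;

data _null_;
   set b;
   total = sum(of x--y);   /* x--y spans only x, y here */
   put total=;             /* total=4 */
run;
```

A comparison that matches variables by name sees the same values in both
datasets, yet the same x--y code gives different results.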
On Fri, 7 May 2004 17:22:49 -0400, Venky Chakravarthy wrote:
>You always have the most interesting cases.
>Perhaps, I have the Friday blues and I concede that dependencies on
>metadata may be a situation where a piece of code might fail because of
>variable name case differences. Otherwise, regular code should work fine,
>right? Can you give some specifics about how/why the code failed, if it is
>not metadata related?
>On Fri, 7 May 2004 14:59:03 -0400, Quentin McMullen wrote:
>>A colleague just got bit by an odd problem. She was rerunning some old
>>code and it didn't work on the first pass (gee, that never happens to
>>me : ) She mentioned she had done a proc compare on the old and new
>>datasets, and it found no differences, and the code hadn't changed. I
>>thought maybe a hotfix had been installed, or ... When she debugged, it
>>turned out one of the variable names had been changed from upper case to
>>Since proc compare does give a nice note when it finds two variables with
>>the same name that have different attributes (length, format, etc.), I
>>think it would be an improvement to include in that output two variables
>>with the same name that have different capitalizations.
>>So I want the below code to produce something like:
>>   Variable  Dataset  Type
>>   a         WORK.A   Num
>>   A         WORK.B   Num
>>   I         WORK.A   Num
>>   i         WORK.B   Num
>>Yes, I know this could have been avoided, e.g. with options
>>validvarname=upcase (or whatever), but since case did become an attribute
>>of variable names in v7, it would be nice for proc compare to compare that
>>attribute, along with length, format, etc.
>>Does that seem like a reasonable expectation of proc compare? (It's one
>>of my favorite procs...)
>>proc compare base=a compare=b;
>>*finds differences in var attributes;
>> length a 4;
>> format i mmddyy10.;
>>proc compare base=a compare=b;
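
Until COMPARE reports this itself, one possible workaround is to query
DICTIONARY.COLUMNS directly. The sketch below assumes two small test
datasets, WORK.A and WORK.B, built to match the desired report in the quoted
message. The test data and the column aliases are illustrative assumptions:

```sas
/* Assumed test data: same variable names, different capitalization. */
data a; a = 1; I = 1; run;
data b; A = 1; i = 1; run;

proc sql;
   /* Pair up variables whose names match when case is ignored, and   */
   /* keep the pairs that differ in capitalization or in position.    */
   select x.name   as base_name,
          y.name   as comp_name,
          x.varnum as base_pos,
          y.varnum as comp_pos
   from dictionary.columns as x,
        dictionary.columns as y
   where x.libname = 'WORK' and x.memname = 'A'
     and y.libname = 'WORK' and y.memname = 'B'
     and upcase(x.name) = upcase(y.name)
     and (x.name ne y.name or x.varnum ne y.varnum);
quit;
```

LIBNAME and MEMNAME are stored in upper case in the dictionary tables, so the
literals in the WHERE clause need to be upper case as well.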