Date: Fri, 17 May 2002 10:13:42 -0400
Reply-To: Quentin McMullen <QuentinMcMullen@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Quentin McMullen <QuentinMcMullen@WESTAT.COM>
Subject: Re: determining variable type
Content-Type: text/plain; charset="iso-8859-1"
Hi Bruce,
Thanks for the reply. I agree that my suggestion of adding a length
statement could lead to other problems. But I disagree with some of your
other points. : ) I've added some notes below.
> From: Ace [mailto:b.rogers@VIRGIN.NET]
> >Quentin wrote:
> >Of course the fix (and better practice in general), is to add a length
> >statement at the top to define both the type and length of the var1, but I
> >thought this was interesting.
>
> Your conclusion is correct, and it's a bit odd, certainly, but an
> argument type must be decided at compile time, whereas a simple
> comparison is only done at execution time, if you see the difference.
I disagree here. My understanding is that the PDV is created at compile
time, and in order to create the PDV, SAS must know each variable's name,
type, length, etc. So when you code x=1, SAS decides at compile time that x
is numeric because you are assigning it a numeric value. When you code
y="1", SAS decides at compile time that y is character of length 1 because
you are assigning a character value of length 1. So at compile time,
wherever the first reference to a variable appears (regardless of whether
that is an assignment, a comparison, or a SET statement), SAS decides
its name, type, and length.
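A minimal sketch of the point above, using made-up variables x and y (not
from the original program). The type and length are fixed by the first
reference the compiler sees, so a later, longer assignment is truncated:

```sas
data _null_;
  x = 1;      /* first reference is a numeric assignment: x is numeric */
  y = "1";    /* first reference is a character assignment: y is character, length 1 */
  y = "123";  /* length was already fixed at 1, so only "1" is stored */
  put x= y=;  /* writes x=1 y=1 to the log */
run;
```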
> I don't agree, however, that adding a length statement for an existing
> dataset variable is a good solution. Consider if you accidentally gave
> it a smaller length than that on the dataset - every value would then
> be truncated with no warning whatsoever.
I agree this is a danger. Good point. User beware.
> There are a couple of ways around this. One would involve a dummy set
> statement above your loop, to ensure that all the program variables
> are correctly defined. Something like: IF 0 THEN SET A; will suffice.
Yep, this is a nifty way of doing it.
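A sketch of how that might look, assuming data set a has a character
variable var1 and that the loop is a simple do-until (the loop shape here
is my assumption for illustration). The dummy SET never executes, but the
compiler sees it first and takes var1's type and length from a:

```sas
data b;
  if 0 then set a;               /* never executed; defines var1 from a at compile time */
  do until (upcase(var1) = 'A'); /* var1 is already character when this is compiled */
    set a;                       /* the step ends when this reads past the end of a */
  end;
run;
```

Unlike a length statement, this cannot accidentally give var1 a shorter
length than it has on the data set.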
> In general, however, I'd always recommend that manual read-loops are
> handled with a single read before the loop and a do while(not
> end_of_file) type logic. This was how I was taught structured
> programming 20-odd years back and has served me well since.
>
> To implement this in SAS requires a linked block, as two separate SET
> statements would open the input dataset twice, which is not what's
> required here. So something like the following would do the trick:
>
> data b;
> link readit;
> do while (^eof & upcase(var1)^='A');
> link readit;
> end;
> return;
> readit:
> set a end=eof ;
> return;
> run;
When I ran this I got the exact same error. I think that is because, as
before, SAS needs to decide at compile time what type to make var1, and the
first reference to it is upcase(var1)^='A'. Despite evidence to the
contrary (the upcase function, a character string), SAS makes it numeric,
which then conflicts with the character var1 on the SET statement.
11 data b;
12 link readit;
13 do while (^eof & upcase(var1)^='A');
14 link readit;
15 end;
16 return;
17 readit:
18 set a end=eof ;
ERROR: Variable var1 has been defined as both character and numeric.
19 return;
20 run;
NOTE: Numeric values have been converted to character values at the places
      given by: (Line):(Column).
      13:27
> Note that I would also nearly always include the ^eof condition in the
> while clause, even if I think the other condition(s) would be
> satisfied and end the loop. Just good practice, I guess.
I guess if you are using do-while with a SET statement inside, this is a way
to avoid an infinite loop. But with the do-until structure you don't have
to do this, since SAS stops the step once it tries to read past the end of
the file. In some sense, this is more "natural" SAS processing (e.g. you
avoid "DATA STEP stopped due to looping" notes, etc.).
Kind Regards,
--Quentin