Date: Sat, 24 Sep 2011 20:00:47 +0000
Reply-To: "DUELL, BOB" <bd9439@ATT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "DUELL, BOB" <bd9439@ATT.COM>
Subject: Re: When using length statement doesn't change format width
In-Reply-To: <201109241904.p8OAlExS009915@waikiki.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"
Hi Art,
Are you saying that after you run the first data step (to create "test"), "View Columns" in the SAS Explorer shows the variable "name" with a format of $30.? It does not do that for me (running SAS 9.3 M3 on Windows); both the "Format" and "Informat" columns are blank, as I would expect. If that's really happening for you, it sounds like a bug in your SAS Explorer. What version are you using?
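With respect to your second example, putting the FORMAT statement before the SET statement "implies" a variable length of 30: the compiler meets "name" for the first time in the FORMAT statement, so it takes the length from the format width. If you move that statement after the SET statement, you should see the difference: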
data test3;
set sashelp.class;
format name $30.;
run;
Since the variable "Name" already exists, this assigns a permanent format ($30.) to the variable, even though it has a shorter length (character 8). Back in the days when I did "line printer" reporting, I sometimes used longer formats to control spacing. These days I mostly need to use "shorter" formats to easily truncate values to "fit" into other applications.
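To illustrate the "shorter format" case, here is a minimal sketch (the data set name test5 and the $3. width are chosen only for this example). The format controls what is displayed; the stored values and the length of 8 are untouched:

```sas
/* Hypothetical example: display only the first 3 characters of name */
data test5;
  set sashelp.class;
  format name $3.;
run;

/* Printed values are truncated by the format ("Alfred" prints as "Alf") */
proc print data=test5 (obs=3);
  var name;
run;

/* Overriding the format at print time shows the full stored values */
proc print data=test5 (obs=3);
  var name;
  format name $8.;
run;
```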
Note that the Program Data Vector (PDV) is built during data step compilation. As each new variable is found (from either code or from data set references), it is added to the PDV. Statements like FORMAT, LENGTH, and ATTRIB are not executable, but they are processed in the order seen by the compiler. That's why a LENGTH statement on a character variable that's already been defined does not take effect (current releases write a warning to the log and keep the original length), but you can have more than one of the other statements in the code. The last one found will rule:
data test4;
set sashelp.class;
format name $30.;
format name $20.;
run;
The above is valid; the format for "name" is stored as $20. and the length remains unchanged (8).
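One way to see the LENGTH behavior described above is to put the LENGTH statement after the SET statement. This is only a sketch: test6 is a made-up name, and the exact log message may vary by release. SAS complains in the log and leaves the length of "name" at 8:

```sas
/* name is already in the PDV (character, length 8) when LENGTH is compiled */
data test6;
  set sashelp.class;
  length name $30;   /* too late to change the length of name */
run;

/* Confirm the stored attributes: name should still show a length of 8 */
proc contents data=test6;
run;
```

Moving the LENGTH statement above the SET statement, as in Art's first data step quoted below, is what makes the new length take effect.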
Bob
PS - I'm just sitting around watching football on TV; this is a pleasant distraction!
-----Original Message-----
From: Arthur Tabachneck [mailto:art297@ROGERS.COM]
Sent: Saturday, September 24, 2011 12:05 PM
To: SAS-L@LISTSERV.UGA.EDU; DUELL, BOB
Subject: Re: When using length statement doesn't change format width
Bob,
Thanks for the response, and I don't disagree with your recommended
approach, but the behavior is not that definitive. In fact, it highlights a
rather major (in my opinion) error in the system.
If you run the following code and, after each of the datasteps, bring up the
file in SAS Explorer, I think that you will see what I mean.
data test;
length name $30;
set sashelp.class;
run;
proc contents data=test details fmtlen out=out1;
run;
data test;
format name $30.;
set sashelp.class;
run;
proc contents data=test details fmtlen out=out2;
run;
If you open file test after the first datastep you will see that name is
shown to have a length of 30 and a format of $30. Out1, on the other hand,
shows that ONLY the length was set and that no format was assigned.
If you open file test after the second datastep you will see that name again
is shown to have a length of 30 and a format of $30. Out2, similarly, shows
that BOTH the length was set and that a $30. format was assigned.
As a non-programmer I have always relied on the method of bringing up a file
with SAS Explorer, right clicking on a particular variable, and selecting
column attributes. If one does that after having run the first datastep,
they will see that length is shown as 30, and both the format and informat
are shown as being $30.
Either proc contents is showing the wrong information, or the SAS Explorer
is showing the wrong information. I don't know which is correct but, if one
also changes one of the names to a longer value, I'm led to think that
proc contents is the one showing the incorrect values. E.g., if you run:
data test;
length name $30;
set sashelp.class;
if _n_ eq 3 then name=
"FirstName MI VeryLongLastName";
run;
proc contents data=test details fmtlen out=out3;
run;
the same contradictory results are seen from the two methods (i.e., SAS
Explorer and proc contents), but the full name is shown in SAS Explorer and,
if you then run a proc print, the full name is also displayed.
Art
------
On Sat, 24 Sep 2011 16:50:58 +0000, DUELL, BOB <bd9439@ATT.COM> wrote:
>Hi Art,
>
>I don't know of any documentation that describes this behavior exactly as
>you describe, but it is expected. There are four different attributes
>associated with each SAS variable (LENGTH, FORMAT, INFORMAT, and LABEL) and
>each has a corresponding SAS statement that can be used to change the
>attribute. In other words, using the LENGTH statement only changes the
>length, it does not affect the other attributes.
>
>For this reason, I almost always use the ATTRIB statement to define
>"production" SAS variables in each data set. The ATTRIB statement allows
>one to define all the attributes at once. For my very formal jobs (for each
>permanent SAS dataset going to a "production" library), I follow up the
>section of ATTRIB statements with a KEEP statement; although it does require
>a bunch of typing, it serves as very good in-code documentation. And the
>KEEP statement lets me use any other "temporary" variables in the code
>without them showing up in my production data.
>
>I don't believe that using a LENGTH statement has ever simultaneously
>changed the variable format. Note that the FORMAT attribute is optional.
>My guess is that when you changed a variable length in the past and thought
>the format length had also changed, it was because the variable did not have
>any format specification to begin with. I could be wrong historically
>(perhaps it did so in a previous SAS release).
>
>Hope this helps,
>
>Bob
>
>-----Original Message-----
>From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Arthur Tabachneck
>Sent: Saturday, September 24, 2011 7:16 AM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: When using length statement doesn't change format width
>
>Can someone point me to documentation that talks about when using a length
>statement, before a set statement in a data step, doesn't change the width
>of the format statement?
>
>My experience has always been that using a length statement to change the
>width of a character variable results in simultaneously changing the width
>of the format. However, I came across a case yesterday where that didn't
>hold. Using a format statement, though, ended up changing both.
>
>Is that particular behavior documented somewhere?
>
>Thanks in advance,
>Art