Date: Fri, 11 Jan 2002 10:08:12 -0800
Reply-To: "William W. Viergever" <wwvierg@ATTGLOBAL.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "William W. Viergever" <wwvierg@ATTGLOBAL.NET>
Subject: Re: MS Excel column widths
In-Reply-To: <F97B12D29E75D211B75700A0C9DD020A02D3DB94@lxkan716nts.lu.fi
d-intl.com>
Content-Type: text/plain; charset="us-ascii"; format=flowed
At 11:14 PM 1/10/2002 +0000, Vyverman, Koen wrote:
>LS,
>
>Michael Kemp (original post included below the .sig) asks an inte-
>resting question. Paraphrasing: when using DDE to read data into SAS
>from an Excel worksheet, how to optimally define the lengths of SAS
>character variables? In other words, is there a way to make Excel
>cough up the maximal length of string-type data in a given column,
>and thereby evading the creation of SAS variables that are either
>too short, or unnecessarily long with respect to the incoming data.
>
>Whether Excel does some internal book-keeping to keep track of such
>things, I know not. But the answer to Michael's question is luckily
>affirmative. As is my wont, I have included below a self-contained
>slab of code demonstrating a possible solution. All this runs fine
>on my WinNT4.0(SP6) box, with SAS8.2 and Office97. Copy to the pro-
>gram editor, submit step-by-step, and observe the results ...
>
>A few words about the technical bits are in order however. The core
>of the solution as presented below relies on the usage of a beast
>known as an Excel Formula Array (don't ask me...). In the present
>case, the needed formula array is as follows:
Awesome, Koen ... simply awesome ....... but those damn arrays .... and
Send.Keys?!?!?!?! <vbg>
As an alternative, perhaps you can just do a VBA macro and then call that?
Note: in the code below I'm spitting out the max length via a MsgBox; one
will have to play with that to be able to pass that parm back to SAS ....
that's left to the masses <g>
Sub CheckColWid1()
Dim MyArr() As Integer
Dim i As Integer, j As Integer
Dim temp As Integer
Dim Last As Integer
Dim cell As Range
'Find the upper bound of the array
Last = Intersect(Columns("a"), ActiveSheet.UsedRange) _
.Cells.Count
i = 1
ReDim MyArr(1 To Last)
'Fill the array with the cell lengths
For Each cell In Intersect(Columns("a"), _
ActiveSheet.UsedRange)
MyArr(i) = Len(cell.Text)
i = i + 1
Next cell
'Sort the array
For i = 1 To Last
For j = i + 1 To Last
If MyArr(i) < MyArr(j) Then
temp = MyArr(j)
MyArr(j) = MyArr(i)
MyArr(i) = temp
End If
Next j
Next i
'The first element of the array will be the
'longest text in column A
MsgBox MyArr(1)
End Sub
Hmmmm ... as I'm writing this, I think there may be an easier way ....
testing ... yup, here it is:
Sub CheckWid2()
Dim cell As Range, Rng As Range
Dim Longest As Integer
Set Rng = Intersect(Columns("a"), ActiveSheet.UsedRange)
Longest = 0
For Each cell In Rng
If Len(cell.Text) > Longest Then
Longest = Len(cell.Text)
End If
Next cell
MsgBox Longest
End Sub
> {=max(len(a1:a1000))}
>
>Simply typing this expression into a cell on the worksheet accom-
>plishes nothing. The proper way to enter the above Formula Array is
>as follows. Type
>
> =max(len(a1:a1000))
>
>into a cell. Do not hit the return or enter-key to confirm your en-
>try, but press CONTROL-SHIFT-RETURN. That's the way it works ...
>
>In order to get SAS/DDE to perform such acrobatics, we must resort
>to the use of the Excel macro function SEND.KEYS, which makes the
>solution less elegant, as this requires the Excel application to
>be in the foreground. On the other hand, it is the only way I know
>of to perform a CONTROL-SHIFT-RETURN, so it'll have to do. A further
>complication is that parentheses cannot be included as such in the
>argument of SEND.KEYS, but need to be masked by pairs of braces.
>Which is the reason why the first SEND.KEYS below looks perhaps
>slightly garbled. The second SEND.KEYS instruction is equivalent to
>pressing CONTROL-SHIFT-RETURN and activates the Formula Array ...
>
> %* Launch an instance of the Excel application, and define a DDE *;
> %* system doublet fileref SAS2XL. *;
> options noxsync noxwait xmin;
> filename sas2xl dde 'excel|system';
> data _null_;
> length fid rc start stop time 8;
> fid=fopen('sas2xl','s');
> if (fid le 0) then do;
> rc=system('start excel');
> start=datetime();
> stop=start+10;
> do while (fid le 0);
> fid=fopen('sas2xl','s');
> time=datetime();
> if (time ge stop) then fid=1;
> end;
> end;
> rc=fclose(fid);
> run;
>
> %* Create a SAS data set STRINGY_DATA with 1000 observations and *;
> %* a single variable STRING which gets populated with a random *;
> %* number (maximally 1500) of uppercase A-Z characters. We will *;
> %* dump this data set into Excel to generate a testing workbook. *;
> data stringy_data(drop=i j k);
> length
> string $ 1500
> i j k 8;
> do i=1 to 1000;
> string='';
> k=1500*ranuni(0);
> do j=1 to k;
> string=trim(left(string))||byte(int(65+26*ranuni(0)));
> end;
> output;
> end;
> run;
>
> %* Before we proceed, we compute the maximum length of STRING in *;
> %* SAS, and write the resulting number to the log so as to be *;
> %* able to compare with the Excel outcome later on. Note that *;
> %* since we used RANUNI 1000 times to pick STRING lengths up to *;
> %* 1500, it should come as no surprise that the resulting length *;
> %* tends to get pretty close to 1500. *;
> proc sql noprint;
> select
> trim(left(put(max(length(string)),4.)))
> into :
> maxlen
> from
> stringy_data
> ;
> quit;
>
> %put ===> SAS says: Max length is &maxlen;
>
> %* Close the default workbook, create a new one containing a re- *;
> %* gular worksheet (by default named SHEET1), toggle the Excel *;
> %* error checking off, save somewhere to fixate the filename. *;
> data _null_;
> file sas2xl;
> put '[file.close(false)]';
> put '[new(1)]';
> put '[error(false)]';
> put '[save.as("c:\temp\X Find Max Col Len")]';
> run;
>
> %* Define a DDE triplet-style fileref pointing to 1000 cells in *;
> %* the first column of SHEET1. *;
> filename recrange dde
> 'excel|[x find max col len.xls]sheet1!r1c1:r1000c1' notab;
>
> %* Dump the contents of STRINGY_DATA there. Note the increased *;
> %* LRECL to accommodate for the long SAS variable. Clear the *;
> %* fileref. *;
> data _null_;
> set stringy_data;
> file recrange lrecl=1500;
> put
> string;
> run;
>
> filename recrange clear;
>
> %* Save the workbook, insert an Excel macro-sheet, named MACRO1 *;
> %* by default. *;
> data _null_;
> file sas2xl;
> put '[save]';
> put '[workbook.insert(3)]';
> run;
>
> %* Define a DDE triplet-style fileref pointing to 10 cells in the *;
> %* first column of MACRO1. *;
> filename xlmacro dde 'excel|macro1!r1c1:r10c1' notab lrecl=200;
>
> %* We write an old-style Excel macro there, and subsequently run *;
> %* it. The macro brings the Excel application to the foreground, *;
> %* an unfortunate necessity for using the SEND.KEYS function, ac- *;
> %* tivates our data sheet SHEET1, and employs the SEND.KEYS func- *;
> %* tion to enter an Excel formula array into the R1C2 cells, *;
> %* which evaluates to the maximum length of the first 1000 cells *;
> %* in the first column. Observe that the formula array requires *;
> %* the use of _named_ cell references A1:A1000. It will not work *;
> %* with row-column style ranges like R1C1:R1000C1. *;
> data _null_;
> file xlmacro;
> put '=app.activate("microsoft excel - x find max col len.xls")';
> put '=workbook.activate("sheet1")';
> put '=select("r1c2")';
> put '=send.keys("=max{(}len{(}a1:a1000{)}{)}")';
> put '=send.keys("^+{return}")';
> put '=halt(true)';
> put '!dde_flush';
> file sas2xl;
> put '[run("macro1!r1c1")]';
> run;
>
> %* Clear the fileref after running the Excel macro. *;
> filename xlmacro clear;
>
> %* Delete the MACRO1 sheet, save the workbook. *;
> data _null_;
> file sas2xl;
> put '[workbook.delete("macro1")]';
> put '[save]';
> run;
>
> %* The maximum length as perceived by Excel now sits in R1C2 of *;
> %* the worksheet SHEET1. A simple reading operation stores it in *;
> %* a SAS macro variable XLMAX. *;
> filename xlmax dde "excel|sheet1!r1c2" lrecl=200;
>
> data _null_;
> length
> xlmax 8;
> infile xlmax;
> input xlmax;
> call symput('xlmax',trim(left(put(xlmax,4.))));
> run;
>
> filename xlmax clear;
>
> %* And lo ... *;
> %put ===> XL says: Max length is &xlmax;
>
>I hope this will help? As a matter of fact, one of the things on my
>to-do-in-case-of-insane-amounts-of-time-on-hands-list is writing a
>SAS macro that, given the physical path of an Excel workbook, would
>generate SAS data sets similar to the VTABLE and VCOLUMN SAS dictio-
>nary tables, holding such information as: names of sheets on the
>workbook, type of sheet, rows and columns used for each sheet, column
>labels (or whatever sits in the first row), maximal value length for
>each column, ... I'm sure this kind of meta-data would be useful
>enough for those who attempt to optimize and automate the reading in
>of Excel data, be it via DDE or by means of proc import.
>
>Kind Regards,
>Koen.
>
>---------------------------------
>Koen Vyverman
>Database Marketing Manager
>Fidelity Investments - Luxembourg
>---------------------------------
>
> > -----Original Message-----
> > From: Kemp, Michael [mailto:Michael.Kemp@GS.COM]
> > Sent: Thursday, January 10, 2002 17:25
> > To: SAS-L@LISTSERV.UGA.EDU
> > Subject: MS Excel column widths
> >
> >
> > I have often wondered how that "Autofit" command (in Excel) works
> > internally. I mean, I can see that it could work, if once issued,
> > the application, runs down the column, finds the value with the
> > greatest length, and then adjusts the column width accordingly.
> > However, it seems to work immediately upon execution, regardless
> > of the number of rows. Obviously, then, no such processing is
> > taking place. If this is true, then there must be some sort of
> > internal record being kept about the length of values whenever a
> > value is added. I believe that a record of the length of the
> > longest value is internally stored, and whenever a new value is
> > added, it compares the length of the new value to the existing
> > value (which represents the length of the PREVIOUSLY longest
> > value). If the length of the new cell value is larger than that
> > already existing value, the existing value is replaced by the new.
> > If not, no action is taken.
> > Then, when the Autofit command is initiated, it simply refers to
> > this internal data table, makes reference to this value and then
> > adjusts the column width accordingly.
> > Assuming, again, that all of this is true, is there a way to
> > access this piece of data. I am loading Excel data into SAS, and
> > it would be very helpful to know the length of the longest value
> > in a given column. This is obviously not a SAS question, but since
> > I think many of you use SAS in the same fashion as this, I thought
> > you might have more insight into this issue.
> > Thanking you in advance.
> >
> > Mike Kemp
> > Goldman Sachs
> > 100 Second Ave. South, Suite 200N
> > St. Petersburg, FL 33701
> > Phone 727-825-3830
> > Fax 727-825-3834
------------------------------------------------------------
William W. Viergever Voice : (916) 483-8398
Viergever & Associates Fax : (916) 486-1488
Sacramento, CA 95825 E-mail : wwvierg@attglobal.net
------------------------------------------------------------
|