Date: Wed, 9 May 2007 21:09:54 -0400
Reply-To: Lou <lpogodajr292185@COMCAST.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Lou <lpogodajr292185@COMCAST.NET>
Subject: Re: Finding all the values of an index variable
"Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU> wrote in message
news:0F48071CAE88E940892B3883297EE84705CD0286@RITTENHOUSE.wharton.upenn.edu...
> I have a SAS dataset indexed on, say, variable IX. I need a list of
> all the values IX takes in the dataset, but I don't need the
> frequencies of those values. The dataset is so large that reading the
> entire dataset takes too long - that's why we have the index to begin
> with.
>
> So does anyone know of a way I can read the corresponding SAS7BNDX file
> to get a list of values of IX? I.e., is there any tool available for
> parsing the root/branch/leaf structure of a sas7bndx file?
>
I know of no tool provided by SAS that will parse an index file, but why not
try some empirical research? For instance, if I run the following code on
my (wintel, SAS version 8) machine:
data test;
  constant = 'this is a constant';
  /* ten rows: each of the five key values appears twice */
  key = 'aaaaa'; output;
  key = 'bbbbb'; output;
  key = 'ccccc'; output;
  key = 'ddddd'; output;
  key = 'eeeee'; output;
  key = 'aaaaa'; output;
  key = 'bbbbb'; output;
  key = 'ccccc'; output;
  key = 'ddddd'; output;
  key = 'eeeee'; output;
  stop;
run;

proc sql;
  /* CREATE INDEX needs the column list; a simple index shares its column's name */
  create index key on test(key);
quit;

proc fslist file = 'C:\Documents and Settings\0\Local Settings\Temp\SAS Temporary Files\_TD684\test.sas7bndx';
quit;
an index file is created and opened in browse mode in FSLIST. If I scroll
to the end of the file, I can see the key values. For whatever reason, in
this example anyway, the values are in reverse order - 'eeeee' displays
first and 'aaaaa' displays last. (In FSLIST, you can turn COLS, NUMS, and
HEX on to help you navigate through the file and see precisely what's
there.) If you don't have FSLIST available, open the index file in your
text editor of choice (a hex editor would probably be best). From there, I
can copy and paste the values into the application of my choice for further
cleanup and manipulation. If you display in hex, it should be possible to
figure out how to use SAS to read the index file to gather the necessary
list of values, though frankly I think the job could be done more easily and
quickly in a versatile text editor like KEdit.
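
As a starting point for that kind of inspection from within SAS, here is a minimal sketch of a hex dump written as a DATA step. The fileref NDX, the placeholder path, and the 16-byte row width are assumptions made for illustration, not anything the index format dictates; point the FILENAME statement at your own .sas7bndx file. It assumes an ASCII host (such as Windows) and writes each row's offset, its bytes in hex, and the same bytes as text with anything outside printable ASCII shown as a dot.

```sas
/* Hypothetical path: substitute the location of your own index file. */
/* RECFM=F with LRECL=16 reads the file as raw 16-byte records.       */
filename ndx 'C:\path\to\test.sas7bndx' recfm=f lrecl=16;

data _null_;
  infile ndx;
  input chunk $char16.;          /* 16 raw bytes, blanks preserved */
  offset = (_n_ - 1) * 16;       /* byte offset of this row        */
  text = chunk;
  do i = 1 to 16;
    code = rank(substr(chunk, i, 1));
    /* mask bytes outside printable ASCII so the log stays readable */
    if code < 32 or code > 126 then substr(text, i, 1) = '.';
  end;
  put offset hex8. +2 chunk $hex32. +2 text $char16.;
run;
```

Once the offsets and spacing of the key values are known from the log, the same INFILE and INPUT statements could be adapted to pull them into a dataset, though that layout would have to be worked out by inspection.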
Maddeningly, the index file structure seems to vary over repeated runs of
the same code, so I'd hesitate to say it's possible to write a general
program that could successfully read any index file, but presumably you're
not reconstructing the index minute by minute, so this is a one-off. Also,
if your key values are numeric, it'll be more difficult, perhaps impossible - numeric
values apparently are encoded in some way in bit strings that don't display
as nicely readable numbers, but perhaps they can be translated to something
useable with a judicious application of skull sweat (after all, SAS manages
to use them).
If this doesn't work in your situation, I think you have no choice but to
set up a program that will read the entire file and save unique key values,
kick it off, and go do something else while it chugs away. Either that, or
forget it - I don't know of any way to get the contents of a dataset without
reading the dataset.
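
The full-scan fallback can be as simple as the following sketch. It assumes the dataset is TEST with key variable KEY, as in the example above; UNIQUE_KEYS is a made-up output name. It still reads every row; it just does so unattended.

```sas
/* Sketch only: reads the whole dataset once and keeps one row per key value. */
/* TEST and KEY come from the example above; UNIQUE_KEYS is a made-up name.   */
proc sql;
  create table unique_keys as
  select distinct key
  from test;
quit;
```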