Thank you both for your responses. While datatype="D" does occur in some of
the header records, the PNODE_TYPE var doesn't. So the solution with only
one infile statement works perfectly. I really appreciate your help - I
knew this had to be possible but couldn't get it to work despite much
fiddling.
Sarah
-----Original Message-----
From: Ian Whitlock [mailto:WHITLOI1@WESTAT.com]
Sent: Friday, September 12, 2003 4:09 PM
To: 'Crawford'
Cc: 'swhittier@ISO-NE.COM'
Subject: RE: reading from hierarchical files without datatype identifier
Peter,
Whether two INFILEs are needed really depends on the data. I suspect
that DATATYPE="D" does not occur on the first 6 records. If that is true,
the step can be simplified and look more "standard".
data new_lmps_0( label="read from &file" );
  infile solve1 ;
  /* first iteration only: skip to record 4, then on to record 6 */
  if _n_ = 1 then
    input /// @37 udscaseid $19.
          //  @38 lmpcaseid $16. ;
  retain udscaseid lmpcaseid ;
  /* hold each later record with a trailing @ and test its type */
  input @1 datatype $1. @72 PNODE_TYPE $1. @;
  if datatype="D" and PNODE_TYPE in ("H" "I" "Z") ;
  input @29  date       date11.
        @41  time       time5.
        @48  pnode_name $15.
        @80  lmp        9.2
        @118 loss       8.2
        @127 cong       8.2 ;
  format date date. time time.;
run;
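A minimal, self-contained sketch of the line-pointer logic in the step above, assuming a made-up nine-record file. The fileref DEMO and the variables id4, id6, rectype and value are hypothetical and not taken from the real data. Each / in an INPUT statement advances to the next record, so /// lands on record 4 and the following // on record 6. Every later record is held with a trailing @, tested, and read in full only when it is a D record.

```sas
/* Hypothetical test file: six header records, then three data records. */
filename demo temp;

data _null_;
  file demo;
  put "header 1";
  put "header 2";
  put "header 3";
  put "case A";
  put "header 5";
  put "case B";
  put "D 10";
  put "X 20";
  put "D 30";
run;

data demo_out;
  infile demo truncover;
  /* First iteration only: /// moves to record 4, // moves on to record 6. */
  if _n_ = 1 then
    input /// @1 id4 $6.
          //  @1 id6 $6. ;
  retain id4 id6;
  /* Hold the current record so it can be tested before reading further. */
  input @1 rectype $1. @;
  /* Subsetting IF: anything that is not a D record is dropped here. */
  if rectype = "D";
  input @3 value 2.;
run;
```

Because the held record is released automatically at the end of each iteration, the rejected records need no extra handling in this form.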
When using the two INFILE buffers it is important to use FILENAME statements
and filerefs so the buffers can have different names. You did this, but it
is important to know that you have to do it that way.
infile "..." ;
is not good enough when you want two data streams.
IanWhitlock@westat.com
-----Original Message-----
From: Crawford [mailto:PeterDOTCrawfordATblueyonder.co.uk@Peter.BITNET]
Sent: Friday, September 12, 2003 3:35 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: reading from hierarchical files without datatype identifier
Hi Sarah
To avoid a step joining two datasets, you could just do the
two infiles in one step... like
%let file = c:\lmp\data\finalize\MSS_7020030728205502_0X.0_28-JUL-2003_16_55.lmpsolved ;
filename solve1 "&file" lrecl=3000;
filename solve2 "&file" lrecl=3000;
data new_lmps_0( label="read from &file" );
  infile solve1 firstobs=4 ;
  input    @37 udscaseid $19.
        // @38 lmpcaseid $16. ;
  do until( eof );
    infile solve2 firstobs=1 end=eof ;
    input @1 datatype $1. @72 PNODE_TYPE $1. @;
    if datatype="D" and PNODE_TYPE in ("H" "I" "Z")
    then do;
      input @29  date       date11.
            @41  time       time5.
            @48  pnode_name $15.
            @80  lmp        9.2
            @118 loss       8.2
            @127 cong       8.2 ;
      format date date. time time.;
      output;
    end /* handling datatype D */ ;
    else input ; /* release the record held by the trailing @ */
  end /* handling a record from SOLVE2 */ ;
  stop; /* end of file on SOLVE2 */
run;
Comment:
Instead of 2 data steps to read the data, I used 2 infile
statements. This provides 2 streams of data in the data step.
Rather than have to bypass handling the SOLVE1 file on all
data step iterations after the first, I put a
do until( eof );
loop around the reading of data from SOLVE2.
It doesn't add a lot: just the "end=eof" on the infile,
an output statement, and a stop statement to avoid having
the data step iterate a second time.
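A small sketch of the same two-stream pattern on a made-up file, for anyone who wants to try it without the real data. The macro variable DEMO, the file name two_streams_demo.txt, the filerefs HDR and BODY, and the variables caseid, rectype and value are all hypothetical. One detail worth noting: inside an explicit loop the whole file is read within a single data step iteration, so a record held by a trailing @ is not released automatically, and a null INPUT is needed for the records that are skipped.

```sas
/* Hypothetical demo file written to the WORK directory. */
%let demo = %sysfunc(pathname(work))/two_streams_demo.txt;

data _null_;
  file "&demo";
  put "header 1" / "header 2" / "header 3" / "case A"
      / "D 10" / "X 20" / "D 30";
run;

/* Two filerefs for one physical file, so each INFILE has its own buffer. */
filename hdr  "&demo";
filename body "&demo";

data two_streams;
  /* Stream 1: pick up the header value from record 4. */
  infile hdr firstobs=4 truncover;
  input @1 caseid $6.;

  /* Stream 2: read the whole file from the top in an explicit loop. */
  do until (eof);
    infile body end=eof truncover;
    input @1 rectype $1. @;
    if rectype = "D" then do;
      input @3 value 2.;
      output;        /* explicit OUTPUT for each record that is kept */
    end;
    else input;      /* null INPUT releases the held record */
  end;
  stop;              /* prevent a second iteration of the data step */
run;
```

The header value needs no RETAIN here, since everything happens in one iteration and caseid is never reset between OUTPUT statements.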
welcome to the wonderful data step
Good Luck
Peter Crawford
Crawford Software Consultancy Limited
UK
>
"Sarah Whittier" <swhittier@ISO-NE.COM> wrote in message
news:200309121818.h8CII6t09161@listserv.cc.uga.edu...
> I have a problem related to reading data from flat files into SAS. I
> am using Windows, and the files are text files. The files are
> essentially hierarchical files, however, the rows with the header data
> that I want are not easily identified (i.e., they don't have a
> consistent datatype identifier).
>
> I want to read certain key values from the 4th and 6th rows of the
> file and have these values on each row of the data values in the
> output SAS dataset. I can achieve this by using two data steps with
> input statements, and then a third to set them together. I would
> prefer to do this in just one data step, which I hope will make it
> easier to set up a program to read in the 30 or so files that I have.
> Does anyone have a suggestion for modifying the program below to use
> just one data step? I hope this is sufficiently clear without
> including data.
>
>
> filename solved
> "c:\lmp\data\finalize\MSS_7020030728205502_0X.0_28-JUL-2003_16_55.lmpsolved";
>
> data new_lmps_id;
> infile solved firstobs=4 obs=6;
>
> input @37 udscaseid $19.
> // @38 lmpcaseid $16. ;
> run;
>
> data new_lmps;
> infile solved;
>
> input @1 datatype $1. @72 PNODE_TYPE $1. @;
>
> if datatype="D" and PNODE_TYPE in ("H" "I" "Z") then do;
> input @29 date date11.
> @41 time time5.
> @48 pnode_name $15.
> @80 lmp 9.2
> @118 loss 8.2
> @127 cong 8.2 ;
> format date date. time time.;
> output;
> end;
> run;
>
> data new_lmps_2;
> set new_lmps;
> if _n_ = 1 then set new_lmps_id;
> run;
>
>
> Thank you,
>
> Sarah Whittier
> ISO-NE