Date: Wed, 18 Apr 2007 08:29:41 -0700
Reply-To: aleph <donald.owen@CHOICEPOINT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: aleph <donald.owen@CHOICEPOINT.COM>
Organization: http://groups.google.com
Subject: Reading delimited data on multiple lines
I am trying to read a delimited file in which some (but not all)
observations span multiple lines; the number of lines per observation
is unpredictable. [SAS 9.1.3, UNIX]
The following example code produces the desired output:
DATA OK;
INFILE DATALINES DLM="#" DSD;
INPUT W X A :$13. Y Z;
DATALINES;
1#2#HELLO KIDDO##8
3#4#GOODBYE KIDDO#9#7
5#6#MAMA WINS###
;
RUN;
CORRECT OUTPUT
Obs    W    X    A                Y    Z
  1    1    2    HELLO KIDDO      .    8
  2    3    4    GOODBYE KIDDO    9    7
  3    5    6    MAMA WINS        .    .
This code reads sample data that closely resembles the actual data I'm
working with:
DATA NOTOK;
INFILE DATALINES DLM="#" DSD;
INPUT W X A :$13. Y Z;
DATALINES;
1#2#HELLO KIDDO##8
3#4#GOODBYE
KIDDO#9#7
5#6#MAMA WINS###
;
RUN;
INCORRECT OUTPUT
Obs    W    X    A                Y    Z
  1    1    2    HELLO KIDDO      .    8
  2    3    4    GOODBYE          .    9
  3    5    6    MAMA WINS        .    .
Note that obs 2 is wrong in three places: A is truncated to GOODBYE,
Y is missing instead of 9, and Z is 9 instead of 7. All other
observations are correct.
The actual raw data set has 37 variables and 32 observations. In this
case I can fix the problem by manually deleting line feeds until each
observation is on a single line, but I need code that will handle
potentially much larger files.
I have tried a number of tricks, but nothing comes close to resolving
the problem.
I would be most grateful for any insight.
Thanks,
Don
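
A minimal sketch of one possible direction, not a tested solution: a
first pass glues physical lines together until a record contains the
expected number of delimiters, and a second pass reads the repaired
file with the original INPUT statement. The path 'raw.txt', the fileref
FIXED, the macro variable NFIELDS, and the data set name REPAIRED are
hypothetical names chosen for illustration. It assumes that "#" never
occurs inside a data value, that a broken line should be rejoined with
a single blank, and that a break never falls after the last delimiter
of a record (such a record would be written out one line too early).

```sas
/* Hypothetical names: 'raw.txt', FIXED, NFIELDS, REPAIRED. */
%let nfields = 5;                 /* fields per observation: W X A Y Z */

filename fixed temp;              /* scratch file for the repaired records */

/* Pass 1: accumulate physical lines until the buffer holds at least
   NFIELDS-1 delimiters, then write the buffer as one logical record. */
data _null_;
  infile 'raw.txt' lrecl=32767 truncover end=eof;
  file fixed lrecl=32767;
  length buf $32767;
  retain buf ' ';
  input;                                   /* load the next physical line */
  buf = catx(' ', buf, _infile_);          /* rejoin with a single blank (assumption) */
  if countc(buf, '#') >= &nfields - 1 or eof then do;
    if buf ne ' ' then do;
      len = length(buf);
      put buf $varying32767. len;          /* write without trailing blanks */
    end;
    buf = ' ';                             /* start the next logical record */
  end;
run;

/* Pass 2: read the repaired file exactly as in the working example. */
data repaired;
  infile fixed dlm='#' dsd truncover lrecl=32767;
  input W X A :$13. Y Z;
run;
```

For the 37-variable file, NFIELDS would be set to 37 and the INPUT
statement replaced with the real variable list. CATX and COUNTC were
introduced in SAS 9, so they should be available in 9.1.3.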