Date: Sun, 29 Apr 2001 19:20:21 -0700
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Karsten M. Self" <kmself@IX.NETCOM.COM>
Subject: Re: problems with delimiter on ascii file
In-Reply-To: <R4%G6.49$CW6.firstname.lastname@example.org>; from young_sh@PACBELL.NET
on Sun, Apr 29, 2001 at 03:43:39PM -0700
on Sun, Apr 29, 2001 at 03:43:39PM -0700, news (young_sh@PACBELL.NET) wrote:
> I am reading in a delimited file with delimiter='|||' and the first 27
> records read in fine and then it appears the file has an odd number of
> delimiters in between two variables for the rest of the file.
> For example:
> joe smith|||123 street|||chicago|||il|||11111
> bob smith|||235 street|||||||11111
> Is there a trick in the input statement to read the variable in spite of
> missing information and an odd number of delimiters?
Sy: whitespace is good. Add a linefeed between your paragraphs.
The traditional SAS approach is to use the DSD and DLM= INFILE options.
However, in your case the delimiter is a sequence of three "|"
characters rather than a single character.
If there is never a valid occurrence of "|" in the data, you could treat
the file as delimited by a single "|" and retain only one field in three.
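As a rough sketch of the one-field-in-three idea outside SAS, assuming awk is available: the helper name keep_every_third and the sample record are made up for illustration, and this is not a SAS solution.

```shell
# Hypothetical sample record in the poster's format.
line='joe smith|||123 street|||chicago|||il|||11111'

# Split on a single "|" and keep fields 1, 4, 7, ...; the two fields
# in between are the empty ones created by the tripled pipes.
# keep_every_third is an illustrative name, not an existing tool.
keep_every_third() {
  awk -F'|' '{
    out = $1
    for (i = 4; i <= NF; i += 3) out = out "\t" $i
    print out
  }'
}

printf '%s\n' "$line" | keep_every_third
```

This prints the five values separated by tabs. It only lines up when every run of pipes is a multiple of three; a record with a run of seven pipes, like the second example quoted above, would lose its last field.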
If there are occurrences of "|" in the data, you might want to experiment
with SAS's NIH-inspired regular expression syntax. I've never bothered
learning it, but it could likely be applied to this problem.
Another option is to preprocess the data file and replace each occurrence
of "|||" with a single instance of another delimiter, e.g., a tab.
A sed script which would accomplish this is (the '^I' represents a tab
character):
sed -e '/|||/s//^I/g' < infile > outfile
sed is a stream editor originally for the Unix environment that has been
ported to most commonly used operating systems.
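To try the substitution on a sample line without typing a literal tab, something like the following should work in a POSIX shell. The function name pipes_to_tabs and the sample record are illustrative assumptions; infile and outfile are placeholders.

```shell
# Build a literal tab once; typing '^I' as two characters would not work.
tab=$(printf '\t')

# Replace every "|||" with a tab. In sed's basic regular expressions
# "|" is an ordinary character, so no escaping is needed.
# pipes_to_tabs is an illustrative name, not an existing tool.
pipes_to_tabs() {
  sed -e "s/|||/${tab}/g"
}

# Usage on a real file: pipes_to_tabs < infile > outfile
printf '%s\n' 'joe smith|||123 street|||chicago|||il|||11111' | pipes_to_tabs
```

The result can then be read in SAS as an ordinary tab-delimited file. Note that a run of pipes that is not a multiple of three leaves stray "|" characters behind.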
Karsten M. Self <email@example.com> http://kmself.home.netcom.com/
What part of "Gestalt" don't you understand? There is no K5 cabal