Date: Wed, 21 Jun 2006 12:31:12 -0500
Reply-To: Rob Rohrbough <Rob@ROHRBOUGH-SYSTEMS.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Rob Rohrbough <Rob@ROHRBOUGH-SYSTEMS.COM>
Subject: Re: Selective Unquoting
Content-Type: text/plain; charset="us-ascii"
To: Mike Rhoads, Toby Dunn, Gerhard Hellriegel
Thank you all for replying to my post. It was late last night, and I was
frustrated. I went with Mike's solution since I was not interested in
eliminating the "undoubled" embedded quotes altogether. Both Toby's and
Gerhard's solutions involved the use of the COMPRESS function, which removes
all occurrences of the selected character.
I wanted only to remove the "quoting" occurrences - only the first and last.
While not needed for this solution, I am inclined to request an enhancement
to the COMPBL function (perhaps renaming it) to eliminate extra consecutive
occurrences of a specified character, not just blanks. There have been
times in the past when I have wished such a function were available.
Thanks again for your posts. The final solution appears below.
Omaha, NE, USA
infile datalines dsd truncover;
length Code $5;
length Short $40;
format AddDateQ mmddyy10.;
input Code : Short : AddDateQ : mmddyy.;
if substr(Short, 1, 1) = '"' and substr(Short, length(Short), 1) = '"'
Short = substr(Short, 2, length(Short)-2);
"DUMM1",""LEADING AND" EMBEDDED QUOTES",04/01/2002
"DUMM4","EMBEDDED" QUOTE ONLY",04/01/2002
> -----Original Message-----
> From: Mike Rhoads [mailto:RHOADSM1@WESTAT.com]
> Sent: Wednesday, June 21, 2006 8:09 AM
> To: Rob Rohrbough; SAS-L@listserv.uga.edu
> Subject: RE: Selective Unquoting
> Since your data are "irregular", I would just remove the quotes
> manually, rather than trying to use DEQUOTE, since I assume the latter
> is designed for "properly" quoted strings.
> Also, I wasn't sure why you subtracted one from the Length in your
> How about:
> if substr(Short, 1, 1) = '"' and substr(Short, length(Short), 1) = '"'
> Short = Substr(Short,2,length(Short)-2);
> This, of course, will not reduce any paired quotes that have been
> properly embedded, if you have any of those in your data.
> Mike Rhoads
> -----Original Message-----
> From: firstname.lastname@example.org
> On Behalf Of Rob Rohrbough
> Sent: Tuesday, June 20, 2006 11:02 PM
> To: SAS-L@listserv.uga.edu
> Subject: Selective Unquoting
> I am trying to import a comma-delimited dataset over which I have
> control. The data contain some embedded quotes that have not been
> for proper reading with the DSD option on the INFILE statement. I am
> to read them anyway - of course with some problems. A simple
> test case
> as follows:
> data test;
> infile datalines dsd truncover;
> length Code $5;
> length Short $40;
> format AddDateQ mmddyy10.;
> input Code : Short : AddDateQ : mmddyy.;
> if substr(Short, 1, 1) = '"' and substr(Short, length(Short)-1, 1) =
> Short = dequote(Short);
> "DUMM1",""LEADING AND" EMBEDDED QUOTES",04/01/2002
> "DUMM2","trail quote"",04/01/2002
> "DUMM3","NO QUOTE",04/01/2002
> "DUMM4","EMBEDDED" QUOTE ONLY",04/01/2002
> This example shows that my little after-the-fact unquoting
> code handles
> trailing quote, but not the record containing both in initial
> and middle
> "undoubled" embedded quote nor just the embedded quote.
> I can't understand why the unquoting code doesn't catch the first and
> datalines. If I examine the resulting "TEST" dataset, the quotes all
> there, and my desk tracing seems to indicate that the unquoting should
> Ideas? TIA,
> Rob Rohrbough
> Omaha, NE, USA