LISTSERV at the University of Georgia
Date:   Thu, 26 Nov 2009 17:45:32 -0800
Reply-To:   "data _null_;" <datanull@GMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "data _null_;" <datanull@GMAIL.COM>
Organization:   http://groups.google.com
Subject:   Re: 4.31: How can I split a [character] delimited string except when inside [character]? PERL FAQ
Comments:   To: sas-l@uga.edu
Content-Type:   text/plain; charset=ISO-8859-1

On Nov 26, 7:23 pm, Tom Abernathy <tom.aberna...@gmail.com> wrote:
> SCANQ function works in SAS 9.2 but it is no longer in the
> documentation.
> I did find this reference in the 9.2 enhancements document:
>
> The SCANQ function and the CALL SCANQ routine have been removed from
> the documentation and replaced by the superior functionality of the
> SCAN function and CALL SCAN routine.
>
> I think they want you to use scan(str,n,dlm,'q') instead of
> scanq(str,n,dlm).
>
> On Nov 26, 6:29 pm, "data _null_;" <datan...@gmail.com> wrote:
>
> > On Nov 26, 3:15 pm, xlr82sas <xlr82...@aol.com> wrote:
> >
> > > Oh how they struggle with PERL(except when using packages - just like
> > > R)
> > >
> > > The issue is dealing with commas within quotes. We don't want the
> > > comma in "Cimetrix, Inc"
> > > to be used as a separator.
> > >
> > > SAR001,"","Cimetrix, Inc"
> > >
> > > In SAS this does not work
> > >
> > > data tst;
> > >   tst = 'SAR001,"","Cimetrix, Inc"';
> > >   put tst=;
> > >   tre=scan(tst,3,',');
> > >   put tre=;
> > > run;
> > >
> > > but this does
> > >
> > > /* you can use the $quote with non-quoted strings */
> > > data tst;
> > >   infile cards dsd delimiter=',';
> > >   format nam $quote10.
> > >          mty $quote3.
> > >          bus $quote16.;
> > >   input nam mty bus;
> > >   put bus= mty= nam=;
> > > cards;
> > > SAR001,"","Cimetrix, Inc"
> > > ;
> > > run;
> > >
> > > Several modules can handle this sort of parsing--"Text::Balanced",
> > > "Text::CSV", "Text::CSV_XS", and "Text::ParseWords", among
> > > others.
> > >
> > > Take the example case of trying to split a string that is
> > > comma-separated into its different fields. You can't use "split(/,/)"
> > > because you shouldn't split if the comma is inside quotes. For
> > > example, take a data line like this:
> > >
> > > SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"
> > >
> > > Due to the restriction of the quotes, this is a fairly complex
> > > problem.
> > > Thankfully, we have Jeffrey Friedl, author of *Mastering Regular
> > > Expressions*, to handle these for us. He suggests (assuming your
> > > string is contained in $text):
> > >
> > > @new = ();
> > > push(@new, $+) while $text =~ m{
> > >     "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
> > >   | ([^,]+),?
> > >   | ,
> > > }gx;
> > > push(@new, undef) if substr($text,-1,1) eq ',';
> > >
> > > If you want to represent quotation marks inside a
> > > quotation-mark-delimited field, escape them with backslashes (eg,
> > > "like \"this\"".
> > >
> > > Alternatively, the "Text::ParseWords" module (part of the standard
> > > Perl distribution) lets you say:
> > >
> > > use Text::ParseWords;
> > > @new = quotewords(",", 0, $text);
> >
> > scanQ
> >
> > 92 data tst;
> > 93 tst = 'SAR001,"","Cimetrix, Inc"';
> > 94 put tst=;
> > 95 tre=scanq(tst,3,',');
> > 96 put tre=;
> > 97 run;
> >
> > tst=SAR001,"","Cimetrix, Inc"
> > tre="Cimetrix, Inc"

I'm not using version 9.2, but I will be happy to use the enhanced SCAN when I start using 9.2, hopefully sometime soon.
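
A minimal sketch of the replacement Tom describes, assuming SAS 9.2 or later, where SCAN accepts a fourth argument of modifiers. The variable names follow the earlier example, and the step has not been run here:

```sas
/* Sketch only: assumes SAS 9.2+, where SCAN takes a modifier argument. */
data _null_;
  tst = 'SAR001,"","Cimetrix, Inc"';
  /* 'q' tells SCAN to ignore delimiters inside quoted substrings, */
  /* so the comma in "Cimetrix, Inc" should not split the field.   */
  tre = scan(tst, 3, ',', 'q');
  put tst=;
  put tre=;
run;
```

Per the SAS documentation, combining the 'r' modifier with 'q' (that is, 'qr') should also strip the surrounding quotation marks from the returned word.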

