Date: Thu, 3 Feb 2011 16:31:15 -0500
Reply-To: Ann Mackey <thearchies@LIVE.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ann Mackey <thearchies@LIVE.COM>
Subject: Re: Help with parsing a string
Content-Type: text/plain; charset="iso-8859-1"
Thanks for the fast response -
I'm looking for separate variables out of the text string. The .CN=, .OU= are consistent, and .CN= is always before .OU= - but the string may or may not contain them (and when it doesn't, there's no data for that field).
I'll work with your example and see what I get. During my many attempts, I didn't use prxmatch, and my prxchange statement definitely had some issues (but it was my first try with the PRX* functions).
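For reference, here is a minimal sketch of how the separate-variable version could look with PRXPARSE and PRXPOSN, given that .CN= always precedes .OU= and either may be absent. It is an untested illustration, not the code from either message. The variable names, lengths, and the data set name are assumptions. The second sample record is written with .OU=flower rather than the .U=flower in the posted data, and the third record is made up to show a line with no .CN= at all.

```sas
data parsed ;
  infile cards truncover ;
  /* Lengths are assumptions; widen them if the real fields are longer */
  length Record $4 Type $10 Serial CN OU $200 ;
  retain rx ;
  input Text $char500. ;

  /* Compile the pattern once. Buffers: 1=Record 2=Type 3=Serial 4=CN 5=OU. */
  /* The .CN=, .OU= and .O= pieces are all optional, in that fixed order.   */
  if _n_ = 1 then
    rx = prxparse('/^(\S+)\s+(\S+)\s+(.*?)(?:\.CN=(.*?))?(?:\.OU=(.*?))?(?:\.O=.*)?\s*$/i') ;

  /* Skip any line that does not fit the overall layout */
  if prxmatch( rx , Text ) ;

  Record = prxposn( rx , 1 , Text ) ;
  Type   = prxposn( rx , 2 , Text ) ;
  Serial = prxposn( rx , 3 , Text ) ;
  CN     = prxposn( rx , 4 , Text ) ;  /* blank when .CN= is absent */
  OU     = prxposn( rx , 5 , Text ) ;  /* blank when .OU= is absent */

  /* Keep only the 0001 records */
  if Record = '0001' ;

  put Record= Type= Serial= CN= OU= ;
  drop rx Text ;
cards ;
0001 apple1 00.25.Monkey@address.com.CN=I'll be a monkeys uncle.OU=Mocking Bird City.O=some other data.U=some other data . . .
0001 apple6 00679D46CKJL.CN=Help - I need someone.OU=flower.O=some other data.U=some other data . . .
0001 apple7 XYZ123.OU=Some Town.O=some other data
;
run ;
```

The lazy (.*?) groups stop at the first marker that follows them, so the Serial ends where .CN= (or, failing that, .OU= or .O=) begins. If a record really does carry .U= where .OU= is expected, this pattern would leave OU blank and fold that text into CN, so the pattern would need another alternative for that case.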
> From: firstname.lastname@example.org
> To: email@example.com; firstname.lastname@example.org
> Subject: RE: Help with parsing a string
> Date: Thu, 3 Feb 2011 21:11:42 +0000
> I would need to know a whole lot more about the text you are trying to parse, such as: are the ou= and o= characters consistent, or do the characters vary?
> Do you want the parts in multiple variables or just one?
> Given I don't know these things, this gets what you stated you wanted with the data you sent,
> which may or may not be correct for all the possible cases you have in your data:
> Data One ;
> Infile Cards Truncover ;
> Input Text $Char500. ;
> If PrxMatch( '/(?=apple1)/io' , Text ) ;
> Text2 = PrxChange( 's/(.*)cn=(.*)\.ou=(.*)\.o=.*/$1 $2 $3/io' , 1 , Text ) ;
> Put Text2= ;
> Cards ;
> 0001 apple1 00.25.Monkey@address.com.CN=I'll be a monkeys uncle.OU=Mocking Bird City.O=some other data.U=some other data . . .
> 0001 apple6 00679D46CKJL.CN=Help - I need someone.U=flower.O=some other data.U=some other data . . .
> Run ;
> Toby Dunn
> "I'm a hell bent 100% Texan til I die"
> "Don't touch my Willie, I don't know you that well"
> > Date: Thu, 3 Feb 2011 15:31:01 -0500
> > From: thearchies@LIVE.COM
> > Subject: Help with parsing a string
> > To: SAS-L@LISTSERV.UGA.EDU
> > I've tried many, MANY things, but I'm just not getting it -
> > Neurons just aren't firing too bright today.
> > Here's a sample of the data - all on one line, the third chunk is over 200
> > characters - notice the many types of delimiters, spaces, ., =, '.CN=',
> > etc., and every variable can be a different length:
> > 1    6      13
> > 0001 apple1 00.25.Monkey@address.com.CN=I'll be a monkeys uncle.OU=Mocking
> > Bird City.O=some other data.U=some other data . . .
> > 0001 apple6 00679D46CKJL.CN=Help - I need someone.U=flower.O=some other
> > data.U=some other data . . .
> > I want to get all 0001 records, keeping the type (apple1), and then parse
> > out the first three sections of the last looong variable length field, in
> > this case it would be:
> > Record  Type    Serial                    CN                       OU
> > 0001    apple1  00.25.Monkey@address.com  I'll be a monkeys uncle  Mocking Bird City
> > 0001    apple6  00679D46CKJL              Help - I need someone    flower
> > I've attempted this with scan, prxparse, indexw substr, DLM=, etc.
> > Any direction/help is greatly appreciated, and dare I say... eagerly
> > anticipated!!
> > Thanks,
> > Ann