LISTSERV at the University of Georgia
Date:         Sun, 14 Jan 2007 20:56:28 -0500
Reply-To:     Ken Borowiak <EvilPettingZoo97@AOL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Ken Borowiak <EvilPettingZoo97@AOL.COM>
Subject:      Re: Extracting word(s) occurring in text before a certain keyword

On Sun, 14 Jan 2007 05:29:53 -0800, Hakan Ener <hakanener99@YAHOO.COM> wrote:

> Hello,
>
> I could not find a general solution to what I'm trying to do when
> analyzing a character variable that contains unstructured text.
>
> Each observation contains a paragraph of text (multiple sentences
> separated by period), where names of certain companies are mentioned,
> such as "Microsoft Inc." or "Advanced Micro Devices Corp." within
> sentences. I want to extract the company name that precedes "Inc." or
> "Corp." in this text. Considering that company names may contain any
> number of words (each of which have a capital first letter), and that
> an observation may contain any number of company names one after the
> other, is there a suggestion to handle this coding such that the
> result will be a horizontal array of full company names mentioned in
> the source field?
>
> Thank you,
>
> Hakan Ener
> France

Hakan,

Regular expressions in conjunction with the PRX functions can help you out. If you post some sample observations and a reasonably complete list of the suffixes that anchor a company name (e.g. Inc., Corp.), I could cook up something more concrete.
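
In the meantime, here is a rough sketch of the idea. It is written in Python only because its re module accepts the same Perl-style syntax as the PRX functions, so the pattern itself should carry over to PRXPARSE, with CALL PRXNEXT stepping through successive matches. The function name, the sample text, and the suffix list are all made up for illustration, and it assumes that a company name is simply a run of capitalized words ending in one of the suffixes.

```python
import re


def extract_company_names(text, suffixes=("Inc", "Corp")):
    """Return every run of capitalized words that ends in a known suffix.

    Assumption (not from the original question): the suffix is always
    written with a trailing period, e.g. "Inc." or "Corp.".
    """
    # Alternation of the anchoring suffixes, e.g. (?:Inc|Corp)
    suffix_alt = "|".join(re.escape(s) for s in suffixes)
    pattern = re.compile(
        r"\b"                             # start at a word boundary
        r"(?:[A-Z][A-Za-z&'-]*\s+)+"      # one or more capitalized words
        r"(?:" + suffix_alt + r")\."      # followed by the suffix and a period
    )
    # finditer scans left to right, so names come back in textual order
    return [m.group(0) for m in pattern.finditer(text)]


sample = (
    "The deal between Microsoft Inc. and Advanced Micro Devices Corp. "
    "closed. Later, Intel Corp. responded."
)
for name in extract_company_names(sample):
    print(name)
```

One known weakness: a capitalized word that merely starts a sentence right before a company name (e.g. "Yesterday Microsoft Inc. ...") would be swept into the match, so real data will probably need a stop list or some extra rules.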

Ken

