Date: Thu, 21 Nov 2002 14:25:10 -0500
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: Using contains with SQL
Content-Type: text/plain; charset="iso-8859-1"
A simple illustration of a fairly general method appears on
http://www.sconsig.com/
Search the SAS Tips and Techniques for 'Hermansen'. If you have further
questions, I may be able to answer them.
I've looked at a fuzzy indexing function that is being developed, as I
understand it, for one of the SAS V9.x's. Alternatively, the perl-style
regular expression processor might work well if you have more specific
patterns to match.
Sig
-----Original Message-----
From: Michael D. Boldin [mailto:mboldin@MINDSPRING.COM]
Sent: Wednesday, November 20, 2002 7:57 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Using contains with SQL
I have a dataset with about 100 company names and I want find close matches
to these names in another set that has 20,000 company names and an
associated ID number. I know that there are numerous data step ways to do
this for a single company ie where cname contains "ABC Corp";
and I could use these ways in a macro loop but I thought there would be more
elegant and flexible SQL way to handle all names in a single pass.
Unfortunately, all of may efforts to use 'contains' 'like' or 'index()' in
the same way as
on a.cname = b.name gives an error or this note
66 proc sql;
67 create table temp2
68 as select *
69 from temp1 as b
70 left join cnames as a
71 on a.cname contains b.cname1;
NOTE: The execution of this query involves performing one or more Cartesian
product joins that can not be optimized.
NOTE: Table WORK.TEMP2 created, with 5 rows and 15 columns.
and no joins are made (all comuns in temp2 are filled with missing values
except cname1
any ideas or is this hopeless?