Date: Thu, 6 Nov 2003 09:36:05 -0000
Reply-To: ben.powell@cla.co.uk
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ben Powell <Ben.powell@CLA.CO.UK>
Organization: cla
Subject: Re: Scan Array for matches
Content-Type: text/plain; charset="us-ascii"
Resent with correct subject:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Harry, thanks. Good idea. I was having trouble getting the index
solution from Sigurd to work, as it seemed temperamental as regards field
length matching and didn't give. I find the %-marked LIKE option much
more reassuring; I hadn't thought of it. I couldn't see how to get LIKE
to work against a variable rather than a string!
Ben.
------------------------------
Date: Tue, 4 Nov 2003 12:38:04 -0500
From: "Droogendyk, Harry" <Harry.Droogendyk@CIBC.COM>
Subject: Re: Scan Array for matches
Something tells me the sub-query in the 'not exists where' clause can be
simplified... However, this does work. Note the addition of the
percent signs to your keyword list.
/* names to be screened */
data Name;
  length name $20;
  input name $;
cards;
test
meme
temp
testme
droogendyk_harry
hairy
run;

/* LIKE patterns: each keyword wrapped in % wildcards */
data keyword;
  length keyword $8;
  input keyword $;
cards;
%me%
%harry%
%other%
run;

/* keep only the names that match none of the keyword patterns */
proc sql;
  select a.name
    from name a
    where not exists (
      select b.name
        from name b, keyword
        where a.name = b.name
          and a.name like keyword
    )
  ;
quit;
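
On the hunch that the sub-query can be simplified: a minimal sketch, assuming the same Name and keyword datasets as above. The self-join on name b looks unnecessary, since the sub-query can refer to a.name directly. The TRIM call is an addition that is not in the original post; as far as I know, LIKE treats trailing blanks in the pattern as significant, so an untrimmed $8 keyword only matches because name happens to be padded with more blanks than the keyword is.

```sas
proc sql;
  select a.name
    from name a
    where not exists (
      /* correlated sub-query: is there any keyword pattern this name matches? */
      select 1
        from keyword k
        /* trim() is an assumption added here to drop the padding on keyword */
        where a.name like trim(k.keyword)
    );
quit;
```

With the sample data, the names that match no pattern are test, temp and hairy, so those are the rows either version of the query should return.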
-----Original Message-----
From: Gerhard Hellriegel [mailto:ghellrieg@T-ONLINE.DE]
Sent: November 4, 2003 11:56 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Scan Array for matches
Not too clear, really! Do you want to scan each record of dataset 1 for
each of the keywords in dataset 2? That could be a large number of
steps, you know! For that you might use a macro. Respond if you need
that.
Or is it as you said: you want to find exact matches between set1 and
set2? That's easy: sort both datasets, rename or assign the contents of
set2_keyword to name, and merge the two datasets.
data match;
  merge sort1(in=in_name)
        sort2(in=in_key);
  by name;
  /* keep only names present in both datasets */
  if in_key and in_name;
run;
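
The sort and rename steps described above are not shown, so here is a minimal sketch of what they might look like. The dataset and variable names (set1 with name, set2 with keyword) are assumptions taken from the original question, and NODUPKEY is an addition to drop repeated keywords.

```sas
/* Assumed inputs: set1(name) and set2(keyword), as in the original question */
proc sort data=set1 out=sort1;
  by name;
run;

/* rename keyword to name so both datasets share the BY variable;
   nodupkey drops repeated keywords such as the second "other" */
proc sort data=set2(rename=(keyword=name)) out=sort2 nodupkey;
  by name;
run;
```

This only finds exact matches between name and keyword. It does not cover the substring case, and if the two variables have different lengths the merge may warn about multiple lengths for the BY variable.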
On Tue, 4 Nov 2003 16:11:16 -0000, Ben Powell <Ben.powell@CLA.CO.UK> wrote:
>Dear SAS-L would-be Perl users,
>
>Not knowing Perl (or SAS apparently) how would I do this in SAS: I have
>two datasets. set1 contains all items. set2 contains keywords from items
>to be excluded. How do I scan through the full name field in set1
>looking for items where the full name is like the keyword - the keyword
>is effectively a substring? What if there is more than one keyword, as
>in keyword1 keyword2, etc.?
>
>I'm sorry I can't be more clear than this!
>
>Here are some data examples:
>
>set1 -
>
>Name
>test
>test
>test
>meme
>test
>test
>test
>
>set2 -
>keyword
>me
>other
>other
>
>Ben.
>
>----------------
>Ben Powell, Data Analyst
>Copyright Licensing Agency Ltd (CLA) - <http://www.cla.co.uk/>
>Tel: +44 7631 5532 Fax: +44 7631 5500
><mailto:ben.powell@cla.co.uk>
>
>****
>Whilst the above is believed to be correct, it is provided for
>information only and should not be relied on. It does not constitute
>legal advice and if you have a specific problem you are strongly
>advised to consult a solicitor or other appropriately qualified
>professional. Neither myself nor CLA hold itself out as an expert and
>neither can accept any responsibility or liability for any loss or
>damage incurred as a result of relying on information contained in
>this email.