Date: Sun, 25 Jan 2009 19:20:21 +0000
Reply-To: Paul Dorfman <sashole@BELLSOUTH.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Paul Dorfman <sashole@BELLSOUTH.NET>
Organization: PDC
Subject: Re: Regex - Seven of Nine Matching
In-Reply-To: <84be906d-d685-4255-b724-1710bc87171b@p36g2000prp.googlegroups.com>
Alan,
But why... a Levenshtein implementation is already available via the complev()
function, and if that is not enough, there is always compged() augmented with
call compcost(), which gives much wider latitude in defining the edit distance.
Regardless, it appears that any of these is way too expensive for Don's purposes,
for they invariably have to examine all 9 characters and then perform rather
heavy computations. Comparing the corresponding digits and applying as much
logical short-circuiting as possible (including that dictated by
intimate knowledge of the data) is better equipped to deal with the (unavoidable?)
Cartesian-product nature of the problem. I am using the question mark in
the parentheses deliberately... for I have not yet fully proven to myself that
no method exists that avoids comparing all to all.
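To make the short-circuiting point concrete, here is a minimal sketch, written in Python rather than SAS purely for illustration. It assumes the keys are 9-character digit strings and that a "match" means at least 7 of the 9 corresponding positions agree. The function name and the `needed` parameter are hypothetical and come from nowhere in this thread.

```python
def matches_at_least(a: str, b: str, needed: int = 7) -> bool:
    """Return True if a and b agree in at least `needed` corresponding positions.

    Assumption: both keys are fixed-length strings compared position by position
    (no insertions or deletions, unlike an edit distance).
    """
    if len(a) != len(b) or needed > len(a):
        return False
    allowed = len(a) - needed  # tolerated mismatches (2 for 7-of-9)
    misses = 0
    for x, y in zip(a, b):
        if x != y:
            misses += 1
            if misses > allowed:
                # Short-circuit: the remaining digits cannot rescue this pair.
                return False
    return True
```

For example, matches_at_least("123456789", "923456780") is true (two mismatches), while a pair differing in its first three digits is rejected after only three comparisons. This only cuts the cost of each comparison; it does nothing about the number of pairs compared.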
Kind regards
------------
Paul Dorfman
Jax, FL
------------
> I know you know C# a bit now so this algorithmic approach may help you
> solve the issue. You could possibly code this in SAS or call it via C#
> and include it into the dataset.
>
> Just some late night ideas...
>
> Alan