Date: Thu, 25 Jun 2009 09:34:38 -0500
Reply-To: Carl Denney <cdenney@HEALTHINFOTECHNICS.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Carl Denney <cdenney@HEALTHINFOTECHNICS.COM>
Subject: Fwd: Re: Fuzzy matching question
>But why don't you change it to the NDC code instead?
>At 12:44 PM 6/24/2009, you wrote:
>>We have found that unstructured reporting of medication introduces
>>a variety of problems. For instance,
>>- multiple genres, including brand names, generic names, codes, and
>>brand names differentiated by dosage;
>>- spelling and abbreviation differences;
>>- similar names;
>>- mixtures of comments, names, and codes.
>>You may find that appending distinct pairs of observed value and
>>standard drug identifier will give you a mapping that you can use
>>to classify strings. You might also add a match probability that
>>could help you order matches and select those with greater chances
>>of being a correct match.
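
One way to picture the mapping idea in the paragraph above is a small lookup table joined to the raw data. This is only a sketch: the dataset names meds, drug_map and coded, and the column name observed, are hypothetical, and it assumes meds does not already contain a variable called agent.

```sas
/* Hypothetical mapping table: one row per distinct observed string */
data drug_map;
  infile datalines dsd;          /* comma-delimited, so embedded blanks are kept */
  length observed $40 agent $20;
  input observed $ agent $;
  datalines;
fluorouracil,Fluorouracil
adrucil,Fluorouracil
5-fu,Fluorouracil
5-fu civ,Fluorouracil
5-fu ci,Fluorouracil
5-fu ivp,Fluorouracil
5fu,Fluorouracil
;
run;

/* Exact lookup after normalising case and leading/trailing blanks */
proc sql;
  create table coded as
  select m.*, d.agent
  from meds as m
  left join drug_map as d
    on lowcase(strip(m.drug_name)) = d.observed;
quit;
```

Rows that come back with a blank agent are the strings not yet in the map. They can be reviewed and appended as new pairs, so the table grows as new spellings turn up.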
>>From: SAS(r) Discussion [SAS-L@LISTSERV.UGA.EDU] On Behalf Of Paul
>>Sent: Wednesday, June 24, 2009 10:55 AM
>>Subject: Fuzzy matching question
>>I'm working with some cancer drugs and need to do some fuzzy
>>matching. I've experimented with a few different functions,
>>including upcase, lowcase, propcase, spedis, index, indexc, in
>>various combinations but have yet to find what I need.
>>The drugs I'm working with often go under a variety of different
>>names. The case in which the drug names are entered varies.
>>Sometimes they're misspelled. Sometimes the name of the drug is
>>combined with information about the method of administration (e.g.,
>>'civ', 'ci', 'ivp').
>>An example involving some very simple code appears below:
>>else if lowcase(drug_name) in ('fluorouracil' 'adrucil' '5-fu'
>>'5-fu civ' '5-fu ci' '5-fu ivp' '5fu') then agent = 'Fluorouracil';
>>Is there some way to elegantly combine SAS functions so that SAS
>>will look for terms that sound like/contain 'fluorouracil' or
>>'adrucil' or '5-fu' and then code them all as 'Fluorouracil'?
>>I've been able to find functions like index that simultaneously
>>look for different names (e.g., 'fluorouracil' 'adrucil') where the
>>spelling is exact. I've also been able to find functions like
>>spedis that allow me to do fuzzy matching for a single name (e.g.,
>>'fluorouracil') but not for different names simultaneously (e.g.,
>>'fluorouracil' and 'adrucil'). So I'm just wondering if there's
>>some way to combine functions so that I get the best of both
>>worlds. Alternatively, I thought there might be some functions I'm
>>not aware of that could be put to good use.
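
A possible way to get both behaviours asked about above is to loop SPEDIS over a list of synonyms and over each word of the entry. The sketch below is not a tested solution: the input dataset meds, the output dataset coded, and the cutoff of 15 are assumptions, and the cutoff in particular would need tuning against real data.

```sas
data coded;
  set meds;                                /* hypothetical input with drug_name */
  length agent $20 word $40;
  /* synonyms for one agent, stored without hyphens */
  array syn[3] $12 _temporary_ ('fluorouracil' 'adrucil' '5fu');

  /* normalise: lower case and drop hyphens, so '5-FU' becomes '5fu' */
  clean = compress(lowcase(drug_name), '-');
  agent = ' ';

  /* compare every word of the entry against every synonym, */
  /* stopping at the first sufficiently close match         */
  do w = 1 to countw(clean, ' ') while (agent = ' ');
    word = scan(clean, w, ' ');
    do s = 1 to dim(syn) while (agent = ' ');
      if spedis(trim(word), trim(syn[s])) <= 15 then agent = 'Fluorouracil';
    end;
  end;

  drop w s word clean;
run;
```

Scanning word by word means suffixes such as 'civ' or 'ivp' do not have to be listed, because the drug name is matched on its own. Functions such as COMPGED or COMPLEV could be substituted for SPEDIS in the same loop if a different distance measure suits the data better.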