Date: Sat, 13 Mar 2010 15:36:37 -0500
Reply-To: Dave <DAVID.BREWER@UC.EDU>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dave <DAVID.BREWER@UC.EDU>
Subject: How to create a new grouped variable
Hi All,
Let me give you a little background of my problem.
I am processing results from different lab tests, and the same lab test
can appear under many different names; therein lies my problem.
The first thing I need to do is keep only the specimens (topography) and
lab tests of interest. I do this with WHERE processing and the LIKE
operator.
For example, the lab test name for glucose can appear
as "glucose", "GLUCOSE", "1HR SERUM GLUCOSE", "2 Hr. GTT GLUCOSE", etc.
All of the above tests need to be collapsed into a new
variable, MAPPED_NAME, with the value "GLUCOSE". I need to do this type of
collapsing for 20 different lab test names.
My code thus far:
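DATA LABS ;
retain rx rx1;
/* compile both patterns once, on the first iteration */
if _n_ = 1 then do;
/* specimen terms to flag; the pattern is concatenated so that */
/* no line break falls inside a quoted string                  */
rx = prxparse('/BLAST|CYTIC|CYTE|CELL|OPHIL|BAND|PLACENT|ENDOMETR' ||
'|EMBRYO|SEMINAL|SEMEN|SPUTUM|VOMIT|URINE|FECES|FECAL|PARIETAL|OCCULT|YELLOW' ||
'|PURPLE|ORANGE|GREEN|RED|TOP|BLUE /i');
rx1 = prxparse('/GLUCOSE /i');
end;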
SET SAVEIT.LABS_DB(DROP=MAPPED_NAME) ;
WHERE (
( UPCASE(SPECIMEN) LIKE '%BLOOD%' )
OR
( UPCASE(SPECIMEN) LIKE '%PLA%' )
OR
( UPCASE(SPECIMEN) LIKE '%SER%' )
OR
( UPCASE(SPECIMEN) LIKE '%ARTER%' )
OR
( UPCASE(SPECIMEN) LIKE '%VENO%' )
OR
( UPCASE(SPECIMEN) LIKE '%BLD%' )
)
AND
(
( UPCASE(NAME) LIKE '%BUN%' )
OR
( UPCASE(NAME) LIKE '%GLUCOSE%' )
OR
( UPCASE(NAME) LIKE '%UREA NITROGEN%' )
OR
( UPCASE(NAME) LIKE '%CREATININE%' )
OR
( UPCASE(NAME) LIKE '%SODIUM%' )
OR
( UPCASE(NAME) LIKE '%CHOLESTEROL%' )
OR
( UPCASE(NAME) LIKE '%ALBUMIN%' )
OR
( UPCASE(NAME) LIKE '%BILIRUBIN%' )
OR
( UPCASE(NAME) LIKE '%SGOT%' )
OR
( UPCASE(NAME) LIKE '%CPK%' )
OR
( UPCASE(NAME) LIKE '%WBC%' )
OR
( UPCASE(NAME) LIKE '%HCT%' )
OR
( UPCASE(NAME) LIKE '%PLT%' )
OR
( UPCASE(NAME) LIKE '%PTT%' )
OR
( UPCASE(NAME) LIKE '%FIO2%' )
OR
( UPCASE(NAME) LIKE '%PCO2%' )
OR
( UPCASE(NAME) LIKE '%PO2%' )
OR
( UPCASE(NAME) LIKE '%CO2%' )
OR
( UPCASE(NAME) LIKE '%TROPONIN%' ) OR
( UPCASE(NAME) LIKE 'PH %' ) OR
( UPCASE(NAME) LIKE 'PH(%' ) OR
( UPCASE(NAME) LIKE 'PH*%' ) OR
( UPCASE(NAME) LIKE 'PH,%' ) OR
( UPCASE(NAME) LIKE 'PH-%' ) OR
( UPCASE(NAME) LIKE 'PH@%' ) OR
( UPCASE(NAME) LIKE 'PH1%' ) OR
( UPCASE(NAME) LIKE 'PH2%' ) OR
( UPCASE(NAME) LIKE 'POC PH%' ) OR
( UPCASE(NAME) LIKE 'PH.%' ) OR
( UPCASE(NAME) LIKE 'ZPH%' )
) ;
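/* flag specimens that match any of the terms compiled in rx */
matchloc = prxmatch( rx, specimen );
if matchloc > 0 then flag = 1;
/* collapse every glucose variant into one mapped name */
matchloc2 = prxmatch( rx1, NAME );
if matchloc2 > 0 then MAPPED_NAME = "GLUCOSE";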
RUN;
Do I need a separate PRXPARSE/PRXMATCH pair for each lab test, as I used to
create the MAPPED_NAME for GLUCOSE? As you can see, the biggest problem is
identifying the correct "PH" strings to keep.
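To make the question concrete, here is an untested sketch of one possible
alternative to a separate variable per test: keep the patterns and their
mapped names in parallel temporary arrays and loop over them. Everything not
in the code above is an assumption made for illustration: the output name
LABS_MAPPED, the array names, the 20-character width of MAPPED_NAME, and the
choice of only three of the 20 tests. The PH pattern simply mirrors the LIKE
clauses for PH in the WHERE statement.

```sas
/* Sketch only: table-driven mapping with parallel temporary arrays. */
DATA LABS_MAPPED ;                       /* hypothetical output name */
LENGTH MAPPED_NAME $ 20 ;                /* assumed width; avoids truncation */
/* one pattern per mapped name; only 3 of the 20 tests are shown */
ARRAY pats{3} $ 60 _TEMPORARY_
   ( '/GLUCOSE/i'
     '/BUN|UREA NITROGEN/i'
     '/^(PH[ (*,\-@12.]|POC PH|ZPH)/i' ) ;
ARRAY maps{3} $ 20 _TEMPORARY_ ( 'GLUCOSE' 'BUN' 'PH' ) ;
ARRAY rxid{3} _TEMPORARY_ ;              /* compiled pattern ids, retained */
SET SAVEIT.LABS_DB(DROP=MAPPED_NAME) ;
/* the WHERE statement from the original step would go here */
/* compile each pattern once, on the first iteration */
IF _N_ = 1 THEN DO i = 1 TO DIM(pats) ;
   rxid{i} = PRXPARSE(STRIP(pats{i})) ;  /* STRIP removes the array padding */
END ;
/* the first pattern that matches wins, so order matters */
DO i = 1 TO DIM(pats) UNTIL (MAPPED_NAME NE ' ') ;
   IF PRXMATCH(rxid{i}, NAME) > 0 THEN MAPPED_NAME = maps{i} ;
END ;
DROP i ;
RUN ;
```

With this layout, adding another lab test would mean adding one pattern and
one mapped name and raising the array dimensions, rather than adding another
PRXPARSE/PRXMATCH pair.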
Any thoughts or suggestions would be greatly appreciated.
Thanks for your time.
Dave