Date: Mon, 10 Apr 2006 20:54:45 -0500
Reply-To: Jiann-Shiun Huang <Jiann-Shiun.Huang@AMERUS.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jiann-Shiun Huang <Jiann-Shiun.Huang@AMERUS.COM>
Subject: Re: Help! Mystery Proc SQL results
Content-Type: text/plain; charset=ISO-8859-1
Phil:
The problem comes from min(null, 'H') is 'H' and max(null, 'H') is also 'H' as evidenced by the following SQL statement and its output. See also min in online spec. To get around, you may use rank function as the statement follows the current one.
_____________________________________________________________________
First SQL
Proc SQL;
Select PIN, DOS, min(HospInd) as minimum, max(HospInd) as maximum
From Step1
Group by PIN, DOS;
quit;
**** Output ****
The SAS System 07:55 Monday, April 10, 2006 38
PIN DOS minimum maximum
**************************************
123456 01/17/2006 H H
_________________________________________________________________________________
Second SQL
Proc SQL;
Create table step2 as
Select PIN, DOS
From Step1
Group by PIN, DOS
having min(rank(HospInd)) lt max(rank(HospInd));
quit;
proc print data=step2;
run;
**** Output ****
The SAS System 07:55 Monday, April 10, 2006 39
Obs PIN DOS
1 123456 01/17/2006
J S Huang
1-515-557-3987
fax 1-515-557-2422
>>> <phillip.anderra@GMAIL.COM> 04/10/06 8:19 PM >>>
Probably one of those times where veins stand out on your head and you
shout, "THAT'S GOT TO WORK" at the screen for 20 minutes before someone
points out that you're selecting the wrong table, used a period instead
of a comma or somesuch embarrassment.
Anyway, I have a table which has a field named 'HospInd' (Char1) for
which the only values used are "H" or " ". What I'm trying to find is
instances where a patient (PIN) has had HospInds with *both* of those
values on the same date (DOS).
The source data for one PIN in the table shows...
PIN DOS HospInd
123456 17Jan06 H
123456 17Jan06
but for some reason my first choice Proc...
Proc SQL;
Create table step2 as
Select PIN, DOS
>From Step1
Group by PIN, DOS
Having min(HospInd) < max(HospInd);
doesn't produce the above PIN in it's output.
The investigative Proc;
Proc SQL;
Create table step2 as
Select PIN, DOS, min(HospInd) as MinHI, max(HospInd) as MaxHI
>From Step1
Group by PIN, DOS;
produces this incorrect result...
PIN DOS MinHI MaxHI
123456 17Jan06 H H
It gets more confusing/interesting when I change the selecting Proc
to...
Proc sql;
Create table step2 as
Select distinct a.PIN, a.DOS, a.HospInd as aHI, b.HospInd as bHI
>From Step1 as A, Step1 as B
Where a.PIN = b.PIN
and a.DOS = b.DOS
and a.HospInd < b.HospInd;
This produces the correct result of...
PIN DOS aHI bHI
123456 17Jan06 H
So could someone *please* point out the obvious reason as to why the
Proc using the Min & Max values isn't working before I fetch the
sledgehammer?
ps. No error messages in the log.
cheers
Phil