Date: Thu, 18 Nov 2004 08:58:52 -0500
Reply-To: Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Subject: Re: proc format vs. merge
On Fri, 12 Nov 2004 19:00:34 -0700, Jack Hamilton
<JackHamilton@FIRSTHEALTH.COM> wrote:
>
>I suppose there are too many lookup records to put them all into a v9
>associative array?
Absolutely... the table is 14GB and growing daily
>
>
>Enhanced generic lookup capabilities using indexes in the data step
>would be welcome, yes.
>
>What about
>
>=====
> proc sql;
> create table claims2 as
> select claims1.*
> from claims1, elig
> where elig.policy_no = claims1.policy_no
> ;
> quit;
>=====
>
>Doesn't that give the same result as your SQL? I think it would be
>faster.
>--
>JackHamilton@FirstHealth.com
When I first tried this, it still took far too long.
Adding _method shows that "where" was still optimised to a hash join.
24 create table userd.select_gios_policy_sql as
25 select contract, version, partition, pono
............... more columns
29 , dtechang, firstday, lastday
30 from gios.policy
31 , gios.changedpolicies( rename=( contract=changed ))
32 where contract = changed
/********* COMMENTS *********/
36 order by contract, firstday, version
37 ;
NOTE: SQL execution methods chosen are:
sqxcrta
sqxsort
sqxjhsh
sqxsrc( GIOS.POLICY )
sqxsrc( GIOS.CHANGEDPOLICIES )
Setting a low value for buffersize, stopped the hash join, but then
it tried to sort !
99 proc sql _method _tree buffersize=10000;
100 create table userd.select_gios_policy_sql as
101 select contract, version, partition, pono
.................
105 , dtechang, firstday, lastday
106 from gios.policy as p
107 join gios.changedpolicies( rename=( contract=changed ))
108 on contract = changed
112 order by contract, firstday, version
113 ;
NOTE: SQL execution methods chosen are:
sqxcrta
sqxsort
sqxjm
sqxsort
sqxsrc( GIOS.CHANGEDPOLICIES )
sqxsort
sqxsrc( GIOS.POLICY(alias = P) )
The changedPolicies table is in order, so I added sortedby
gios.changedpolicies( rename=( contract=changed )sortedby = changed )
The _method picture became
NOTE: SQL execution methods chosen are:
sqxcrta
sqxsort
sqxjm
sqxsrc( GIOS.CHANGEDPOLICIES )
sqxsort
sqxsrc( GIOS.POLICY(alias = P) )
The Policy table is indexed on POLID, a composite of contract and
two other columns, which aren't all in the changedPolicies table.
However, Contract comes first in the key, so I can try
gios.policy( idxname= polid sortedby= contract) as p
Then _method shows
NOTE: SQL execution methods chosen are:
sqxcrta
sqxsort
sqxjm
sqxsrc( GIOS.CHANGEDPOLICIES )
sqxsrc( GIOS.POLICY(alias = P) )
So I can eliminate the sql sorts, but it just forces a merge join.
So it is no surprise that it still seems to take a very long time !
Many thanks for the ideas, about partial key set statement support.
I'll be adding my support on the ballot, and persuading my
clients' developers how much they should involve themselves,
too.
Regards
Peter
>
>Doesn't that give the same result as your SQL? I think it would be
>faster.
>
>
>
>
>--
>JackHamilton@FirstHealth.com
>Manager, Technical Development
>Metrics Department, First Health
>West Sacramento, California USA
>
>>>> "Peter Crawford" <peter.crawford@BLUEYONDER.CO.UK> 11/12/2004 12:38
>PM >>>
>On Fri, 12 Nov 2004 18:51:28 GMT, Matthew Wilson
><matt@OVERLOOK.HOMELINUX.NET> asks about using user formats
>for 1 to 1 look-ups rather than use a data step merge.
>Matt's full post follows my sig lines below.
>
>Matt
>On the internals of user formats you've had the definitions from Roger
>Lustig and Jack Hamilton, so may I discuss data step merges and their
>shortcomings for look-up processing.
>
>I'm working on replacing some data step merges with almost anything I
>can
>find........ Unfortunately in my context, formats are not suitable..
>
>Imagine collecting details from a large(40m rows) denormalised table,
>for a few hundred (or even thousand) cases.
>Anywhere I can obtain index keys for the selection list, an sql join
>wins,
>especially when the key is non-unique.
>Sadly, new business analysts are attracted by the simple syntax
>(apol's for adopting -most- of your sample code )
|