Date: Fri, 9 Feb 2007 18:14:02 -0500
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: Proc Summary Error..
In-Reply-To: <95AA6E2856F8924BBC646230BD46F5DC0266688A@CTTRBEXVS02.oxhp.com>
Content-Type: text/plain; charset="us-ascii"
Even I, the original SAS SQLhead, would assume that a special summary
procedure in SAS would faster than a more generic programming
environment such as SQL. That assumption may not hold in this case,
however, because the programmer has to make a choice between a memory
intensive method and a disk space (sort) alternative. So perhaps the SAS
SQL compiler could improve an equivalent query. How about
proc sql _method;
create table outp as
select lobcode, category, specialty, yrqrt, claimcod, plan, prv_stat,
prvcnty, memcnty, his_par, acctuw, acctuw2, yrmon, prvcode, provdx,
cpt, proccode, level3, prv_type, claimtyp, risk_deal,
servflag, prod, placecod2, groupcod,
sum(payamt) as tpayamt, sum(copayamt) as tcopayamt,
sum(dedamt) as tdedamt, sum(coinsamt) as tcoinsamt, sum(cobamt) as
tcobamt,
sum(reqamt) as treqamt, sum(wthldamt) as twthldamt,
sum(allowamt) as tallowamt, sum(dscntamt) as tdscntamt, sum(procqty) as
tprocqty
from (select * from saved.trendoutp1 where dosdte between
'01jan2003'd and '31dec2003'd)
group by lobcode, category, specialty, yrqrt, claimcod, plan,
prv_stat, prvcnty, memcnty, his_par, acctuw, acctuw2, yrmon, prvcode,
provdx,
cpt, proccode, level3, prv_type, claimtyp, risk_deal,
servflag, prod, placecod2, groupcod
;
quit;
In the worst case the query will attempt to sort the original dataset
and fail. Presorting using a tagsort may work better. Once data have
been ordered by the grouping variables, summing variables within the
group should not put a burden on memory or disk space.
A really smart SQL query optimizer would create an index for the
composite 'key' and sum the quantities for each group of rows with the
same index value. I don't expect that a typical SQL query optimizer will
do that. Most normalized databases have keys that identify different
dimensions of a complex relation. Still, one can hope ....
S
-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
On Behalf Of Kamireddi, Srinivas
Sent: Tuesday, February 06, 2007 1:41 PM
To: Wensui Liu
Cc: SAS-L@listserv.uga.edu
Subject: RE: Proc Summary Error..
Thanks for responding
The proc summary step is
proc summary data = outp nway missing;
where '01jan2003'd <= dosdte <= '31dec2003'd;
class lobcode category specialty yrqrt claimcod plan prv_stat prvcnty
memcnty his_par acctuw acctuw2 yrmon prvcode provdx cpt proccode level3
prv_type claimtyp risk_deal servflag prod placecod2 groupcod; var payamt
copayamt dedamt coinsamt cobamt reqamt wthldamt allowamt dscntamt
procqty; output out = saved.trendoutp1 (drop = _type_ rename = (_freq_
= lines)) sum = ; run;
the outp dataset has 25,683,373 records which gets summarized down to
23,402,550 records
the problem occurs whenever we do not use -sumsize 0, regardless of
using -memseize 0
-----Original Message-----
From: Wensui Liu [mailto:liuwensui@gmail.com]
Sent: Tuesday, February 06, 2007 1:25 PM
To: Kamireddi, Srinivas
Cc: SAS-L@listserv.uga.edu
Subject: Re: Proc Summary Error..
could you please post your proc summary code?
On 2/6/07, Kamireddi, Srinivas <ksriniva@oxhp.com> wrote:
> Hi,
>
> Running proc summary and I am getting this error
>
> "ERROR: PROC SUMMARY was terminated prematurely due to a memory
> shortage. Adjusting MEMSIZE and/or SUMSIZE may allow normal processing
> to complete."
>
> What are the best ways to solve this error without changing the
> memsize or sumsize
>
> Or
>
> What should be the memsize and sumsize set to
>
> Advises are requested
>
> Thanks,
> Srinvias
>
--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
|