Date: Fri, 15 Sep 2006 11:17:10 -0700
Reply-To: "Pardee, Roy" <pardee.r@GHC.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Pardee, Roy" <pardee.r@GHC.ORG>
Subject: Re: option obs= and SQL
Content-Type: text/plain; charset="us-ascii"
My understanding is that those options don't work for the engine that
handles WHERE clause processing--if you use WHERE (in sql or dstep) that
gets applied to the whole table, and *then* the OBS stuff gets applied.
Check out this prior sas-l discussion on it:
http://tinyurl.com/phwr2
Cheers,
-Roy
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Rickards, Clinton (GE Money)
Sent: Friday, September 15, 2006 10:32 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: option obs= and SQL
All,
I have a multi-million row table that takes me about an hour to spin
through using simple SQL. In order to get a preliminary test done of my
whole program (to weed out variable changes, etc) I added "option
obs=5000;" at the beginning of the program. My understanding was that
obs= restricts the number of observations being read and expected that
adding the option would drop my execution time to insignificant. I was
suprised that the job still took about 30 minutes to spin through the
file and create a 5000 row data set.
So, does obs= work "backwards" in SQL? That is, does it allow the SQL to
process as much data as it wants at the front end but then stop when the
back-end (the output) reaches the obs= value? If so, are there any
techiques that can be used to speed up SQL?
My log:
66 options obs=5000;
125 %let SYSLAST = %nrquote(raparch.PAYMENTS);
132
133 proc sql;
134 create table work.payments_history as
135 select CLIENT_ID length = 20,
136 ACCOUNT_KEY length = 8,
137 MERCHANT_ACCOUNT_NBR length = 20,
138 GL_ALPHA_CODE length = 20,
139 MERCHANT_KEY length = 8,
140 TRANSACTION_AMT length = 8,
141 POSTING_DATE length = 8,
142 (datepart(TRANSACTION_DATE )) as
TRANSACTION_DATE_date length = 8 format = DATE9. informat = DATE9.,
143 TRANSACTION_CODE length = 20,
144 PAYMENT_TYPE length = 1,
145 CURRENT_ACCOUNT_NBR length = 20,
146 RAP_ADD_DATE length = 4
147 from &SYSLAST
148 where
149 TRANSACTION_CODE in ('271') and
150 datepart(TRANSACTION_DATE) ge (&DCLoad_date
- 27) and
151 GL_ALPHA_CODE not in (&GL_ALPHA_CODE_list);
NOTE: Table WORK.PAYMENTS_HISTORY created, with 5000 rows and 12
columns.
152 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 33:33.95
cpu time 8:05.12
Your insights would be appreciated...
Clint Rickards
Dual Card Fraud Strategy Manager
GE Money
Shelton, CT
(Internal) 8*370-6156
(External) (203) 944-6156
|