LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 2010, week 3)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Tue, 19 Oct 2010 16:04:27 -0400
Reply-To:   Michael Raithel <michaelraithel@WESTAT.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Michael Raithel <michaelraithel@WESTAT.COM>
Subject:   Re: SAS vs Oracle Efficiency Question
In-Reply-To:   <201010191843.o9JH5kcf012853@willow.cc.uga.edu>
Content-Type:   text/plain; charset="us-ascii"

Dear SAS-L-ers,

Andy Arnold posted the following:

> PROBLEM/QUESTION: > I need to make a SAS program run faster. Currently (See 1 below) the > vast > majority of the time is spent in the Oracle phase. If I shift some > processing to SAS, can I save enough time to make it worth the trouble? > > BACKGROUND: > I have a audit program that calculates a target value, compares it to > the > master value in the record, and notes any exceptions. The data is such > that > all the records with identical values in 4 specific fields will > generate the > same target value and the same result from the comparison. The input > file > is an Oracle table with from 100M to 500M records and the 4 specific > fields > can have from 1K to 100K combinations. Although the 4 specific fields > are > not key fields, they become a 'key' for the purpose of this program. > > Both SAS and Oracle are on AIX servers, but not the same server. > > I see at least 3 ways to build/change the program. > > 1. Select the records in Oracle and use the Count() and Group By > clauses to > return 1 record per key and a count of the number of records with that > key. > I think this requires Oracle to extract the records and sort them > before > they can be grouped into the result set. After the records are > returned to > SAS, the target value is calculated and compared. > > 2. Select the records in Oracle and use SAS a hash object. Put 1 > record > into the hash object for each key and keep the count in the hash > object. > This option trades Oracle sorting for SAS I/O. Hash processing may > offset > the increased I/O load. After the hash object is built, the target > value is > calculated and compared. > > 3. Select the records in Oracle without combining them and use SAS to > calculate the target value for each record. This option is so I/O > bound > that it does not warrant further consideration; it is included here for > completeness. > > Thanks for your reading time & any advice you might have. > > --Andy Arnold > > p.s. > I've looked in both SAS and Oracle and I cannot a suitable GoFast or > HurryUp command.

Andy, I think that I may have found what you are looking for: a veritable GoFast and/or HurrUp command for SAS/Oracle!

By some strange coincidence of fate and time and space, last week I finished benchmarking something way-cool that is new in SAS 9.2 TS2M3: SAS In-Database Procedures for Oracle. SAS in-database procedures were introduced at a SAS Global Forum several years ago, but they were originally built for Teradata databases. I had been stalking them for several years, waiting for them to be implemented in either Oracle or SQL Server; both of which we use a lot here at SAS Mecca. I almost fell out of my chair a few weeks ago when SAS-L heavyweight Mike Rhoads casually told me that they were now available for Oracle!

When enabled, SAS in-database procedures translate common SAS procedures into Oracle-specific SQL code and pass the code to the Oracle database. There, they are executed within Oracle, and the result sets are passed back to your SAS program. This is in lieu of what happens now--SAS pulls all qualifying rows/variables out of the Oracle database, drags them across the network, and then processes them in the procedures on the AIX server, your Windows workstation, or whatever... Having the database do all of the crunching and heavy lifting facilitates two performance boosts: 1. Reduces the network bandwidth related to dragging lots of Oracle data between computers, and 2. Allows the beefier server to crunch the data faster than your little 'ole PC would/could. So, you can drastically reduce the time it takes to run certain procedures (I got about 70% reductions in wallclock time) by using SAS in-database procedures.

Only a select number of SAS base and stat procedures are in-database enabled, including PROC FREQ, PROC SUMMARY (and its cousin PROC MEANS), PROC TABULATE, and PROC RANK (despite the smell). So, you will have to go to the doc to get an idea of the toolset at your disposal:

SAS/Access 92. for Relational Databases: Reference, Third Edition

http://support.sas.com/documentation/cdl/en/acreldb/63283/HTML/default/viewer.htm#a003362045.htm

Invoking SAS in-database procedures is fun and easy; simply set the SQLGENERATION option to DBMS:

options sqlgeneration = dbms;

proc options option=sqlgeneration;

run;

And, to see how it is working, make sure to specify the SASTRACE option:

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

As a final note, check out this very well-written; very helpful SAS Global Forum 2010 paper by SAS Institute smarties Scott Mebust and Robert S. Ray:

SAS Presents In-Database Base Procedures in Practice

http://support.sas.com/resources/papers/proceedings10/300-2010.pdf

Integrating SAS in-database procedures into your approach to crunching your Oracle data may just be the thing you need to save you time, time, and more time!

Andy, best of luck in all your SAS endeavors!

I hope that this suggestion proves helpful now, and in the future!

Of course, all of these opinions and insights are my own, and do not reflect those of my organization or my associates. All SAS code and/or methodologies specified in this posting are for illustrative purposes only and no warranty is stated or implied as to their accuracy or applicability. People deciding to use information in this posting do so at their own risk.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Michael A. Raithel "The man who wrote the book on performance" E-mail: MichaelRaithel@westat.com

Author: Tuning SAS Applications in the MVS Environment

Author: Tuning SAS Applications in the OS/390 and z/OS Environments, Second Edition http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=58172

Author: The Complete Guide to SAS Indexes http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=60409

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Make of it what you will. - Leif Enger +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Back to: Top of message | Previous page | Main SAS-L page