LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (September 2010, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 23 Sep 2010 15:13:44 -0400
Reply-To:     Ming Chen <chenming@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Ming Chen <chenming@GMAIL.COM>
Subject:      why random access is slower?
Content-Type: text/plain; charset=ISO-8859-1

I have to do a left join one table with 4 other tables.

Following is the table information:

Big1 : 45 millions rows with 10 vars, 3.5GB compressed in size

Big2 : 17 million rows with 5 vars, 1GB compressed

Big3 : 8 million rows with 5 vars, 800MB compressed

Small1 : 54 thousand rows with 12 vars

Small2 : 300 rows with 5 vars

I thought I came with a faster join methods by doing these following steps:

1. give each table a new column with the positions of each row. (for example, Seq = _*n*_)

2. join the 5 tables together by only keeping the keys and position column in each table

3. Based on the result of steps, using random access date set options point to generate the final results.

What surprised me is that step 3 is very slow and it is going to take 24 hours to get the job done. I have tried the SGIO option but with very limited improvement. the bottleneck is the I/O. Is there is way to improve? I don't know whether direct proc sql can do any better if I have enough SAS work space available to let me try out.

Thanks.


Back to: Top of message | Previous page | Main SAS-L page