LISTSERV at the University of Georgia
Date:         Mon, 23 Apr 2007 23:55:21 +0000
Reply-To:     toby dunn <tobydunn@HOTMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         toby dunn <tobydunn@HOTMAIL.COM>
Subject:      Re: Index file too large- is this a problem?
Comments: To: rajasekhargo@YAHOO.COM
In-Reply-To:  <200704232300.l3NMCkhw023847@mailgw.cc.uga.edu>
Content-Type: text/plain; format=flowed

Sounds like you shouldn't use ID as part of your index. ID will yield no better results than if you just sorted and merged on ID, because it is too discrete. So I doubt you will see much of an increase in speed.
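For illustration, the sort-and-merge approach might look roughly like the sketch below. The library MYLIB, the datasets RETURNS and WANTED, and the variable names are assumptions made for the example, not names from the original post.

```sas
/* Assumed names: MYLIB.RETURNS holds ID, DATE and the return value. */
/* WORK.WANTED is a hypothetical dataset with one row per ID of interest. */

/* Sort the large table once by ID and DATE */
proc sort data=mylib.returns out=work.returns_sorted;
  by id date;
run;

/* Sort the list of wanted IDs the same way */
proc sort data=work.wanted;
  by id;
run;

/* Keep only rows whose ID appears in both datasets */
data work.subset;
  merge work.wanted(in=in_wanted)
        work.returns_sorted(in=in_returns);
  by id;
  if in_wanted and in_returns;
run;
```

If the table is stored already sorted by ID and DATE, the first PROC SORT step can be skipped on later runs.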

Toby Dunn

You can see a lot by just looking. ~Yogi Berra

Do not seek to follow in the footsteps of the wise. Seek what they sought. ~Matsuo Basho

You never know what is enough, until you know what is more than enough. ~William Blake, Proverbs of Hell

From: Raj <rajasekhargo@YAHOO.COM>
Reply-To: Raj <rajasekhargo@YAHOO.COM>
To: SAS-L@LISTSERV.UGA.EDU
Subject: Index file too large- is this a problem?
Date: Mon, 23 Apr 2007 19:00:15 -0400

Hello,

We have a SAS dataset with just 3 variables: an ID, a date, and a return field. There are about 20,000 unique IDs, and each of them has a return value on each date (ranging over the last 30 years). We store this data cumulatively, so the table has 20,000 * 280 business days = 5.6 million records. We usually query a small percentage of the data in this table (e.g., 1 year of data for one ID, about 280 records), so I thought it would make sense to create a composite index on ID and date. But since there are so many unique key values, the index file turned out to be quite large (~ 1GB). The original dataset is not significantly larger than this, because it has just one additional variable.
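For readers following along, a composite index like the one described could be created and queried roughly as sketched below. The library MYLIB, the dataset RETURNS, the index name ID_DATE, the variable names, and the sample ID and date range are all assumptions for the example, and ID is assumed to be numeric.

```sas
/* Assumed names: MYLIB.RETURNS with variables ID, DATE and a return value. */

/* Ask SAS to write a note to the log saying whether an index was used */
options msglevel=i;

/* Create a composite index on ID and DATE */
proc datasets library=mylib nolist;
  modify returns;
  index create id_date = (id date);
quit;

/* Pull one year of data for a single ID. */
/* The WHERE statement is what allows SAS to consider the index. */
data work.one_year;
  set mylib.returns;
  where id = 12345
    and date between '01JAN2006'd and '31DEC2006'd;
run;
```

With MSGLEVEL=I set, the log reports which index, if any, was selected for the WHERE processing, which is a quick way to check whether the index is helping at all.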

When a query is submitted in SAS, I understand that the index file is first loaded into memory and then the selected records are returned. Since our index file is so large, will this cause any problems? FYI, we have a client-server setup where all SAS processing happens on one central server. If multiple clients query the same dataset simultaneously, will SAS load a separate copy of the index file for each user, or can it reuse the copy already loaded into memory?

If there is a better table design for storing the same data, I welcome any suggestions.

Thanks in advance,

Raj


