Date: Tue, 5 Sep 2006 09:22:05 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: Enterprise miner problem
Content-Type: text/plain; charset="iso-8859-1"
Sounds like a good opportunity for you to analyze what is happening and find a good solution....
I'd look first at obvious bottlenecks. Is EM trying to transfer millions of rows from a table on a server to another node on the network? Moving large volumes of data takes time, and network transfer speeds will slow things down. Are you following the dubious practice of searching very large numbers of predictors for a relevant model? (If so, search the SAS-L Archives for postings on 'step-wise selection'.)
A good 150,000-row sample of a larger database should give you good estimates of all but very small proportions. The key word is 'good'. If you know little about your data source and choose a sample that underrepresents an important subgroup, you could understate the importance of that subgroup and find a good fit for a bad model.
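To put rough numbers on "all but very small proportions", here is a minimal sketch in Python (chosen only for illustration; the same arithmetic is easy in a SAS DATA step). It assumes a simple random sample and ignores the finite population correction. The function names and the example proportions are hypothetical; only the 150,000 sample size comes from this thread.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)


def relative_standard_error(p, n):
    """Standard error expressed as a fraction of the proportion itself."""
    return proportion_standard_error(p, n) / p


n = 150_000  # sample size mentioned in the thread

# Illustrative proportions only; they are not taken from any real data.
for p in (0.5, 0.05, 0.001, 0.0001):
    se = proportion_standard_error(p, n)
    rse = relative_standard_error(p, n)
    print(f"p={p:<7} expected rows={p * n:>8.0f}  SE={se:.6f}  relative SE={rse:.1%}")
```

Under these assumptions the relative standard error stays well under 1% for common outcomes, but for a proportion of 1 in 10,000 (about 15 expected rows in the sample) it is roughly a quarter of the estimate itself.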
EM and all other automated data mining programs may facilitate modelling but, as with all statistical packages, also make it too easy to go astray. Don't expect easy answers.
Remember to train models on one sample, test models on another independent sample, and, after selecting a specific model, evaluate its performance on one or more samples not used in training and testing. You have enough data to implement that strategy.
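The train / test / evaluate strategy above can be sketched as follows. This is a hedged illustration in Python rather than anything specific to EM; the function name, the 60/20/20 fractions, and the seed are assumptions made for the example, not values recommended in this thread.

```python
import random


def three_way_split(rows, train=0.6, test=0.2, seed=2006):
    """Shuffle rows and cut them into training, test, and holdout lists.

    The fractions are illustrative assumptions. Whatever is left after
    the training and test portions becomes the holdout sample.
    """
    if train <= 0 or test <= 0 or train + test >= 1:
        raise ValueError("train and test must be positive and sum to less than 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    holdout = shuffled[n_train + n_test:]
    return training, testing, holdout


# Stand-in for row identifiers of a 150,000-row sample.
row_ids = range(150_000)
train_rows, test_rows, holdout_rows = three_way_split(row_ids)
print(len(train_rows), len(test_rows), len(holdout_rows))
```

The point of the holdout list is that it plays no part in fitting or in choosing among models, so performance measured on it is not flattered by the selection step.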
Best wishes for exciting challenges and success in your academic work.
From: firstname.lastname@example.org on behalf of email@example.com
Sent: Tue 9/5/2006 7:51 AM
Subject: Enterprise miner problem
I am a student working on a project. I have Enterprise Miner and am
working on a dataset (there are millions of records) which is on a
server. The problem is that when I tried to connect the data in the "Input
Data" node, the computer and SAS software became very slow. I had to wait
more than 30 minutes to get a response. I have taken a sample from the
same dataset (more than 150,000 records) and saved it on a local drive, and
it works properly. If this problem persists, is there any solution
without using the Enterprise Miner GUI?