LISTSERV at the University of Georgia
Date:         Sat, 9 May 2009 07:02:28 -0700
Reply-To:     foolishfish.chen@GMAIL.COM
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         foolishfish.chen@GMAIL.COM
Organization: http://groups.google.com
Subject:      Oversampling Questions. Please help
Comments: To: sas-l@uga.edu
Content-Type: text/plain; charset=ISO-8859-1

1. When selecting observations from the good data, how many observations are ideal? In other words, what is the best good-to-bad ratio after oversampling?

2. I always work on very rare event projects, and I am always concerned about whether my random sample from the good data represents the whole population well (say, 3k out of 500k). Is there a way to select a sample that has characteristics similar to those of the whole population, for example a similar mean and a similar variance?

3. Currently, I deal with the issue above by bootstrapping the data to get n samples, running the regression/decision tree n times, and ensembling the n models in EM. But the results are not as expected. Has anyone had the same experience, and can you help me out?
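
To make the procedure in point 3 concrete, here is a rough sketch in Python rather than in EM. It is only an illustration under several assumptions: the data are synthetic, there is a single numeric feature on which bads tend to score higher, and a toy midpoint-threshold rule stands in for the regression/decision tree. The names fit_threshold, bagged_thresholds and ensemble_score are hypothetical.

```python
import random
import statistics

def fit_threshold(goods, bads):
    """Toy one-feature 'model' (assumption): cut halfway between class means."""
    return (statistics.fmean(goods) + statistics.fmean(bads)) / 2.0

def bagged_thresholds(goods, bads, n_models=25, goods_per_model=None, seed=0):
    """Fit n_models toy models, each on a fresh bootstrap sample.

    Every round resamples the bads with replacement and draws an
    equally sized (by default) bootstrap sample of goods, so each
    model sees a roughly 1:1 good-to-bad ratio.
    """
    rng = random.Random(seed)
    k = goods_per_model or len(bads)
    models = []
    for _ in range(n_models):
        good_sample = rng.choices(goods, k=k)         # with replacement
        bad_sample = rng.choices(bads, k=len(bads))   # with replacement
        models.append(fit_threshold(good_sample, bad_sample))
    return models

def ensemble_score(models, x):
    """Fraction of models that vote 'bad' for feature value x."""
    return sum(x > t for t in models) / len(models)

# Synthetic rare-event data (assumption): 20,000 goods, 200 bads.
rng = random.Random(1)
goods = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
bads = [rng.gauss(3.0, 1.0) for _ in range(200)]

models = bagged_thresholds(goods, bads)
print(ensemble_score(models, 5.0), ensemble_score(models, -2.0))
```

The vote fraction is a ranking score, not a calibrated event probability, because every model was trained at an artificial good-to-bad ratio.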

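For the concern in point 2, one simple check is to compare the sample's mean and variance with those of the full good population before modeling. The sketch below is in Python, uses synthetic values as a stand-in for one numeric variable (an assumption, with 50,000 records instead of 500k to keep it quick), and the helper name sample_vs_population is hypothetical.

```python
import random
import statistics

def sample_vs_population(values, k, seed=0):
    """Draw a simple random sample of size k (without replacement)
    and report its mean/variance next to the population's."""
    rng = random.Random(seed)
    sample = rng.sample(values, k)
    return {
        "pop_mean": statistics.fmean(values),
        "sample_mean": statistics.fmean(sample),
        "pop_var": statistics.pvariance(values),
        "sample_var": statistics.variance(sample),
    }

# Synthetic 'good' population (assumption): one numeric variable.
rng = random.Random(42)
goods = [rng.gauss(100.0, 15.0) for _ in range(50_000)]

report = sample_vs_population(goods, 3_000)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

If the gap is too large on a variable that matters, drawing the sample within strata of that variable is one option to consider.
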
Can any expert help?

Thanks in advance.

