Date: Wed, 2 May 2007 20:04:04 -0700
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: Sample data set with about five or more predictor and one
Content-Type: text/plain; format=flowed
firstname.lastname@example.org sagely replied:
>David L Cassell <davidlcassell@MSN.COM> wrote
> >Wouldn't it be a lot better to build the data sets yourself?
> >SAS is an excellent data generation tool, you know. And building the
> >data sets yourself ensures that all the features you want will be found in
> >the data, while none of the nightmares you want to avoid will be
>David is, as usual, entirely correct.
>But in some circumstances, it is better to NOT know what features are
>If one wants to practice some actual data analysis, then one might be
>better served analyzing a data set where one does NOT know all about the
>Is that point an outlier? hmmmm
>Is the distribution normal? Well, it's CLOSE to normal....is it close
>Why doesn't the model fit?
>Not that I am disagreeing with David. It just depends on what the original
>poster wanted to do
I take the point of view that a 'weird' value is only an outlier after the
subject-matter experts get through examining it and the QC people have checked it
thoroughly. After all, in lots of data sets it is the outliers which carry
interesting information, and throwing them out would therefore be bad. Ask
any astrophysicist. :-)
David L. Cassell
3115 NW Norwood Pl.
Corvallis OR 97330