LISTSERV at the University of Georgia
Date:         Fri, 20 Jun 2008 17:42:21 -0400
Reply-To:     Peter Flom <peterflomconsulting@mindspring.com>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Flom <peterflomconsulting@MINDSPRING.COM>
Subject:      Re: Invitation for outlier tracking schemes
Comments: To: Mary <mlhoward@avalon.net>
Content-Type: text/plain; charset=UTF-8

Given that Frank Harrell wrote it, probably no one called tech support because it *worked*.

In R, Frank has a very similar function called "describe" (in his Hmisc package), which chooses good defaults based on the nature of each variable. For quantitative variables, it gives the mean, about 10 quantiles, and the five largest and five smallest values. AND you can run it on a whole bunch of variables at once, and it will do sensible things for each.
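For anyone who wants the flavor of that output without R, here is a rough sketch in Python using only the standard library. It is not Frank's code and does not reproduce describe's actual defaults; the function names (describe_numeric, describe_all) and the sample fish/temperature values are made up for illustration. It just reports the count, mean, deciles, and the five lowest and highest values for each numeric variable.

```python
import statistics


def describe_numeric(values, n_extremes=5):
    """Summarise one numeric variable: n, mean, deciles, and extremes.

    Hypothetical helper for illustration; missing values are given as None.
    """
    data = sorted(v for v in values if v is not None)
    if len(data) < 2:
        raise ValueError("need at least two non-missing values")
    # Nine cut points at 10%, 20%, ..., 90% of the sorted data.
    deciles = statistics.quantiles(data, n=10, method="inclusive")
    return {
        "n": len(data),
        "mean": statistics.fmean(data),
        "deciles": deciles,
        "lowest": data[:n_extremes],
        "highest": data[-n_extremes:],
    }


def describe_all(columns, n_extremes=5):
    """Apply describe_numeric to every variable in a dict of name -> values."""
    return {name: describe_numeric(vals, n_extremes)
            for name, vals in columns.items()}


# Made-up data with a couple of planted bad values (900, 12, 99.9).
sample = {
    "length_mm": [120, 135, 128, 140, 900, 131, 126, 138, 122, 133, 129, 12],
    "water_temp_c": [14.2, 15.1, 14.8, None, 15.5, 16.0, 14.9, 15.2,
                     99.9, 15.0, 14.7, 15.3],
}

summary = describe_all(sample)
for name, stats in summary.items():
    print(name, "lowest:", stats["lowest"], "highest:", stats["highest"])
```

The planted values show up at the ends of the lowest and highest lists, which is the whole point of printing the extremes next to the quantiles.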

Peter

-----Original Message-----
>From: Mary <mlhoward@AVALON.NET>
>Sent: Jun 20, 2008 3:32 PM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Re: Invitation for outlier tracking schemes
>
>Probably nobody called tech support because Nat answered all the questions on SAS-L :-)
>
>-Mary
>
>  ----- Original Message -----
>  From: Terjeson, Mark
>  To: SAS-L@LISTSERV.UGA.EDU
>  Sent: Friday, June 20, 2008 2:21 PM
>  Subject: Re: Invitation for outlier tracking schemes
>
>  Thanks, Nat.
>  Appreciate your discussion.
>
>  ...as far as <<<"Apparently, since tech support almost never had any
>  calls about it, it was assumed to not be used much.">>>
>
>  ...and to think I used to think that no calls to tech support was a
>  good thing.... :o)
>
>  Must be one of those new-tech-no-tech physical sciences.
>
>  Mark
>
>  -----Original Message-----
>  From: Nathaniel.Wooding@dom.com [mailto:Nathaniel.Wooding@dom.com]
>  Sent: Friday, June 20, 2008 12:14 PM
>  To: Terjeson, Mark
>  Cc: SAS-L@LISTSERV.UGA.EDU
>  Subject: Re: Invitation for outlier tracking schemes
>
>  Mark
>
>  I typically work with the following types of data:
>
>  1 Fish names, lengths, and weights
>  2 Water temperatures (hourly records)
>  3 Results of water chemistry analyses from environmental samples.
>
>  Much of my qa procedure involves graphics that are either static (ie, I
>  run a plot) or use a scatter plot in SAS/Insight. With the latter, you
>  can click on a point of interest and then look at the particular
>  observation.
>
>  Most of my chemistry data are highly variable depending on where they
>  were collected and the nature of the site. Hence, with a few exceptions
>  (eg, a pH is typically around 7 and definitely in the range 0 - 14 ), I
>  really cannot design any routines that would check them.
>
>  The other day, someone mentioned the SAS Supplemental Library that went
>  away with V6. It had a neat little proc called DATACHEK that was written
>  by Frank Harrell. Among a few other stats, it would give you the 5
>  highest and lowest values for each numeric variable in a dataset. I
>  always included this in jobs that read raw data and I have long mourned
>  it's passing. Apparently, since tech support almost never had any calls
>  about it, it was assumed to not be used much.
>
>  Nat
>
>  Nat Wooding
>  Environmental Specialist III
>  Dominion, Environmental Biology
>  4111 Castlewood Rd
>  Richmond, VA 23234
>  Phone:804-271-5313, Fax: 804-271-2977
>
>  From: "Terjeson, Mark" <Mterjeson@RUSSELL.COM>
>  Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
>  Date: 06/20/2008 12:15 PM
>  To: SAS-L@LISTSERV.UGA.EDU
>  cc:
>  Subject: Invitation for outlier tracking schemes
>  Please respond to: "Terjeson, Mark" <Mterjeson@RUSSELL.COM>
>
>  Hi All,
>
>  re: Invitation for outlier tracking schemes
>
>  The identification of outliers we know can be
>  accommodated by a wide variety of approaches,
>  coding schemes, and procs. We also know that
>  the identification of outliers also can be very
>  dependent on data content rules, etc.
>
>  This is an invitation for a wide variety of ideas
>  or methods you frequently use for locating
>  outliers in data series'. i.e. any favorite procs
>  or datastep/sql schemes you have used.
>
>  Some schemes are good for significant single
>  outliers in a data series while may begin to
>  degradate quickly when more than one outlier
>  starts showing up in the series. Once severe
>  outliers are removed and the noise starts to
>  narrow the schemes for locating outliers must
>  become more complicated and sophisticated.
>
>  Please mention your favorite schemes for the
>  easy tracking and those for trickier situations.
>  A brief quick comment on the pros&cons of
>  the approach would help describe where and
>  when the approach may be appropriately used.
>
>  Mark Terjeson
>  Senior Programmer Analyst
>  Investment Management & Research
>  Russell Investments
>  253-439-2367
>
>  Russell
>  Global Leaders in Multi-Manager Investing
>
>  The information contained in this message is intended only for the use
>  of the recipient named above. This message may contain confidential
>  or undisclosed information. If the reader of this message is not the
>  intended recipient or an agent responsible for delivering to the
>  intended recipient, you are hereby notified that you have received
>  this message in error, and that any review, dissemination,
>  distribution or copying of it is strictly prohibited. If you have
>  received this message in error, please notify us by telephone
>  immediately at 253-439-2367. Thank you for your cooperation.
>
>  -----------------------------------------
>  CONFIDENTIALITY NOTICE: This electronic message contains
>  information which may be legally confidential and/or privileged and
>  does not in any case represent a firm ENERGY COMMODITY bid or offer
>  relating thereto which binds the sender without an additional
>  express written confirmation to that effect. The information is
>  intended solely for the individual or entity named above and access
>  by anyone else is unauthorized. If you are not the intended
>  recipient, any disclosure, copying, distribution, or use of the
>  contents of this information is prohibited and may be unlawful. If
>  you have received this electronic transmission in error, please
>  reply immediately to the sender that you have received the message
>  in error, and delete it. Thank you.
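
Nat's pH remark in the quoted thread (typically around 7, definitely within 0 - 14) is the one case he mentions where a hard rule works. A minimal sketch of that kind of fixed-limit check, in Python, might look like the following. The record layout, the field name "ph", and the flag_out_of_range helper are assumptions for illustration, not anything from Nat's jobs.

```python
def flag_out_of_range(records, limits):
    """Return (row index, field, value) for every value outside its limits.

    records: list of dicts, one per observation (None marks a missing value).
    limits:  dict mapping a field name to an inclusive (low, high) pair.
    """
    flagged = []
    for i, rec in enumerate(records):
        for field, (low, high) in limits.items():
            value = rec.get(field)
            if value is None:
                continue  # missing values are skipped, not flagged
            if not (low <= value <= high):
                flagged.append((i, field, value))
    return flagged


# Hypothetical limits: pH must lie in 0 - 14, as stated in the thread.
limits = {"ph": (0.0, 14.0)}

# Made-up water chemistry rows; 71.0 looks like a misplaced decimal point.
records = [
    {"site": "A", "ph": 7.1},
    {"site": "B", "ph": 71.0},
    {"site": "C", "ph": None},
    {"site": "D", "ph": -0.5},
]

flagged = flag_out_of_range(records, limits)
for row, field, value in flagged:
    print(f"row {row}: {field}={value} is outside {limits[field]}")
```

As Nat says, this only helps for variables with known physical bounds; for site-dependent chemistry values, plots or a listing of the extremes are the more practical check.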

Peter L. Flom, PhD Statistical Consultant www DOT peterflom DOT com

