LISTSERV at the University of Georgia
Date:         Wed, 22 Feb 2006 11:06:50 -0800
Reply-To:     "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Choate, Paul@DDS" <pchoate@DDS.CA.GOV>
Subject:      Re: studying repeats
Comments: To: "lrbowes2@yahoo.com" <lrbowes2@YAHOO.COM>
Content-Type: text/plain; charset="us-ascii"

If the data aren't too large (fewer than about 64K rows), I often put this sort of data into Excel (with the libname engine) and use a pivot table to look at the problems. For me it's very interactive: I can rapidly drill across and down to inspect the situation, and I then use what I find to inform my SAS programming. If the data are large, I often subset or sample them and do the same.

I agree that it's important to keep your key fields separate and use multi-level sorts rather than a concatenated key. This also lets you look for problems within each level independently.
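To illustrate the idea of counting repeats on separate key fields, here is a minimal sketch in Python (not SAS, and not the code I actually run). The field names `zip` and `period`, the sample rows, and the helper `repeated_keys` are all made up for the example.

```python
from collections import Counter

def repeated_keys(rows, key_fields=("zip", "period")):
    """Return {key tuple: count} for key combinations seen more than once.

    key_fields are hypothetical column names; the key stays a tuple of
    separate fields instead of one concatenated string.
    """
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}

# Made-up sample data: one row per (zip code, time period) observation.
rows = [
    {"zip": "95814", "period": "2005Q1"},
    {"zip": "95814", "period": "2005Q1"},
    {"zip": "95814", "period": "2005Q2"},
    {"zip": "30602", "period": "2005Q1"},
    {"zip": "30602", "period": "2005Q1"},
    {"zip": "30602", "period": "2005Q1"},
]

repeats = repeated_keys(rows)
for (zip_code, period), n in sorted(repeats.items()):
    print(zip_code, period, n)
```

Because the key is a tuple, the same function can be called with `key_fields=("zip",)` or `key_fields=("period",)` to look at each level on its own, which a concatenated variable does not allow as easily.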

hth

Paul Choate DDS Data Extraction (916) 654-2160

> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> lrbowes2@yahoo.com
> Sent: Tuesday, February 21, 2006 2:02 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: studying repeats
>
> Hi.
>
> I need to look at how many times and which various combinations of data
> are repeated (zip code and time period). (Ideally, there should only
> be one instance of each zip code, time period combo, so I want to learn
> more about cases where that is false.) I created a variable that is
> the concatenation of the values of zip code and time period. What is
> the best way of getting summary stats on the repeats?
>
> I don't want to just use nodup because I want to know what was repeated
> and how many times it was repeated.
>
> Ideally, I guess I would maybe do a proc freq on the concatenated
> variable, but I want to get rid of all instances where the frequency is
> 1.
>
> Thanks.
>
> -Lori

