LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (October 2010, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 14 Oct 2010 08:42:51 -0700
Reply-To:     Arthur Tabachneck <art297@NETSCAPE.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Arthur Tabachneck <art297@NETSCAPE.NET>
Subject:      Re: Finding duplicate records with non-matching data
Content-Type: text/plain; charset=ISO-8859-1

Keith,

You should have sashelp.class, but here is a version that doesn't need it. I would run the following code after closing SAS and starting a brand new session:

/*Generate some test data*/ data All_Bills; input EffectDate $ ITAmnt BillNumb; cards; John 7 1 John 7 1 John 7 1 John 7 1 John 7 1 Joe 12 3 Joe 12 3 Joe 15 3 Harry 13 2 Joe 12 3 Joe 11 3 Mary 14 4 Mary 12 4 Mary 15 4 Mary 16 4 Mary 12 4 Mary 15 4 Mary 11 4 Mary 15 4 ;

/*Sort to find duplicates*/ PROC SORT DATA= All_Bills; BY EffectDate BillNumb ITAmnt; RUN;

*Find all duplicates, then all duplicates with unmatched data in ITAmnt field; DATA All_Duplicates; SET All_Bills; retain hold_ITAmnt; BY EffectDate BillNumb ITAmnt; /*find all duplicates*/ IF not(FIRST.BillNumb and LAST.BillNumb) THEN DO; if first.BillNumb then hold_ITAmnt=ITAmnt; else if ITAmnt ne hold_ITAmnt then unmatched=1; OUTPUT; end; RUN;

/*Sort to find unmatched*/ PROC SORT DATA= All_Duplicates; BY EffectDate BillNumb descending unmatched; RUN;

DATA UnMatched_Duplicates (drop= hold: unmatched gotone); set All_Duplicates; BY EffectDate BillNumb descending unmatched; retain gotone; if first.BillNumb then do; call missing(gotone); if unmatched then gotone=1; end; if gotone then output; run;

HTH, Art ---------- On Oct 14, 9:02 am, Friar Broccoli <elia...@gmail.com> wrote: > On Oct 13, 6:59 pm, Arthur Tabachneck <art...@netscape.net> wrote: > > > > > > > Friar, > > > In answer to your first question/statement, if your code does what you > > want, excellent! I would think that you need an extra data step, > > e.g.: > > > /*Generate some test data*/ > > data summ_lib.All_Bills; > > set sashelp.class (drop=sex height weight > > rename=( > > name=EffectDate > > age=ITAmnt > > )); > > if _n_ < 6 then do; > > EffectDate="John"; > > BillNumb=1; > > ITAmnt=7; > > end; > > else if _n_ eq 9 then do; > > EffectDate="Harry"; > > BillNumb=2; > > end; > > else if _n_ < 12 then do; > > EffectDate="Joe"; > > BillNumb=3; > > end; > > else do; > > EffectDate="Mary"; > > BillNumb=4; > > end; > > run; > > > /*Sort to find duplicates*/ > > PROC SORT > > DATA= SUMM_LIB.All_Bills > > OUT = SUMM_LIB.All_Bills; > > BY EffectDate BillNumb ITAmnt; > > RUN; > > > *Find all duplicates, then all duplicates with unmatched data in > > ITAmnt field; > > DATA SUMM_LIB.All_Duplicates; > > SET SUMM_LIB.All_Bills; > > retain hold_ITAmnt; > > BY EffectDate BillNumb ITAmnt; > > /*find all duplicates*/ > > IF not(FIRST.BillNumb and LAST.BillNumb) THEN DO; > > if first.BillNumb then hold_ITAmnt=ITAmnt; > > else if ITAmnt ne hold_ITAmnt then unmatched=1; > > OUTPUT; > > end; > > RUN; > > > /*Sort to find unmatched*/ > > PROC SORT > > DATA= SUMM_LIB.All_Duplicates > > OUT = SUMM_LIB.All_Duplicates; > > BY EffectDate BillNumb descending unmatched; > > RUN; > > > DATA SUMM_LIB.UnMatched_Duplicates (drop= > > hold: unmatched gotone); > > set SUMM_LIB.All_Duplicates; > > BY EffectDate BillNumb descending unmatched; > > retain gotone; > > if first.BillNumb then do; > > call missing(gotone); > > if unmatched then gotone=1; > > end; > > if gotone then output; > > run; > > Tried to run your code (in many different ways) but got (variations > on): > > VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV > 2078 > 2079 / > *------------------------------------------------------------- > 2080 SUMMARY REPORTS FOR ALL DAYS. > 2081 > -------------------------------------------------------------*/ > 2082 TITLE; > 2083 *MARKER; > 2084 > 2085 data summ_lib.All_Bills; > 2086 set sashelp.class (drop=sex height > weight rename=(name=EffectDate age=ITAmnt)); > ____ > ____ > ____ > 180 > 180 > 180 > 2087 if _n_ < 6 then do; > _ > _ > _ > 180 > 180 > 180 > 2088 EffectDate="John"; > _ > _ > _ > 180 > 180 > 180 > 2089 BillNumb=1; > _ > _ > _ > 180 > 180 > 180 > 2090 ITAmnt=7; > _ > _ > _ > 180 > 180 > 180 > 2091 end; > _ > _ > _ > 180 > 180 > 180 > 2092 else if _n_ eq 9 then do; > _ > _ > _ > 180 > 180 > 180 > 2093 EffectDate="Harry"; > _ > _ > _ > 180 > 180 > 180 > 2094 BillNumb=2; > _ > _ > _ > 180 > 180 > 180 > 2095 end; > _ > _ > _ > 180 > 180 > 180 > 2096 else if _n_ < 12 then do; > _ > _ > _ > 180 > 180 > 180 > 2097 EffectDate="Joe"; > _ > _ > _ > 180 > 180 > 180 > 2098 BillNumb=3; > _ > _ > _ > 180 > 180 > 180 > 2099 end; > _ > _ > _ > 180 > 180 > 180 > 2100 else do; > _ > _ > _ > 180 > 180 > 180 > 2101 EffectDate="Mary"; > _ > _ > _ > 180 > 180 > 180 > 2102 BillNumb=4; > _ > _ > _ > 180 > 180 > 180 > 2103 end; > _ > _ > _ > 180 > 180 > 180 > ERROR 180-322: Statement is not valid or it is used out of proper > order. > ERROR 180-322: Statement is not valid or it is used out of proper > order. > ERROR 180-322: Statement is not valid or it is used out of proper > order. > > 2104 run; > > NOTE: The SAS System stopped processing this step because of errors. > > AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA > > I assume I do not have "sashelp.class". I spent some time trying to > figure out what it was so I could get it, but found nothing. > > Note that I do not run inside the GUI and indeed have little idea of > how to use it, which may be my problem - not sure. > > > > > > > Your second question is a bit more difficult, thus I'm cc'ing SAS-L so > > that Joe can respond to you directly. As far as I know, comp.soft- > > sys.sas and SA...@LISTSERV.UGA.EDU haven't been on speaking terms for > > almost a year now. I do know that Joe is looking into it but, as yet, > > we're still separated. I don't think that the term "divorce" has been > > used yet! > > > Art > > ----------- > > On Oct 13, 7:42 am, Friar Broccoli <elia...@gmail.com> wrote: > > > > For years I have been looking for a simple way to > > > find all duplicate records with non-matching data. > > > I just tripped over the solution which follows: > > > > VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV > > > > /* Sort for merge */ > > > PROC SORT > > > DATA= SUMM_LIB.All_Bills > > > OUT = SUMM_LIB.All_Bills; > > > BY EffectDate BillNumb ITAmnt; > > > RUN; > > > > *Find all duplicates, then all duplicates with unmatched data in > > > ITAmnt field; > > > DATA SUMM_LIB.All_Duplicates > > > SUMM_LIB.UnMatched_Duplicates; > > > SET SUMM_LIB.All_Bills; > > > BY EffectDate BillNumb ITAmnt; > > > IF ((FIRST.BillNumb ^= 1) OR (LAST.BillNumb ^= 1)) THEN > > > DO; > > > OUTPUT SUMM_LIB.All_Duplicates; > > > IF ((FIRST.ITAmnt = 1) AND (LAST.ITAmnt = 1)) THEN OUTPUT > > > SUMM_LIB.UnMatched_Duplicates; > > > END; > > > RUN; > > > > AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA > > > > Can anybody tell me why the SAS group is no longer echoed to and from > > > usenet?- Hide quoted text - > > > - Show quoted text -- Hide quoted text - > > - Show quoted text -- Hide quoted text - > > - Show quoted text -


Back to: Top of message | Previous page | Main SAS-L page