Date: Sun, 10 May 2009 18:29:03 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: How to preserve all values that are duplicates,
including the first (non-duplicate) occurence
In-Reply-To: <eb1119f6-8802-42ab-9122-369f79a77c0b@j12g2000vbl.googlegroups.com>
Content-Type: text/plain; charset="us-ascii"
Anaconda:
While the 'L has given you many solutions from which to choose, I don't see why one would display multiples of a duplicated value. This basic query displays the same information in a more concise form.
proc sql;
select text,count(*) as n
from one
group by text
having count(*) > 1;
quit;
As you have stated the problem, you are looking to display each instance of a duplicated value in an attribute of a dataset. What purpose could you have for doing that if you cannot distinguish one value from another? Say you are looking for duplicates among primary key attribute values in a list that you plan to insert into a database table. Simply to locate the duplicates, you will have to include other information when you display them.
For that purpose, you may find use for a SAS Macro that I've written as a substitute for the undocumented and unsupported MONONIC() function:
%MACRO monomagic(DSN,out,seq=row,order=descending,top=);
/* SWH 5/10/2009 */
%IF (&top=) %THEN; %ELSE
%DO;
proc sql;
create view VW&DSN as
select * from &DSN
order by x &order
;
quit;
%END;
data &out;
set %IF (&top=) %THEN %DO; &DSN %END;
%ELSE %DO; VW&DSN %END;
;
&seq=_N_;
%IF (&top=) %THEN;
%ELSE %DO; IF &seq <= ⊤ %END;
run;
%MEND monomagic;
Use this scripting Macro (parameter values as below) to make specific the ordering of rows in dataset one. %monomagic with different parameter values also finds the top (or bottom) n values of an attribute.
_______________________
Extending your example:
data one;
infile datalines;
input text $10.;
datalines;
hello
heeelo
hello
heydy
heLLO
HELlo
hello
;
run;
/* This gives the distinct duplicate value
hello
*/
proc sql;
select text
from one
group by text
having count(*) > 1;
quit;
%monomagic(one,another,order=ascending)
proc sql;
select row,text
from another
group by text
having count(*) > 1
order by row;
quit;
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Anaconda
Sent: Saturday, May 09, 2009 10:34 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: How to preserve all values that are duplicates, including the first (non-duplicate) occurence
How do I take care of all datavalues of values that are duplicates, including the one that is not a duplicate, but a primary key-value? How do I take care of all three 'hello' in dataset ONE?
data one;
infile datalines;
input text $10.;
datalines;
hello
heeelo
hello
heydy
heLLO
HELlo
hello
;
run;
/* This gives the distinct duplicate value
hello
*/
proc sql;
select text
from one
group by text
having count(*) > 1;
quit;
/* And this gives the distinct non-duplicate values */
proc sort data = one out = two nodup;
by text;
run;
/* What I want is the non-distinct duplicate values. How?
Like this:
hello
hello
hello
*/
Regards
Anaconda