Date: Thu, 16 Sep 2010 12:56:18 -0400
Reply-To: Ya Huang <ya.huang@AMYLIN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ya Huang <ya.huang@AMYLIN.COM>
Subject: Re: Recoding based on frequencies
Content-Type: text/plain; charset=ISO-8859-1
Ok, based on your intepretation:
proc transpose data=one out=two;
by id;
var v:;
run;
proc sql;
create table three as
select *,count(col1) as freq
from two
where col1 ^=''
group by _name_,col1
order by _name_,freq,col1
;
data four;
set three (where=(freq > 0));
by _name_ freq col1;
if first._name_ then ncode=0;
if first.col1 then ncode+1;
run;
proc sort;
by id;
run;
proc transpose data=four out=five (drop=_name_);
by id;
id _name_;
var ncode;
run;
proc print;
run;
id v1 v2 v3 v5
1 3 1 1 1
2 3 2 2 1
3 3 3 3 1
4 2 3 4 1
5 2 3 5 1
6 1 3 . 1
This way, the tied records are coded differently, based on their
original order.
Ya
On Thu, 16 Sep 2010 11:47:10 -0400, Chang Chung <chang_y_chung@HOTMAIL.COM>
wrote:
>On Thu, 16 Sep 2010 11:28:53 -0400, Ya Huang <ya.huang@AMYLIN.COM> wrote:
>...
>>How about something like this, we don't even need to worry
>>about the type conversion:
>>
>>proc sql;
>>select a.id,v1,v2,v3
>>from (select id, count(v1) as v1 from one group by v1) as a left join
>> (select id, count(v2) as v2 from one group by v2) as b
>> on a.id=b.id left join
>> (select id, count(v3) as v3 from one group by v3) as c
>> on a.id=c.id
>>;
>> id v1 v2 v3
>>ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
>> 1 3 1 1
>> 2 3 1 1
>> 3 3 4 1
>> 4 2 4 1
>> 5 2 4 1
>> 6 1 4 0
>>
>>The problem is we don't know what those counts mapped to
>>from the final data.
>...
>Hi, Ya,
>I don't know for sure. I think it is reasonable to assume that the final
>values should be such that a most frequent value should get recoded into a
>highest number and the least frequent value to the lowest. And I think OP
>wants the highest number to be the same as the number of distinct values in
>the input variable and the lowest 1.
>Cheers,
>Chang
|