Date: Thu, 16 Sep 2010 13:16:36 -0400
Reply-To: oloolo <dynamicpanel@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: oloolo <dynamicpanel@YAHOO.COM>
Subject: Re: Recoding based on frequencies
A hash object is the way to do it:
data one;
input id (v1-v5) ($);
cards;
1 a a a . a
2 a b b . a
3 a c c . a
4 b c d . a
5 b c e . a
6 c c . . a
;
run;
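data two;
   /* Hash keyed by (variable name, value); the data item is the running count. */
   declare hash h();
   h.defineKey('varname', 'varval');
   h.defineData('count');
   h.defineDone();
   length id 3;
   array _v{*} $ v1-v5;

   /* Pass 1: count each non-missing value within each variable. */
   do until (eof1);
      set one end=eof1;
      do j=1 to dim(_v);
         if missing(_v[j]) then continue;  /* keep missings out of the hash */
         varname=vname(_v[j]); varval=vvalue(_v[j]);
         if h.find()=0 then do; count+1; rc=h.replace(); end;
         else do; count=1; rc=h.add(); end;
      end;
   end;

   /* Pass 2: replace each value with its count; missing values become 'MISS'. */
   do until (eof2);
      set one end=eof2;
      do j=1 to dim(_v);
         varname=vname(_v[j]); varval=vvalue(_v[j]);
         /* cats() converts the count to left-aligned text without a conversion note */
         if h.find()=0 then _v[j]=cats(count);
         else _v[j]='MISS';
      end;
      output;
   end;
   drop rc varname varval j count;
run;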
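
A quick way to check the result, as a sketch only and assuming data sets one and two from above are still in WORK: tabulate the original values with PROC FREQ and compare each frequency against the recoded values printed from two. The NOCUM and NOPERCENT options are only there to keep the tables short.

```sas
/* Frequencies of the original values, one table per variable. */
proc freq data=one;
   tables v1-v5 / nocum nopercent;
run;

/* Recoded data: each non-missing value should now equal its frequency above. */
proc print data=two noobs;
run;
```

For v1 in the test data, the frequency table should report a=3, b=2, c=1, so the first three rows of two should show 3 for v1.
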
On Thu, 16 Sep 2010 09:55:57 -0400, Chang Chung <chang_y_chung@HOTMAIL.COM>
wrote:
>Hi,
>Saw this interesting question posted somewhere else. I tried, but could not
>come up with a neat solution. Can you? Thanks.
>
>"I have a series of categorical variables that I would like to recode based
>on their frequency/count. [...] So, for example, if I had a series of
>records in the variable being a, a, a, b, b, c, I would like to recode my
>variable so that 'a' (having the highest count) would be coded as 3 and
>'c' (having the lowest count) would be coded as 1. Since I have a series of
>variables it would be hard to recode them manually so was wondering whether
>there was a command to easily do this."
>
>Below is some test data I made up.
>
>/* test data */
>data one;
> input id (v1-v5) ($);
>cards;
>1 a a a . a
>2 a b b . a
>3 a c c . a
>4 b c d . a
>5 b c e . a
>6 c c . . a
>;
>run;