Date: Mon, 16 Feb 2004 09:55:52 -0600
Reply-To: pudding man <pudding_man@MAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: pudding man <pudding_man@MAIL.COM>
Subject: Re: wanting to create variables with values in their name for
clusteranalysis
Content-Type: text/plain; charset="iso-8859-1"
It's a bit tedious, but this can be done with a
DATA step or two and a PROC TRANSPOSE. It could look
something like:
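data aa(keep = id varname varvalue);
   input id $ @;                /* read the ID, hold the line */
   length ex $32;
   /* word count = number of blanks + 1 (assumes single blanks) */
   words = length(_infile_) - length(compress(_infile_, ' ')) + 1;
   do i = 2 to words;           /* word 1 is the ID itself */
      ex = scan(_infile_, i, ' ');   /* split on blanks only, to match the count */
      varname = 'Ex_' || ex;
      varvalue = 1;
      output;                   /* one row per ID/exercise pair */
   end;
cards;
ID1 jog walk volleyball basketball
ID2 walk
ID3 basketball soccer
ID4 swim bowl jog walk
ID5 freeweights swim walk
;
run;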
/* one row per ID, one column per distinct varname */
proc transpose data = aa out = bb(drop = _name_);
   var varvalue;
   by id;
   id varname;
run;

/* view that replaces missing values with 0 */
data cc / view = cc;
   set bb;
   array num _numeric_;
   do over num;
      if num = . then num = 0;
   end;
run;

proc print data = cc; run;
The TRANSPOSE turns each distinct varname value into a
variable and stores 1's where a person reported that
exercise and missing values everywhere else. The DATA step
view just changes the missing values to 0's.
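
One caveat, offered as a sketch rather than as part of the code above: the BY statement in PROC TRANSPOSE expects AA to be in ID order, and the ID statement stops with an error if the same varname shows up twice for one person. If either could happen in the real data, a sort with NODUPKEY ahead of the transpose should cover both. The data set name aa_sorted is made up for illustration.

```sas
/* Sketch: put AA in ID order and drop repeated ID/exercise pairs. */
/* aa_sorted is a hypothetical name, not part of the original code. */
proc sort data = aa out = aa_sorted nodupkey;
   by id varname;
run;

/* Same transpose as before, reading the sorted, de-duplicated data. */
proc transpose data = aa_sorted out = bb(drop = _name_);
   var varvalue;
   by id;
   id varname;
run;
```

Note that the Ex_ columns would then be created in the order the names first appear in the sorted data, not the order they were typed.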
Hope it hep's ...
Skoal,
Puddin'
*******************************************************
***** Puddin' Man **** Pudding_Man-at-mail.com ********
*******************************************************;
"Now, I may look like I'm crazy,
but po' John do know right from wrong!"
-from "Drop Down, Mama", Sleepy John Estes
----- Original Message -----
From: jmhjmhjmh <jmhjmhjmh@EXCITE.COM>
Date: Sun, 15 Feb 2004 19:35:10 -0500
To: SAS-L@LISTSERV.UGA.EDU
Subject: wanting to create variables with values in
their name for clusteranalysis
> I am working with a data file that has an ID per person,
> and their list of exercises performed during a one week
> period. Simplified, it looks something like this:
>
> input id $ ex1 $ ex2 $ ex3 $ ex4 $;
> cards;
> ID1 jog walk volleyball basketball
> ID2 walk
> ID3 basketball soccer
> ID4 swim bowl jog walk
> ID5 freeweights swim walk
> ;
> There are 99 possible names of exercises that could be
> noted. There are varying numbers of observations per person,
> from 1 to about 25.
>
> I would like to create a data file that lists all possible
> exercises and then, for each person, create a binary (1/0)
> outcome: whether they ever did (1) or did not (0) do that
> exercise. Renamed variables would be like this: Ex_walk
> Ex_jog Ex_freeweights ... It would be nice to pass a file of
> possible exercises (or better, only those exercises in this
> particular data set) through a macro so as not to need to
> manually create these new var names AND to populate the 1's
> and 0's.
>
> I am then planning on cluster analysis to see what
> exercise types are more commonly done, as a group, by people
> in this study. I would like the revised data to look like
> the following, but if you can suggest other options, that
> would be fine:
>
>
> Input ID $ Ex_jog Ex_walk Ex_volleyball Ex_basketball
>       Ex_soccer Ex_freeweights Ex_swim;
> ID1 1 1 1 1 0 0 0
> ID2 0 1 0 0 0 0 0
> ID3 0 0 0 1 1 0 0
> And so on
>
> Any suggestions, besides the data should have been set up
> this way to begin with?
>
> Thanks for the help.
>
> Mark