Date: Wed, 25 Jan 2006 11:35:55 -0500
Reply-To: Gerhard Hellriegel <ghellrieg@T-ONLINE.DE>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Gerhard Hellriegel <ghellrieg@T-ONLINE.DE>
Subject: Re: SAS Data Step as prelude to logistic modeling
That seems to be a thing for a macro.
I assume you already have a dataset with the numeric variables you want to
use in your calculation.
I also assume you know the exact number of variables (you can make that
dynamic too, but it's easier if you don't need to...)
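If the count does have to be dynamic, one possible approach is to read the variable names from DICTIONARY.COLUMNS. This is only a sketch: the data set name WORK.HAVE and the macro variable names VARLIST and FCOUNT are assumptions for illustration, and it assumes ID and TARGET are the only numeric variables that should be left out.

```sas
/* Sketch only: WORK.HAVE, VARLIST and FCOUNT are hypothetical names. */
proc sql noprint;
  /* Collect the numeric variable names, in data set order, into one list. */
  select name
    into :varlist separated by ' '
    from dictionary.columns
    where libname = 'WORK'
      and memname = 'HAVE'
      and type    = 'num'
      and upcase(name) not in ('ID', 'TARGET')
    order by varnum;
quit;

/* SQLOBS holds the number of rows the last SELECT returned. */
%let FCount = &sqlobs;
%put NOTE: FCount=&FCount varlist=&varlist;
```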
So the problem, as I understand it, is to repeat statements like
and others 400 times.
Make a macro:
%do i=1 %to 400;
That simply does the repeating and numbering of the variables for you.
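Since the macro body did not survive in this post, here is a minimal sketch of what such a loop could look like. It is not the original code. The names HAVE, WANT, VARLIST, FCOUNT and DIFF_AVG are made up for illustration, the data set is assumed to be in order by ID with the April row before the May row, and AVG is assumed to mean the average of the two months.

```sas
/* Sketch only: HAVE, WANT, VARLIST, FCOUNT and DIFF_AVG are hypothetical. */
%let varlist = NUM_CREDIT_CARDS BALANCE MONTHS_RESIDENCE;
%let FCount  = 3;   /* would be 400 with the full variable list */

%macro diff_avg;
  %local i v;
  %do i = 1 %to &FCount;
    /* Pick the i-th variable name out of the blank-separated list. */
    %let v = %scan(&varlist, &i, %str( ));
    retain temp&i;
    if first.ID then temp&i = &v;        /* remember the April value */
    if last.ID then do;
      diff&i = &v - temp&i;              /* May minus April */
      avg&i  = mean(&v, temp&i);         /* MEAN ignores missing values */
    end;
  %end;
%mend diff_avg;

data want;
  set have;
  by ID;
  %diff_avg
  if last.ID;        /* keep only the May row, where TARGET is populated */
  drop temp:;
run;
```

The macro only generates DATA step statements, so it has to be called inside a DATA step. An ID with a single row would get a difference of zero, which may or may not be what you want.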
On Wed, 25 Jan 2006 11:21:07 -0500, Pavlo Row <pavlo@INORBIT.COM> wrote:
>Thank you so much Jiann-Shiun for your code to my question. You're very kind.
> The following will give you what you want and the output is at the
>end. A macro variable FCount is defined as the number of fields that
>you intend to calculate DIFF and AVG. In the example given, it is 3.
>>>> Pavlo Row <pavlo@INORBIT.COM> 1/20/2006 2:38:43 PM >>>
>Please copy and paste the following example data set. In this example
>data set I have a MONTH field (MONTH takes on only APRIL & MAY 2005
>values), a unique identifier, ID, TARGET variable (takes on NO/YES or
>0/1 representing customer response to MAY 2005 campaign). Next I show
>only three fields out of very many fields like 400 fields or so. I chose
>three representative fields to show here: # of credit cards customer had
>in APRIL and MAY 2005, customer bank balance in APRIL and MAY 2005, and
>the # of months the customer had lived in his house in APRIL & MAY 2005.
>Again, in the real data set I have many such fields.
>input MONTH $ ID TARGET NUM_CREDIT_CARDS BALANCE MONTHS_RESIDENCE;
>APR05 1 . 1 57103 24
>MAY05 1 0 2 1516 25
>APR05 2 . 2 2468 92
>MAY05 2 0 1 309 93
>APR05 3 . 1 7672 74
>MAY05 3 0 0 0 .
>APR05 4 . 1 53073 127
>MAY05 4 1 1 6379 128
>APR05 5 . 4 24894 36
>MAY05 5 0 0 9859 37
>APR05 6 . 0 12 164
>MAY05 6 0 1 1699 165
>APR05 7 . 2 30248 24
>MAY05 7 1 3 1625 25
>APR05 8 . 2 45516 345
>MAY05 8 1 1 591 346
>APR05 9 . 2 5391 216
>MAY05 9 0 3 6262 217
>APR05 10 . 2 4857 252
>MAY05 10 1 0 14457 .
>I use the above example data set to compute additional fields which I
>call DIFF1 & AVG1, DIFF2 & AVG2, DIFF3 & AVG3, ..., DIFF400 & AVG400
>(400 since I have 400 such numeric fields)
>If you run the following code it will become clear how I come up with
>these fields. For DIFF1, for example, I just take the difference between
>MAY and APRIL and I assign the answer to MAY because that's where I have
>the TARGET variable populated. Later on, once I compute DIFF1 & AVG1,
>etc., I will delete APRIL and I will be left with only MAY. Then I can
>build a logistic model based on DIFF1 & AVG1, etc.
> if last.ID then temp1=NUM_CREDIT_CARDS;
> if last.ID then temp2=BALANCE;
> if last.ID then temp3=MONTHS_RESIDENCE;
>This is my question: How can I automate the above step? I mean, can you
>imagine sitting here typing the above lines 400 times? That would be 400
>400 times for
>400 times for
>Again, the answer fields
>and so on will be used in logistic modeling.