LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (January 2006, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Wed, 25 Jan 2006 11:58:10 -0500
Reply-To:     Pavlo Row <pavlo@INORBIT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Pavlo Row <pavlo@INORBIT.COM>
Subject:      Re: SAS Data Step as prelude to logistic modeling
Content-Type: text/plain; charset="iso-8859-1"

Hi and thanks for the idea Gerhard. Another responder to this question, Jiann-Shiun (below) showed me this way with a macro like you say here. One idea that has also crossed my mind is if you or Jiann-Shiun or someone else could actually advance the code a step higher in terms of automation: Instead of listing all fields in an array (followed by a loop, of course), would it be possible to grab the field name from the dataset then do the computations DIFF1 & AVG1. When done with this one, then grab the next field name from the data set and compute DIFF2 & AVG2, and so on. This would save additional steps of actually typing (well not actually typing but just a little work needs to be done) the 400 or 1,000 or how ever many field names in the array. P.

>>>>>> From: "Gerhard Hellriegel" <ghellrieg@T-ONLINE.DE> Save Address Block Sender This Is Spam To: SAS-L@LISTSERV.UGA.EDU CC: Subject: Re: SAS Data Step as prelude to logistic modeling Date: Wed, 25 Jan 2006 11:35:55 -0500

Show Full Headers Back To [INBOX]

That seems to be a thing for a macro. I assume, you have already a dataet with the numeric variables you want to use in your calculation. I assume also, you know about the exact number (you can also make that dynamic, but it's easier if you don't need to...)

So the problem is, how I understand that, to repeat things like

DIFF1=temp1-lag(NUM_CREDIT_CARDS); AVG1=(temp1+lag(NUM_CREDIT_CARDS))/2 and others 400 times.

Make a macro:

%macro x;

data ... set .... if.... ...

%do i=1 %to 400; DIFF&i=temp&i-lag(NUM_CREDIT_CARDS); AVG&i=(temp&i+lag(NUM_CREDIT_CARDS))/2; %end;

... run; %mend; %x;

That simply does the repeating and numbering of the variables for you. >>>>>>

On Wed, 25 Jan 2006 11:21:07 -0500, Pavlo Row <pavlo@INORBIT.COM> wrote:

> Thank you so much Jiann-Shiun for your code to my question. You'rwe > very kind. > Pavlo > >>>> > Pavlo: > > The following will give you what you want and the output is at the > end. A macro variable FCount is defined as the number of fields that > you intend to calculate DIFF and AVG. In the example given, it is 3. > . > . > . >>>> > >>>> Pavlo Row <pavlo@INORBIT.COM> 1/20/2006 2:38:43 PM >>> > Hello All, > > Please copy and paste the following example data set. In this example > data set I have a MONTH field (MONTH takes on only APRIL & MAY 2005 > values), a unique identifier, ID, TARGET variable (takes on NO/YES or > 0/1 representing customer response to MAY 2005 campaign). Next I show > only three fields out of very many fields like 400 fields or so. I chose > three representative fields to show here: # of credit cards customer had > in APRIL and MAY 2005, customer bank balance in APRIL and MAY 2005, and > the # of months the customer had lived in his house in APRIL & MAY 2005. > Again, in the real data set I have many such fields. > > data foo; > input MONTH $ ID TARGET NUM_CREDIT_CARDS BALANCE MONTHS_RESIDENCE; > cards; > APR05 1 . 1 57103 24 > MAY05 1 0 2 1516 25 > APR05 2 . 2 2468 92 > MAY05 2 0 1 309 93 > APR05 3 . 1 7672 74 > MAY05 3 0 0 0 . > APR05 4 . 1 53073 127 > MAY05 4 1 1 6379 128 > APR05 5 . 4 24894 36 > MAY05 5 0 0 9859 37 > APR05 6 . 0 12 164 > MAY05 6 0 1 1699 165 > APR05 7 . 2 30248 24 > MAY05 7 1 3 1625 25 > APR05 8 . 2 45516 345 > MAY05 8 1 1 591 346 > APR05 9 . 2 5391 216 > MAY05 9 0 3 6262 217 > APR05 10 . 2 4857 252 > MAY05 10 1 0 14457 . > ; > run; > > > I use the above example data set to compute additional fields which I > call DIFF1 & AVG1, DIFF2 & AVG2, DIFF3 & AVG3, ..., DIFF400 & AVG400 > (400 since I have 400 such numeric fields) > > If you run the following code it will become clear how I come up with > these fields. For DIFF1, for example, I just take the difference between > MAY and APRIL and I assign the answer to MAY becasue that's where I have > the TARGET variable populated. Later on, once I compute DIFF1 & AVG1, > etc., I will delete APRIL and I will be left with only MAY. Then I can > build a logistic model based on DIFF1 & AVG1, etc. > > data TEST(drop=TEMP1-TEMP3); > set foo; > by ID; > if last.ID then temp1=NUM_CREDIT_CARDS; > > DIFF1=temp1-lag(NUM_CREDIT_CARDS); > > AVG1=(temp1+lag(NUM_CREDIT_CARDS))/2; > if last.ID then temp2=BALANCE; > DIFF2=temp2-lag(BALANCE); > AVG2=(temp2+lag(BALANCE))/2; > if last.ID then temp3=MONTHS_RESIDENCE; > > DIFF3=temp3-lag(MONTHS_RESIDENCE); > > AVG3=(temp3+lag(MONTHS_RESIDENCE))/2; > run; > > > This is my question: How can I automate the above step? I mean, can you > imagine sitting here typing the above lines 400 times. That would be 400 > times for > DIFF1=temp1-lag(NUM_CREDIT_CARDS); > AVG1=(temp1+lag(NUM_CREDIT_CARDS))/2; > 400 times for > DIFF2=temp2-lag(BALANCE); > AVG2=(temp2+lag(BALANCE))/2; > . > . > . > > 400 times for > DIFF400=temp400-lag(BALANCE); > AVG400=(temp400+lag(BALANCE))/2; > > > Again, the answer fields > DIFF1=temp1-lag(BALANCE); > AVG1=(temp1+lag(BALANCE))/2; > and so on will be used in logistic modeling. > > Thanks. > > P. > > > -- > ___________________________________________________ > Play 100s of games for FREE! http://games.mail.com/

As AttachmentInline Text MyFolder Previous | Next | Back To [INBOX]

--------------------------------------------------------------------------------

© Copyright 2006 mail.com Corp. All rights reserved.

-- ___________________________________________________ Play 100s of games for FREE! http://games.mail.com/


Back to: Top of message | Previous page | Main SAS-L page