LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (January 2006, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Wed, 25 Jan 2006 11:36:47 -0500
Reply-To:     Pavlo Row <pavlo@INORBIT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Pavlo Row <pavlo@INORBIT.COM>
Subject:      Re: SAS Data Step as prelude to logistic modeling (Conditional
              Logistic Regression ?)
Content-Type: text/plain; charset="iso-8859-1"

Thank you for your comments and the attached sas code Zack. But, just to say, I haven't received any previous comment from you pertaining to this post. I was out for a few days, so I don't know what was going on with SAS-L list. This is the first time I get a note for this post from you. Maybe there was another person with a similar post or maybe you've answered to my previous posts very related to this post. Also, my real data set is as shown below (foo) but it has like 1 million records in it and like 400 numeric fields and almost all fields have some data missing. What I posted below is anly 20 obs as for example only. But, what you say below about using PROC PHREG to perform conditional logistic regression on my data sounds very educational to me. I will study your code and I will repost simce I've never used PROC PHREG before in my life. Again, thank you very much for your input. It is so helpful in terms of enhancing my stat. knowledge. Pavlo

>>>>>> From: "Zack, Matthew M." <mmz1@cdc.gov> To: "Pavlo Row" <pavlo@INORBIT.COM> CC: Subject: RE: SAS Data Step as prelude to logistic modeling Date: Sun, 22 Jan 2006 17:03:57 -0500

As I e-mailed you previously, you can use PROC PHREG to perform conditional logistic regression on your data (cf., attachments). You do not have to create difference variables for this to work.

Because your data has so few observations, and because some of these had missing values, PROC PHREG had difficulty converging if all three variables were specified as independent variables; however, it would converge if only one of these variables were specified as an independent variable. Adding more observations would permit convergence with all three independent variables.

Matthew Zack

-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Pavlo Row Sent: Friday, January 20, 2006 3:39 PM To: SAS-L@LISTSERV.UGA.EDU Subject: SAS Data Step as prelude to logistic modeling

Hello All,

Please copy and paste the following example data set. In this example data set I have a MONTH field (MONTH takes on only APRIL & MAY 2005 values), a unique identifier, ID, TARGET variable (takes on NO/YES or 0/1 representing customer response to MAY 2005 campaign). Next I show only three fields out of very many fields like 400 fields or so. I chose three representative fields to show here: # of credit cards customer had in APRIL and MAY 2005, customer bank balance in APRIL and MAY 2005, and the # of months the customer had lived in his house in APRIL & MAY 2005. Again, in the real data set I have many such fields.

data foo; input MONTH $ ID TARGET NUM_CREDIT_CARDS BALANCE MONTHS_RESIDENCE; cards; APR05 1 . 1 57103 24 MAY05 1 0 2 1516 25 APR05 2 . 2 2468 92 MAY05 2 0 1 309 93 APR05 3 . 1 7672 74 MAY05 3 0 0 0 . APR05 4 . 1 53073 127 MAY05 4 1 1 6379 128 APR05 5 . 4 24894 36 MAY05 5 0 0 9859 37 APR05 6 . 0 12 164 MAY05 6 0 1 1699 165 APR05 7 . 2 30248 24 MAY05 7 1 3 1625 25 APR05 8 . 2 45516 345 MAY05 8 1 1 591 346 APR05 9 . 2 5391 216 MAY05 9 0 3 6262 217 APR05 10 . 2 4857 252 MAY05 10 1 0 14457 . ; run;

I use the above example data set to compute additional fields which I call DIFF1 & AVG1, DIFF2 & AVG2, DIFF3 & AVG3, ..., DIFF400 & AVG400 (400 since I have 400 such numeric fields)

If you run the following code it will become clear how I come up with these fields. For DIFF1, for example, I just take the difference between MAY and APRIL and I assign the answer to MAY becasue that's where I have the TARGET variable populated. Later on, once I compute DIFF1 & AVG1, etc., I will delete APRIL and I will be left with only MAY. Then I can build a logistic model based on DIFF1 & AVG1, etc.

data TEST(drop=TEMP1-TEMP3); set foo; by ID; if last.ID then temp1=NUM_CREDIT_CARDS;

DIFF1=temp1-lag(NUM_CREDIT_CARDS);

AVG1=(temp1+lag(NUM_CREDIT_CARDS))/2; if last.ID then temp2=BALANCE; DIFF2=temp2-lag(BALANCE); AVG2=(temp2+lag(BALANCE))/2; if last.ID then temp3=MONTHS_RESIDENCE;

DIFF3=temp3-lag(MONTHS_RESIDENCE);

AVG3=(temp3+lag(MONTHS_RESIDENCE))/2; run;

This is my question: How can I automate the above step? I mean, can you imagine sitting here typing the above lines 400 times. That would be 400 times for DIFF1=temp1-lag(NUM_CREDIT_CARDS); AVG1=(temp1+lag(NUM_CREDIT_CARDS))/2; 400 times for DIFF2=temp2-lag(BALANCE); AVG2=(temp2+lag(BALANCE))/2; . . .

400 times for DIFF400=temp400-lag(BALANCE); AVG400=(temp400+lag(BALANCE))/2;

Again, the answer fields DIFF1=temp1-lag(BALANCE); AVG1=(temp1+lag(BALANCE))/2; and so on will be used in logistic modeling.

Thanks.

P.

-- ___________________________________________________ Play 100s of games for FREE! http://games.mail.com/


Back to: Top of message | Previous page | Main SAS-L page