LISTSERV at the University of Georgia
Date:         Mon, 14 Feb 2005 11:51:18 -0500
Reply-To:     Susie Li <Susie.Li@TVGUIDE.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Susie Li <Susie.Li@TVGUIDE.COM>
Subject:      Re: Influence of the number of categories on chi-square score
Content-Type: text/plain; charset="iso-8859-1"

To even surmise which functional form of X to put into the initial linear logistic model (x, x**2, x**3, sqrt(x), 1/x, etc.), I rely heavily on SAS scatter plots (no need for SAS Graph). Right now, I break the continuous X into decile groups, and then plot the log_odds of response across the 10 X_decile groups. That's very efficient for discovering the relationship.
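A minimal sketch of the decile approach described above, written in Python rather than SAS and run on simulated data. The function name `log_odds_by_group`, the 0.5 continuity correction, and the simulated X and Y are all assumptions made for illustration; none of them come from the original post.

```python
import math
import random

def log_odds_by_group(x, y, n_groups=10):
    """Split x into n_groups equal-count bins (by rank) and return a list of
    (mean x, empirical log-odds of y == 1) pairs, one per bin."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    out = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        events = sum(yi for _, yi in chunk)
        nonevents = len(chunk) - events
        # Assumption: add 0.5 to each count so an all-0 or all-1 bin
        # still yields a finite log-odds value.
        logit = math.log((events + 0.5) / (nonevents + 0.5))
        mean_x = sum(xi for xi, _ in chunk) / len(chunk)
        out.append((mean_x, logit))
    return out

# Simulated (hypothetical) data: the true log-odds are linear in x.
random.seed(1)
x = [random.uniform(0, 10) for _ in range(5000)]
y = [1 if random.random() < 1 / (1 + math.exp(-(-2 + 0.4 * xi))) else 0
     for xi in x]

# One row per decile: mean of x in the bin, then the empirical log-odds.
for mean_x, logit in log_odds_by_group(x, y):
    print(f"{mean_x:6.2f}  {logit:6.2f}")
```

Plotting the second column against the first gives the log_odds-by-decile picture: a roughly straight line suggests a linear term is enough, while curvature hints at x**2, sqrt(x), and so on. Passing `n_groups=20` gives the 20-group version.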

The frequency table of the X_decile groups by Y_renewal would give me the chi-square test for the X-Y association (hence my question: what's the impact on my chi-square if I break X into 20 groups instead of 10 groups?).
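To make the question concrete, here is a small sketch of the Pearson chi-square statistic for a groups-by-renewal table, in Python rather than SAS. The counts are invented for illustration (4 groups, then the same counts merged into 2 groups), and `chi_square` is a hypothetical helper name. The point it illustrates is that a k-group by 2-outcome table has k - 1 degrees of freedom, so statistics from 10-group and 20-group tables are judged against different reference distributions and cannot be compared as raw numbers.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom
    for an r x c table of observed counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: each row is one X group, columns are
# (renewed, not renewed).
fine = [[30, 70], [40, 60], [50, 50], [60, 40]]
# The same customers with adjacent groups merged pairwise.
coarse = [[70, 130], [110, 90]]

stat_fine, df_fine = chi_square(fine)
stat_coarse, df_coarse = chi_square(coarse)
print(f"4 groups: chi-square = {stat_fine:.2f}, df = {df_fine}")
print(f"2 groups: chi-square = {stat_coarse:.2f}, df = {df_coarse}")
```

In this made-up example the finer table gives the larger statistic, but it also carries more degrees of freedom. With no association at all, a chi-square statistic averages about its degrees of freedom, so it is the p-value (or the statistic relative to its df) that should be compared across groupings, not the raw score.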

Susie Li TV Guide

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Peter Flom
Sent: Monday, February 14, 2005 11:15 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Influence of the number of categories on chi-square score

In terms of finding a model, why not logistic for both independent variables? You can add quadratic and cubic terms and see if they are statistically significant (although I would rather base my decision on substantive or graphic results - stat sig depends too much on sample size). Is there a reason to suspect that there will be quad or cubic relationships? Do the graphs (see below) reveal such a thing? See below for more (better???) ideas.

For finding the shape of the relationship, looking at plots is always a good idea. I don't know how to do this best in SAS, as I have no access to SAS GRAPH. I do this sort of thing in R. If you have SAS GRAPH, doubtless someone here will be able to advise.

You could plot a smoothed version of the DV against each IV.

One thing I also like to do is plot the predicted values for various models against each other - if the differences are substantively large, then the more complex model may be worthwhile; if not, go with the simpler model.

As a general strategy, the approach based on AIC seems to have much to recommend it. For details, see Burnham and Anderson, Model Selection and Multimodel Inference.

Briefly: Come up with some (5 or 10 or so) reasonable models; each should be sensible on SUBSTANTIVE grounds.

Go with the one with the lowest AIC.

(that's a 3 line description of a nearly 500 page book, so take it with a ton of salt)
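As a rough illustration of the "lowest AIC" rule, here is a short Python sketch using the standard definition AIC = 2k - 2 log L, where k is the number of estimated parameters and log L is the maximized log-likelihood. The three candidate models and their log-likelihoods are invented placeholders, not results from any real fit; in practice the values would come from the fitted logistic models.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical candidates: name -> (maximized log-likelihood, number of
# estimated parameters including the intercept). Placeholder values only.
candidates = {
    "linear": (-1250.4, 2),
    "quadratic": (-1243.1, 3),
    "cubic": (-1242.9, 4),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

# List the candidates from lowest (preferred) to highest AIC.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:10s} AIC = {score:.1f}")
print("lowest AIC:", best)
```

With these placeholder numbers the cubic term improves the log-likelihood only slightly, so the penalty of one extra parameter outweighs the gain and the quadratic model scores lowest.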

Another very good book on regression generally is Harrell, Regression Modeling Strategies.



Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)

>>> Susie Li <Susie.Li@TVGUIDE.COM> 2/14/2005 11:00:37 AM >>>
A typical example of my logistic modeling:

My y dependent variable/binary - customer renewal (yes=renewed, no=not renewed)

My X independent variables/continuous - (1) current pricing structure ($0.25, $0.34,...) (2) the tenure of the customer (how long the customer has been with us, e.g., 1 year, 2 years,...)

I want to know 2 things: (1) the existence of the "association" between X and Y (2) if an association exists, what is the functional form of the association (linear, quadratic or cubic).

I've been using the chi-square test for (1), and a plot of log_odds versus X for (2).

Susie Li
TV Guide
1211 Avenue of the Americas
New York, NY 10036
Tel 212.852.7453
