Date: Thu, 15 Jun 2006 17:26:14 0400
Reply-To: Lou <charl_bean@YAHOO.CO.UK>
Sender: "SPSSX(r) Discussion" <SPSSXL@LISTSERV.UGA.EDU>
From: Lou <charl_bean@YAHOO.CO.UK>
Subject: Re: Logistic regression help
Content-Type: text/plain; charset=ISO-8859-1
Hi Lucinda,
Thanks very much for your response. You have certainly helped me to think
more clearly about the issues surrounding this problem, and I'll be
re-reading your reply to help me fathom out what's going on with this data.
Thanks again,
Lou
On Thu, 15 Jun 2006 13:06:37 0700, LUCINDA M TEAR <lucindatear@msn.com>
wrote:
>Hi, Lou. I agree with all that Keith has said. I might add that the
>non-significant interaction using categorical variables could be due to
>either or both of two things: by lumping together the Y responses over a
>range of X inputs, you created a categorical variable whose variance is
>large enough that no interaction can be detected; and/or the endpoints of
>the bin categories you are using fall at points in the data that obscure
>the interaction you found using the continuous data.
>
>In some cases, it may actually serve you to have a model without an
>interaction effect. It is possible, however, that the confidence intervals
>around such a model will be larger than they would be from a model with an
>interaction. On the other hand, using the continuous data apparently
>allowed you to detect some underlying "process" (the interaction you found).
>If you are trying to understand what creates the patterns you see in your
>data, both models give you information about the resolution at which certain
>processes are revealed or obscured. Apparently lumping the way you have
>obscures the interaction. You might want to try binning your x variables
>differently than the previous report did, just to see if there is a way to
>categorize the x variables such that an interaction is detected. You could
>probably use plots from your continuous model to give you an idea about
>where appropriate bin thresholds might lie. I tend to be one who likes to
>use models as a way of revealing the "scale" at which the data should be
>approached in order to answer the question at hand. A different question
>about the same data could require a different type of model. Models also
>help you find out if the scale you are looking at is missing information
>about some underlying effects that could affect the application of the
>results.
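
As a rough illustration of the re-binning idea above, here is a minimal Python
sketch. Everything in it is an assumption made for illustration: the simulated
data, the coefficients, the cut points, and the helper names (bin_index,
cell_rates, simulate) are hypothetical and are not taken from Lou's data or
from the previous report. It tabulates the observed proportion of test-takers
in each age-group by deprivation-group cell under two different sets of
thresholds, so the two categorizations can be compared side by side.

```python
import math
import random
from bisect import bisect_right


def bin_index(value, cut_points):
    """Return the 0-based bin for value, given ascending cut points."""
    return bisect_right(cut_points, value)


def cell_rates(rows, age_cuts, dep_cuts):
    """Observed proportion of test-takers per (age bin, deprivation bin) cell."""
    counts = {}
    for age, dep, took_test in rows:
        key = (bin_index(age, age_cuts), bin_index(dep, dep_cuts))
        n, k = counts.get(key, (0, 0))
        counts[key] = (n + 1, k + took_test)
    return {key: k / n for key, (n, k) in counts.items()}


def simulate(n, seed=0):
    """Hypothetical data: uptake depends on age, deprivation and their product."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(20, 80)
        dep = rng.uniform(0, 60)
        # Made-up coefficients, including an age x deprivation interaction.
        logit = (-1.0 + 0.03 * (age - 50) - 0.02 * (dep - 30)
                 - 0.002 * (age - 50) * (dep - 30))
        p = 1 / (1 + math.exp(-logit))
        rows.append((age, dep, 1 if rng.random() < p else 0))
    return rows


rows = simulate(5000)
# Two candidate binnings of the same data: 4 age groups x 5 deprivation groups.
scheme_a = cell_rates(rows, age_cuts=[35, 50, 65], dep_cuts=[12, 24, 36, 48])
scheme_b = cell_rates(rows, age_cuts=[30, 40, 50], dep_cuts=[5, 10, 20, 40])
for key in sorted(scheme_a):
    print(key, round(scheme_a[key], 2))
```

Comparing how the cell proportions change across deprivation groups within
each age group, under each scheme, gives an informal feel for whether a given
set of thresholds preserves or flattens the pattern seen in the continuous
model. It is not a substitute for refitting the model with the new categories.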
>
>Just some thoughts.
>
>Lucinda
>
>
>
>----- Original Message -----
>From: "Statisticsdoc" <statisticsdoc@cox.net>
>Newsgroups: bit.listserv.spssxl
>To: <SPSSXL@LISTSERV.UGA.EDU>
>Sent: Thursday, June 15, 2006 12:36 PM
>Subject: Re: Logistic regression help
>
>
>> Keith Starborn
>> www.statisticsdoc.com
>>
>> Lou,
>>
>> I bet most of the people on this listserv have faced a similar dilemma
>> at some time in their careers. Which one is best from the point of view
>> of using the data to answer your questions and generate information that
>> you can act on? Probably, keeping the variables continuous is better from
>> that point of view.
>>
>> As to the politics of the situation, in your position, I would run the
>> analyses both ways (continuous and categorized) in order to show: a.) that
>> I did the analysis the way I was told to; and b.) that I found something
>> else that works better. You know the situation best of all.
>>
>> HTH,
>>
>> KS
>>
>>  Lou <charl_bean@YAHOO.CO.UK> wrote:
>> > Dear Keith,
>> >
>> > Thanks for your advice, which was very helpful. I feel a bit stuck as to
>> > what to do about this really. My boss (who knows roughly zero about
>> > statistics) is insisting that I categorise these variables since I am
>> > comparing results with a previous report which did the same. Does it
>> > take meaning away from the analysis if I discuss results obtained using
>> > the original continuous variables and then discuss results separately
>> > using the categorised versions (i.e. generate two separate models)? I'm
>> > not sure if this defies logic too much, or how I would justify it in the
>> > final report. Although I have a lot to learn in this field, the report
>> > that this work is being based on has a lot of dubious findings with
>> > regard to the stats, so I'm very keen to ensure that the one I produce
>> > is accurate!!
>> >
>> > Many thanks,
>> >
>> > Lou
>> >
>> > On Thu, 15 Jun 2006 11:36:45 0400, Statisticsdoc
>> > <statisticsdoc@cox.net>
>> > wrote:
>> >
>> > >Keith Starborn
>> > >www.statisticsdoc.com
>> > >
>> > >Dear Lou,
>> > >
>> > >Categorizing continuous variables into categorical variables can result
>> > >in a considerable loss of statistical power because the test for the
>> > >categorized version of the variable uses more degrees of freedom than
>> > >the test for the continuous variable. In addition, categorizing a
>> > >continuous variable can result in a loss of predictive information.
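
To put a number on the degrees-of-freedom point above, here is a small Python
sketch. It assumes each continuous predictor enters the model as a single
linear term and each categorical predictor with k groups is dummy-coded with
k - 1 terms; the function names are hypothetical. The group counts (4 for age,
5 for deprivation) are the ones given in the original question.

```python
def main_effect_df(levels=None):
    """df for one predictor: 1 if continuous (levels is None), else levels - 1."""
    return 1 if levels is None else levels - 1


def interaction_df(levels_a=None, levels_b=None):
    """df for a two-way interaction: the product of the two main-effect dfs."""
    return main_effect_df(levels_a) * main_effect_df(levels_b)


continuous = interaction_df()        # age x deprivation, both continuous
categorical = interaction_df(4, 5)   # 4 age groups x 5 deprivation groups
print(continuous, categorical)       # prints: 1 12
```

Under these assumptions the continuous interaction is tested on 1 degree of
freedom, while the categorized interaction is tested on 12, so the same
underlying effect is spread over many more parameters.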
>> > >
>> > >HTH,
>> > >
>> > >KS
>> > >
>> > > Lou <charl_bean@YAHOO.CO.UK> wrote:
>> > >> Dear list
>> > >>
>> > >> I am trying to carry out a logistic regression analysis and have a
>> > >> quick question with regard to the best way to input my independent
>> > >> variables. I have three input variables: ethnicity (5 groups), age and
>> > >> deprivation score. Although age and deprivation score are continuous
>> > >> variables, I have also been asked to split them into groups (4 for age
>> > >> and 5 for deprivation) which are predetermined by previous work on this
>> > >> subject matter. The dependent variable is simply whether or not a
>> > >> person took a particular test.
>> > >>
>> > >> I have tried generating models both with the age and deprivation
>> > >> variables as they are and also with the new categorical age and
>> > >> deprivation variables. However, when looking at interaction terms, I
>> > >> find that the interaction between age and deprivation is significant
>> > >> when they are input as the continuous variables but not significant
>> > >> when I use the categorical versions. Why would this happen?
>> > >> Furthermore, which is the best way to go? I have read information on
>> > >> logistic regression until my head hurts, but still don't feel
>> > >> completely satisfied as to how I should determine the best model
>> > >> possible.
>> > >>
>> > >> Any advice would be appreciated please!
>> > >>
>> > >> Thanks
>> > >>
>> > >> Lou
>> > >
>> > >
>> > >For personalized and experienced consulting in statistics and research
>> > >design, visit www.statisticsdoc.com
>>
>>
