**Date:** Mon, 7 Jul 2003 18:42:27 -0400
**Reply-To:** Jay Weedon <jweedon@EARTHLINK.NET>
**Sender:** "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
**From:** Jay Weedon <jweedon@EARTHLINK.NET>
**Organization:** http://extra.newsguy.com
**Subject:** Re: General (Newbie) Question About Multinomial Logistic Regression
**Content-Type:** text/plain; charset=us-ascii

On Mon, 7 Jul 2003 20:35:42 +0100, "Sid" <not@home.com> wrote:

>Hi,
>
>I'm trying to get my head round MLR and I have a couple of simple questions
>(I think).
>
>Correct me if I'm wrong but my understanding is we have a set of 'variables
>of influence' and a set of discrete 'Choices', which is fed into the
>software 'black box' which 'optimises' using the logistic function. Say we
>are choosing a brand of car, we can treat each customer individually and
>just have one long input file - with one row for each choice (and
>non-choice) made. We can then estimate the impact of each variable on a
>choice and predict which car a customer will choose given we know the values
>of the influence variables.

Does each customer make only one choice? If not, then you'll need to
consider a "nested" model in which choices made by a particular
customer are mutually correlated.

>But what happens if we wish to predict the outcome within sets of distinct,
>closed groups?
>
>Say, for example, we are looking at a horse race and wish to select the
>winner. When we are training the MLR with the results of previous races, do
>we just treat each horse as we treat our car customer and ignore the other
>horses in the race (just a linear list of all runners in all races), or is
>there some way we need to adjust the input data and/or processing to take
>account of the other horses in each individual race? If so, how do we handle
>the fact that the group sizes (number of runners in each race) can vary?

With this methodology, where each horse would be a category of the
dependent variable, results of previous races aren't going to be
very helpful unless the same set of horses runs in every race.

If you're trying to predict whether a *particular* horse wins or not,
you have conceptually a simple two-category model, with "other
runners" as a set of possible predictors. If you're trying to predict
*which* horse (of a given set of entered horses) wins, then that'd be
more like an MLR situation.
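
To make the "which horse wins" case concrete, here is a minimal Python
sketch. It assumes each runner gets a linear score from its own
predictors and that the scores are normalised only over the horses
entered in the same race. That is one common way to set the problem up
(often called a conditional logit), not necessarily what any particular
SAS procedure does. The function name, the data layout and the weights
are all hypothetical.

```python
import math
from collections import defaultdict


def race_win_probabilities(runners, weights):
    """Return {race_id: {horse: P(win)}} for a list of runners.

    runners: iterable of (race_id, horse, features) tuples (assumed layout)
    weights: one coefficient per feature (assumed already estimated)
    """
    # Group the linear scores by race, so each race is its own choice set.
    by_race = defaultdict(list)
    for race_id, horse, features in runners:
        score = sum(w * x for w, x in zip(weights, features))
        by_race[race_id].append((horse, score))

    probs = {}
    for race_id, entries in by_race.items():
        # Subtract the largest score before exponentiating, for stability.
        top = max(score for _, score in entries)
        exps = [(horse, math.exp(score - top)) for horse, score in entries]
        total = sum(e for _, e in exps)
        # Normalise within the race only, whatever the number of runners.
        probs[race_id] = {horse: e / total for horse, e in exps}
    return probs
```

Because the normalisation is done race by race, fields of different
sizes need no special handling: the probabilities in each race sum to 1
over whichever horses were entered.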

>There seem to be so many flavours of MLR (ordered, nested etc) the more I
>read the more I'm confused.

"Ordered" means that the categories of the dependent variable have
some natural ordering. In the situation of car models or winning
horses, you don't have ordering.

"Nested" means that the observations are not all mutually independent.
If the same person chooses a car on multiple occasions, nesting is an
appropriate concept.
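
A quick way to see whether nesting is an issue in a given data set is
to count observations per customer. The sketch below is only an
illustration in Python; the (customer, choice) row layout and the
function name are assumptions, not anything from the original question.

```python
from collections import Counter


def repeated_choosers(rows):
    """Return {customer: count} for customers with more than one choice.

    rows: iterable of (customer_id, chosen_item) pairs (assumed layout)
    """
    counts = Counter(customer for customer, _ in rows)
    # Customers appearing more than once contribute correlated observations.
    return {customer: n for customer, n in counts.items() if n > 1}
```

If this comes back empty, every customer made a single choice and the
observations can reasonably be treated as independent; otherwise the
repeated choices by the same person are the ones a nested model is
meant to account for.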

JW