Date: Thu, 13 May 2004 12:29:50 -0500
Reply-To: "Copeland, Laurel" <Laurel.Copeland@MED.VA.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Copeland, Laurel" <Laurel.Copeland@MED.VA.GOV>
Subject: Re: Dependent variable is ratio of continuous values
Some issues regarding analysis of ratios are addressed in the attached PDF
by Richard Goldstein, which I copied from a Univ Virginia Health System site
no longer in existence.
-Laurel
-----Original Message-----
From: Steve Albert [mailto:salbert@AOL.COM]
Sent: Thursday, May 13, 2004 1:05 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Dependent variable is ratio of continuous values
Talbot,
What is it you're trying to accomplish with this model? What do you want it
to tell you, and what decisions will be made from the results? Are you
looking at this by sales agent, by office, by product line, by month, by
region, by several of those, etc.?
The appropriate analysis depends on your data and what you're trying to do.
Blindly applying a technique without understanding the data, the substantive
issues, and the technique is asking for misleading results.
Your target variable does not, by the way, "take on values between 0 and 1"
-- not unless it's impossible for actual sales to exceed the target, since
the ratio would then exceed 1. Logistic would not be an option at all; the
data simply don't fit the model.
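To make the range point concrete, here is a minimal Python sketch. The numbers are made up for illustration and are not from Talbot's data. It computes actual/target ratios and flags any that fall outside the unit interval a logistic model assumes.

```python
import numpy as np

# Hypothetical actual sales and sales targets (illustrative values only).
actual = np.array([80.0, 120.0, 0.0, 100.0])
target = np.array([100.0, 100.0, 50.0, 100.0])

ratio = actual / target

# Flag observations outside the [0, 1] interval assumed by a logistic model.
outside_unit_interval = (ratio < 0) | (ratio > 1)

print(ratio)
print(outside_unit_interval)
```

Any agent who beats the target produces a ratio above 1, so a check like this is worth running before choosing a model.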
If you can, I'd suggest you consult with an experienced researcher who
understands data, analysis, and business issues. Blindly applying textbook
technique is not the way to go. Understanding the underlying questions, the
available data, how it sheds light on the questions, and what techniques can
be usefully applied, is. If the questions are important enough to be worth
getting the right answers, then get help from someone who can help you do it
right.
Steve Albert
On Tue, 11 May 2004 16:35:37 -0400, Talbot Michael Katz <topkatz@MSN.COM>
wrote:
>Hi.
>
>I have a situation where the target variable I want to model takes on
>values between 0 and 1, endpoints inclusive, because it is a ratio of
>two continuous quantities (e.g., actual sales / sales target)...
>
>Is there a "preferred" method for modeling ratios of continuous
>quantities, and if so, is it available in SAS?
>
>
>Someone suggested the use of the PROC LOGISTIC events/trials syntax,
>but the SAS documentation for that really seems to stress binary
>outcome trials...
>
>Is it legitimate to use PROC LOGISTIC events/trials for continuous
>numerator and denominator?
>
>
>An econometrics textbook ("Econometric Analysis" by W.H. Greene, 5th
>edition, Prentice-Hall) suggests Minimum Chi-Squared Estimation (or
>MCSE -- don't tell Microsoft!) for proportions; it looks like that
>discussion was motivated by proportions of binary outcomes, but I think
>the equations still work in the case of continuous numerator and
>denominator. However, a search of support.sas.com didn't turn up any
>procedures that support the MCSE methodology. One weakness of MCSE is
>that the estimation only works when the proportion does not take on the
>extreme values of 0 or 1. In such cases, it seems that the suggested
>work-around is to add or subtract a small constant...
>
>Is there a SAS procedure that does minimum chi-squared estimation for
>proportions?
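For reference, a rough Python sketch of one common form of this estimator for grouped binary data: the minimum logit chi-squared approach, which is weighted least squares of logit(p) on the predictors with weights n*p*(1-p). The function name `min_logit_chisq` and the example numbers are hypothetical, and it is an assumption that this form matches the textbook's presentation. Whether it carries over to a continuous numerator and denominator remains the open question raised above.

```python
import numpy as np

def min_logit_chisq(x, events, trials):
    """Weighted least squares of logit(p) on x, with weights n * p * (1 - p).

    Assumes grouped binary data: `events` successes out of `trials` per group.
    Returns [intercept, slope].
    """
    x = np.asarray(x, dtype=float)
    n = np.asarray(trials, dtype=float)
    p = np.asarray(events, dtype=float) / n
    if np.any((p <= 0) | (p >= 1)):
        # The logit is not finite at 0 or 1, which is the weakness noted above.
        raise ValueError("proportions of exactly 0 or 1 have no finite logit")
    z = np.log(p / (1 - p))          # observed logits
    w = n * p * (1 - p)              # inverse of the approximate logit variance
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    # Scale rows by sqrt(weight), then solve ordinary least squares.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
    return coef

# Hypothetical grouped data: four groups with one predictor.
x = [0, 1, 2, 3]
trials = [50, 40, 60, 30]
events = [10, 16, 36, 24]
intercept, slope = min_logit_chisq(x, events, trials)
print(intercept, slope)
```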
>
>
>I could also transform the ratio from the unit interval to the entire
>line with the logit transform, log(y/(1-y)), and then perform a
>standard regression. Again, I'd have to smudge the extreme values
>before performing the transformation...
>
>Is it legitimate to logit transform the ratio after slightly modifying
>the extreme values, and then do OLS?
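The transform-then-OLS idea can be sketched in Python as follows. The helper names `smudged_logit` and `ols_fit`, the constant `eps`, and the data are all hypothetical choices for illustration. This sketch only shows the mechanics and does not answer whether the approach is legitimate.

```python
import numpy as np

def smudged_logit(y, eps=1e-3):
    """Clip y into [eps, 1 - eps], then apply log(y / (1 - y))."""
    y = np.clip(np.asarray(y, dtype=float), eps, 1.0 - eps)
    return np.log(y / (1.0 - y))

def ols_fit(x, z):
    """Ordinary least squares of z on an intercept and one predictor x."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    return coef  # [intercept, slope]

# Hypothetical data: one predictor and a ratio response in [0, 1].
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 0.2, 0.5, 0.8, 1.0])

z = smudged_logit(y)
intercept, slope = ols_fit(x, z)
print(intercept, slope)
```

The fitted coefficients depend on the choice of `eps`, because the clipped endpoints become the most extreme values on the logit scale. Refitting with a few different values of `eps` shows how sensitive the results are.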
>
>
>Thanks, as ever, for your indulgence...
>
>-- TMK --