Date: Thu, 10 May 2007 15:13:16 -0700
Reply-To: Scott Bucher <bucher_scott@YAHOO.COM>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Scott Bucher <bucher_scott@YAHOO.COM>
Subject: Stats in administrative settings
Content-Type: text/plain; charset=iso-8859-1
Many thanks to everyone for their helpful responses to my recent question regarding "Statistical significance without sampling?". Assuming we take the position that this 'population' is actually a sample, I have a related question, which begins to venture from *whether* to *how*:
What are best practices for applying tests of statistical significance within an administrative rather than an academic or research setting? Any resources that can be recommended for this issue?
To give a concrete example, co-workers simply want to get a sense of whether the following changes in retention are statistically significant or due to random fluctuations: 83.4 (1998), 83.3 (1999), 86.2 (2000), 85.4 (2001), 87.5 (2001), 86.9 (2002), 85.8 (2003), 81.7 (2004). Each cohort's n = 1100.
In this situation I have said that "a 2% or greater change is unlikely to have been due to mere chance, and is due at least in part to a change in one of the underlying variables that influence retention." I have based this on the fact that in this situation a 2% difference will produce a Pearson's Chi-Square Asymp. Sig. (2-sided) value of .337. Is this a sensible approach?
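For anyone who wants to check a figure like this by hand, here is a minimal sketch of the Pearson chi-square test on a 2x2 table (retained / not retained by cohort). It assumes each percentage is an exact proportion of a cohort of 1100 and applies no continuity correction; the function name `retention_chi_square` is made up for illustration, and the result need not match the SPSS output quoted above.

```python
import math

def retention_chi_square(rate_a, rate_b, n_a=1100, n_b=1100):
    """Pearson chi-square (df = 1, no continuity correction) for two retention rates.

    Rates are proportions (0.834 for 83.4%). Returns (chi2, p_value).
    Assumption: the rates are exact proportions of the cohort sizes.
    """
    # Observed 2x2 table: [retained, not retained] for each cohort.
    observed = [
        [rate_a * n_a, (1 - rate_a) * n_a],
        [rate_b * n_b, (1 - rate_b) * n_b],
    ]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [
        observed[0][0] + observed[1][0],
        observed[0][1] + observed[1][1],
    ]

    # Sum (observed - expected)^2 / expected over the four cells.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# A 2-point change, from 83.4% to 85.4%, with n = 1100 per cohort.
chi2, p = retention_chi_square(0.834, 0.854)
print(f"chi2 = {chi2:.2f}, two-sided p = {p:.3f}")
```

The same function can be called on any pair of adjacent rates in the list, e.g. `retention_chi_square(0.858, 0.817)` for the last two figures.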
Any other ideas for dealing with reporting stats in administrative settings, i.e. where (1) the consumers have little or no knowledge of statistics (not that I am an expert myself), (2) there is little time that can be put into producing the report, and (3) consumers are so rushed that they have little time to look at reports in any depth? In other words, I am looking for something workable and simple given the circumstances. However, I am also interested in how to 'do this the right way', assuming these constraints did not exist.
Statisticsdoc <email@example.com> wrote: Richard, as usual, is spot on here. In order to conduct significance tests,
one has to work within an inferential framework, and decide that students in
specific years do not exhaust the population of interest.
From: Richard Ristow [mailto:firstname.lastname@example.org]
Sent: Tuesday, May 08, 2007 4:49 PM
To: SB; SPSSX-L@LISTSERV.UGA.EDU
Cc: Statisticsdoc; rich reeves
Subject: Re: Statistical significance without sampling?
At 01:11 PM 5/7/2007, SB wrote:
>I have recently produced a report of student retention rates at a
>university. I have data on all students, hence no sampling has taken
>place. However, I have been asked if a change in student retention
>rates from one year to the next is statistically significant. To me,
>this question does not make sense, as statistical significance is a
>measurement of the probability that a sample is representative of a
>population; and in this case we have information on all students. Am I
>missing something? Can it be meaningful to test for statistical
>significance in this situation?
If I understand correctly, rich reeves is addressing a different
question: whether the test is commonly done correctly, given that you
accept it can be done at all.
The question you ask is a recurring one and a fairly deep one, so it's
worth an occasional revisit. You may want to look at thread
"Significant difference - Means", on this list Tue, 19 Dec 2006
<09:05:53 +0100> ff., a discussion that went over many issues including
Your data can be regarded from two points of view, both of which are
legitimate but which have different implications. I can only say,
firmly, you should know what point of view you are taking; what you
think it means; and why its implications for inferential analysis are
what they are.
One point of view is, "[we] have data on all students, hence no
sampling has taken place." Here, you're taking your universe of
discourse as the students at your university. Then, there is no
question of comparing using inferential statistics. You have the exact
values (presumably) for this year and last year; and their difference
is, definitively, whatever it is.
But another point of view is to regard each year's experience as a
(multi-dimensional) sample point, in a space of possible experience.
That is, the set of students enrolled each year is a sample, subject to
random fluctuations, from the population that might be considered for
enrollment. Their experiences at the school are a sample, subject to
random fluctuations, from the possible experiences and happenings to
students at the school. And the outcome, retention or not, is
influenced by these factors with random elements. So the outcome
becomes a measure subject to random influence, and a legitimate subject
for inferential statistics.
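
To make this second point of view concrete, here is a minimal sketch (not part of the original message) of how much a retention rate would fluctuate by chance alone if every student in a cohort of 1100 had the same fixed probability of being retained. The value 0.85 is an assumed illustration, chosen only because it is close to the rates quoted in this thread; the function names are hypothetical.

```python
import math
import random

def retention_standard_error(p, n=1100):
    """Binomial standard error of an observed retention rate."""
    return math.sqrt(p * (1 - p) / n)

def simulate_cohorts(p, n=1100, years=8, seed=0):
    """Simulate yearly retention rates with an unchanging underlying probability p."""
    rng = random.Random(seed)
    rates = []
    for _ in range(years):
        # Each student is retained independently with probability p.
        retained = sum(1 for _ in range(n) if rng.random() < p)
        rates.append(retained / n)
    return rates

se = retention_standard_error(0.85)
print(f"standard error of one year's rate: {100 * se:.2f} percentage points")

# Eight simulated years in which nothing about the school actually changes.
for rate in simulate_cohorts(0.85):
    print(f"{100 * rate:.1f}")
```

Under these assumptions the simulated rates differ from year to year even though the underlying probability never moves, which is the sense in which each year's outcome can be treated as a sample.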
If you take this view (it's one I have a good deal of sympathy for),
you must remember you are not comparing this year and last year as
exact experiences, but as samples in the probability space of likely
experience, given conditions bearing on the students of each year.
Then, of course, once you've decided this is a legitimate subject for
inferential statistical analysis, you have to get into methodology -
what I take to be rich reeves's questions. Among other things, your
enrolled students are certainly not a sample selected at random
equi-probably from a definable population of candidates. But here we
get from *whether* to *how*, and many others can do better than I, for