Date:     Fri, 9 Dec 2011 17:13:26 -0500
Reply-To: Rich Ulrich <firstname.lastname@example.org>
Sender:   "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:     Rich Ulrich <email@example.com>
Subject:  Re: Survey Analysis Questions
I'm not Stephen, but see my comments inserted, below.
Date: Fri, 9 Dec 2011 12:45:22 -0600
Subject: Re: Survey Analysis Questions
"Thanks for the quick response. Cronbach's alpha scores on the 5 domains are .807, .907, .956, .959 & .925. Not sure whether you intended for me to examine the questions themselves or the domain averages for normality, but since they are so consistent with their domain, I'm not sure it matters. I have looked at the domain scores and they seem very much right-skewed (agree/strongly agree). "
- The "skew" gets its name from the side with the long tail. For the
order you stated, your scores bunch up on the right, so this is "left-skewed".
"I assume this means that I should be using a chi-square instead of t-test. Please advise me if this is incorrect. This solves the first piece of the analysis I'd like to conduct."
The non-parametric tests usually use a statistic other
than a chi-squared. However, I do not think that many people would
agree with Stephen, who advocated (rank-based) nonparametric
tests for totals of Likert-type items. If you score the "Totals" as
item-average scores, you have an immediate set of anchor labels for
interpreting the means that you observe for various groups. That is
the biggest gain.
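The item-average scoring can be sketched as follows; the five items per domain and the response values are invented for illustration:

```python
import numpy as np

# Hypothetical: five 1-5 Likert items from one domain,
# one row per respondent (values are illustrative only).
responses = np.array([
    [4, 5, 4, 5, 4],
    [5, 5, 5, 4, 5],
    [3, 4, 4, 4, 3],
])

# Item-average score: the domain total divided by the number of
# items, which keeps the score on the original 1-5 anchor scale,
# so a group mean of 4.4 reads directly as "between agree and
# strongly agree".
item_avg = responses.mean(axis=1)
print(item_avg)  # -> [4.4  4.8  3.6]
```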
These items do fail to meet the Likert ideal, because they do not
have averages near the midpoint.
There is very little loss of power or validity for the tests, when the
variance is relatively restricted by being this sort of sum. However,
here are several alternatives or extensions:
a) Since there are few responses of 1 and 2, create new scores of
"Objecters" by counting the number of responses of 1 and 2 - if you
want to see whether the LOW extremes are particularly important.
b) To create a nicer "interval" basis for the items that are totalled,
rescore the items as 2-5 or 3-5, and obtain totals from those.
c) To preserve the original scoring while creating a nicer interval
basis for the Total, you could subtract each Total from its maximum-
possible (for left-skewed), and take the square root. The scoring
now will run in the opposite direction, so it will be useful to apply
the actual labels, as transformed, to keep track of what you are doing.
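Alternative (c) above can be sketched like this; the Totals and the maximum-possible value (25, as if from five 1-5 items) are hypothetical:

```python
import numpy as np

# Hypothetical left-skewed Totals from five 1-5 items (max = 25).
totals = np.array([25, 24, 22, 18, 10])
max_possible = 25

# Reflect around the maximum, then take the square root to pull
# in the long tail.
reflected = max_possible - totals   # 0 = most satisfied
transformed = np.sqrt(reflected)    # gentler, more interval-like

# Direction is now reversed: HIGH transformed scores mean LOW
# satisfaction, so relabel the anchors accordingly when reporting.
print(transformed)
```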
"For each child for whom I have a survey, I will also have the # of team meetings attended by a parent and the total # of team meetings, so I will have the % of team meetings attended by a child."
Unless there are always a lot of meetings, that % will not be very
attractive as a score. But you do need to figure out what will make
a useful criterion. What is logical? What will your audience see as meaningful?
"1) How can I test to see which of the 5 domains is the best predictor of whether parents attend team meetings for all respondents?"
Please use the term "best correlate" and please consider why and
how a correlation can result from "common causation". If, as you
say, this is a "satisfaction survey", it is pretty hard to construe the
satisfaction as actually "predicting" the attendance that came before it.
You can look at scatterplots of scores, and simple correlations, if you
are considering two continuous scores. Your display may be more
important than your test -- these are observational data, so you do
need to tell a convincing story about how much association you see,
and why it matters.
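A minimal sketch of that correlation check; the satisfaction scores and attendance percentages here are invented, and the variable names are assumptions:

```python
import numpy as np
from scipy import stats

# Hypothetical pairing of a domain item-average score with an
# attendance percentage, one pair per family (values invented).
satisfaction = np.array([4.4, 4.8, 3.6, 4.0, 4.9, 3.2])
attendance_pct = np.array([80., 100., 50., 75., 90., 40.])

# Pearson r for two continuous scores; Spearman rho is the easy
# rank-based robustness check when scores are skewed.
r, p = stats.pearsonr(satisfaction, attendance_pct)
rho, p_s = stats.spearmanr(satisfaction, attendance_pct)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")

# For the display itself, a scatterplot (matplotlib assumed):
# import matplotlib.pyplot as plt
# plt.scatter(satisfaction, attendance_pct)
# plt.xlabel("Domain score (item average)")
# plt.ylabel("% of team meetings attended")
```

If the two coefficients tell the same story, the skew is probably not distorting the picture much.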
"2) How can I break the respondents by group (race, sex, program) to see what domain is the best predictor of whether parents attend team meetings?"
Again, "correlation" and not "prediction". Even though you don't have
the timing problem for race and sex, "Correlation is not causation."
What is the "average attendance score" for males vs. females?
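That descriptive comparison is a one-liner once the data are in a table; the column names and values below are assumptions, not your actual file:

```python
import pandas as pd

# Hypothetical layout: one row per respondent, with sex and an
# attendance percentage (column names are illustrative).
df = pd.DataFrame({
    "sex": ["M", "F", "F", "M", "F", "M"],
    "attendance_pct": [60., 90., 80., 50., 100., 70.],
})

# Average attendance score for males vs. females -- the simple
# group comparison suggested above.
print(df.groupby("sex")["attendance_pct"].mean())
```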
"Dave -- you are correct, this is not a random sample. The entire population of those enrolled in the program received a survey
(n=268) and I received 86 responses. I dropped 28 youth due to bad
addresses for a response rate of 86/240 ~ 35.8%. What do you suggest in analyzing the differences between groups?"
Since you have the data on hand, you certainly should do your basic
comparisons of 86 Responders vs. 28 Dropped-responders vs. 154
non-responders, for whatever data (Attendance, only?) in the complete file.
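One way to sketch that three-group comparison on a continuous measure such as attendance; the group sizes and values below are invented, and a one-way ANOVA is only one reasonable choice (Kruskal-Wallis is the rank-based fallback):

```python
import numpy as np
from scipy import stats

# Hypothetical attendance proportions for the three groups in the
# complete file: Responders, Dropped (bad address), Non-responders.
responders = np.array([0.8, 0.9, 0.6, 0.7])
dropped = np.array([0.5, 0.4, 0.6])
non_responders = np.array([0.3, 0.5, 0.2, 0.4])

# One-way ANOVA across the three groups; a notable difference would
# flag non-response bias in whatever the attendance data can show.
f, p = stats.f_oneway(responders, dropped, non_responders)
print(f"F = {f:.2f}, p = {p:.3f}")
```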