Date: Tue, 8 Dec 1998 22:17:30 -0600
Reply-To: Max Martin <mmartin@EDGEWOOD-SA.K12.TX.US>
Sender: "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU>
From: Max Martin <mmartin@EDGEWOOD-SA.K12.TX.US>
Subject: Re: Analysis question (longish)
Content-Type: text/plain; charset="iso-8859-1"
Apologies for the length of this post.
A client of mine has an EXPLORATORY project in which 151 principals of
elementary, middle, and high schools in three districts (urban, suburban,
and rural) indicate their needs for professional development. The needs are
grouped into three categories (sets A, B, and C), eight needs per category.
No assumptions are made about the orthogonality of the categories; besides,
the categories themselves have a lot of conceptual similarities, based on a
reading of the set definitions and items. Each principal is asked to select
the TWO items from each category most similar in orientation to his or her
personal beliefs. My client would like to break down the responses into useful comparisons:
for example, by district, by educational level (elementary, middle, high
school), by gender of recipients, by age, etc. The data file had a single
string variable for each of the three sets. A value of 25 for set_B would be
interpreted to mean that the principal's top two choices were b2 and b5. To facilitate
analysis, I coded each of the three sets into eight binary
variables (a1 to a8, b1 to b8, and c1 to c8), where a value of 1 meant the
item had been selected and a value of 0 meant that it had not. I
can then cast this into a series of Multiple Response crosstabs and break
out the percentages of principals selecting (marking a 1) particular needs.
Of course, no tests of statistical significance are possible in the Multiple
Response procedure.
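
As a rough sketch of the recoding described above (in Python rather than SPSS syntax), a string value such as 25 can be split into eight 0/1 indicators and the share of principals marking each need can be tallied by group. The function names, the "district" field, and the three records are invented for illustration; they are not the actual data.

```python
from collections import defaultdict


def decode_choices(value, prefix, n_items=8):
    """Split a string such as "25" into indicators {b1: 0, b2: 1, ...}."""
    chosen = {int(ch) for ch in str(value).strip()}
    return {f"{prefix}{i}": int(i in chosen) for i in range(1, n_items + 1)}


def percent_selected(records, group_key, set_key, prefix, n_items=8):
    """Percentage of respondents in each group who marked each item."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        flags = decode_choices(rec[set_key], prefix, n_items)
        for item, flag in flags.items():
            counts[group][item] += flag
    return {
        g: {item: 100.0 * counts[g][item] / totals[g] for item in counts[g]}
        for g in totals
    }


# Invented records, for illustration only (not the real survey data).
records = [
    {"district": "urban", "set_B": "25"},
    {"district": "urban", "set_B": "24"},
    {"district": "rural", "set_B": "58"},
]

b_flags = decode_choices("25", "b")
pct = percent_selected(records, "district", "set_B", "b")
```

Because every principal marks exactly two items, the eight percentages within any group sum to 200, which is worth keeping in mind when reading the breakdowns.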
My question is this: is it appropriate to cast these data as preferences
and use multidimensional scaling or cluster analysis? It seems as if a
choice of one of the eight needs in each set expresses a preference for that
need over the 6 non-selected ones (1 vs. 0). The other need selected forces a "no
preference" (1 vs. 1), while any two non-selected needs are also "no
preference" (0 vs. 0). Thus a proximities matrix could be constructed for
each of the 3 raw needs sets (see below), using binary Euclidean distances,
for example. The resulting proximity matrix could be cluster analyzed to
identify similar groups of cases or items, or an MDS solution could be
generated and the arrangement of the needs in N-space (2 or 3D) could be
examined. Also, I could use AnswerTree to look at the segmentation patterns
inherent in the data. Am I barking up the wrong tree, or is my proposed
coding ok for generating preference matrices? Any ideas?
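
To make the proposed preference coding concrete, here is a minimal Python sketch of one way to read it: within a case, item i is "preferred" to item j only when i is marked 1 and j is marked 0, and both 1 vs. 1 and 0 vs. 0 count as no preference. The function names are hypothetical, and the four rows are the example cases 45 to 48 shown in this post.

```python
def pairwise_preferences(row):
    """pref[i][j] = 1 when item i was selected and item j was not."""
    n = len(row)
    return [[int(row[i] == 1 and row[j] == 0) for j in range(n)]
            for i in range(n)]


def dominance_counts(rows):
    """Sum the pairwise preference matrices over all cases."""
    n = len(rows[0])
    total = [[0] * n for _ in range(n)]
    for row in rows:
        pref = pairwise_preferences(row)
        for i in range(n):
            for j in range(n):
                total[i][j] += pref[i][j]
    return total


# Example cases 45-48 from this post (items a1..a8).
rows = [
    [1, 0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
]

case45 = pairwise_preferences(rows[0])
dominance = dominance_counts(rows)
```

Under this reading each case contributes exactly 12 preferences (2 selected items times 6 non-selected ones), so the summed matrix carries no more information than the 0/1 indicators themselves; it is simply another view of the same data.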
The preference matrix would look like:
case  a1  a2  a3  a4  a5  a6  a7  a8
 ...
 45    1   0   0   0   1   0   0   0
 46    0   1   0   1   0   0   0   0
 47    0   1   1   0   0   0   0   0
 48    0   0   0   0   0   0   1   1
 ... etc.
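
For illustration, a small Python sketch of the binary Euclidean distance applied to the four example rows above, taking it as the square root of the number of positions where two 0/1 vectors disagree. The helper names are my own, and this is a sketch of the measure, not of how the SPSS procedure computes it internally.

```python
import math

# Example cases 45-48 from the matrix above (items a1..a8).
cases = {
    45: [1, 0, 0, 0, 1, 0, 0, 0],
    46: [0, 1, 0, 1, 0, 0, 0, 0],
    47: [0, 1, 1, 0, 0, 0, 0, 0],
    48: [0, 0, 0, 0, 0, 0, 1, 1],
}


def binary_euclidean(x, y):
    """Square root of the count of positions where x and y differ."""
    return math.sqrt(sum(1 for a, b in zip(x, y) if a != b))


def distance_matrix(vectors):
    """Full symmetric matrix of pairwise binary Euclidean distances."""
    n = len(vectors)
    return [[binary_euclidean(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


case_ids = sorted(cases)
case_vectors = [cases[c] for c in case_ids]

# Distances between cases (rows of the matrix).
case_dist = distance_matrix(case_vectors)

# Distances between items (columns): transpose, then reuse the function.
item_vectors = [list(col) for col in zip(*case_vectors)]
item_dist = distance_matrix(item_vectors)
```

Since every case marks exactly two items, the distance between two cases can only be 0 (same two items), sqrt(2) (one item shared), or 2 (none shared). That coarse scale may limit what clustering or MDS on cases can show, whereas distances between items, computed over all 151 cases, can take many more values.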