Date: Mon, 5 Sep 2005 12:46:56 -0400
Reply-To: "Frank J. Gallo" <email@example.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: "Frank J. Gallo" <firstname.lastname@example.org>
Subject: Re: Combining Variable Scores
Content-type: text/plain; charset=US-ASCII
Thank you for the input. The runs worked out well. The following is a sample
of the syntax I used for one group after filtering the data file with an if
condition. I followed the same logic for each group of cases and for all
COMPUTE MLABEL1=MEAN(LABEL1 TO LABEL37).
FREQUENCIES MLABEL1 /format notable/ntiles 4/statistics all.
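A possible shortcut, sketched here on the assumption that "classid" is the grouping variable and that the file may be re-sorted: compute the score once for every case and let SPLIT FILE repeat the statistics for each group, instead of filtering group by group. This gives a single MLABEL variable rather than separate MLABEL1 to MLABEL4 variables.

```spss
* Mean of the 37 ratings for every case, computed once.
COMPUTE MLABEL=MEAN(LABEL1 TO LABEL37).
EXECUTE.
* Statistics for all cases together.
FREQUENCIES VARIABLES=MLABEL /FORMAT=NOTABLE /NTILES=4 /STATISTICS=ALL.
* Same statistics repeated for each classid group.
SORT CASES BY classid.
SPLIT FILE BY classid.
FREQUENCIES VARIABLES=MLABEL /FORMAT=NOTABLE /NTILES=4 /STATISTICS=ALL.
SPLIT FILE OFF.
```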
Because you are familiar with my original question, I am hoping you could
give some advice on graphing my output. I used the point-and-click approach
to produce an interactive graph. Remember, my syntax skills are green,
especially when it comes to the PLOT command. Below is the syntax the run
produced. I like the presentation of the graph, but I would like to add and
change the following: (a) rather than have the median line displayed, I
would like the mean line displayed, and (b) within the same graph, I would
like to plot the distribution of scores on "mlabel" (created by selecting
all cases and then running COMPUTE MLABEL=MEAN(LABEL1 TO LABEL37)). I would
like to give the reader an opportunity to compare the
MLABEL distribution with MLABEL1, MLABEL2, MLABEL3, and MLABEL4 all in the
same graph. The variable "classid" (label 1-4) identifies the 4 groups of
IGRAPH /VIEWNAME='Boxplot'
  /X1 = VAR(classid) TYPE = CATEGORICAL
  /Y = VAR(mlabel) TYPE = SCALE
  /COORDINATE = HORIZONTAL
  /X1LENGTH=3.0 /YLENGTH=3.0 /X2LENGTH=3.0
  /CHARTLOOK='NONE'
  /CATORDER VAR(classid) (ASCENDING VALUES OMITEMPTY)
  /SCALERANGE = VAR(mlabel) MIN=1.000000 MAX=6.000000
  /BOX OUTLIERS = ON EXTREME = ON MEDIAN = ON LABEL = N WHISKER = LINE.
From: Frank J. Gallo [mailto:email@example.com]
Sent: Sunday, September 04, 2005 8:55 PM
To: 'Hector Maletta'; 'SPSSX-L@LISTSERV.UGA.EDU'
Subject: RE: Combining Variable Scores
Thanks for the input. The five samples are in the same SPSS file. I
differentiate them by a variable "classid" which has values 1-5. How would
the syntax change for both combining variables within a sample and combining
variables of all samples?
From: Hector Maletta [mailto:firstname.lastname@example.org]
Sent: Sunday, September 04, 2005 2:37 PM
To: 'Frank J. Gallo'; SPSSX-L@LISTSERV.UGA.EDU
Subject: RE: Combining Variable Scores
So you do not want to combine VARIABLES, you want to combine SAMPLES,
conditional on their not being significantly different from each other. And
they should not be significantly different on 37 different variables. So I
totally misunderstood your question. Now:
Adding together your various samples is done by means of the ADD FILES
command, assuming these samples are in different SPSS files all with the
same structure. The syntax is easy:
ADD FILES /FILE 'firstfilenameandpath'/FILE
Before doing this you make sure you have a variable in all the files,
identifying the sample. For instance, a variable called SAMPLE with values 1
to 5 to identify 5 different samples.
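A fuller sketch of the two steps just described (tagging each file, then stacking them). The file names are hypothetical placeholders, not taken from this thread, and only two of the five files are shown.

```spss
* File names below are hypothetical placeholders.
GET FILE='sample1.sav'.
COMPUTE SAMPLE=1.
SAVE OUTFILE='sample1_id.sav'.
GET FILE='sample2.sav'.
COMPUTE SAMPLE=2.
SAVE OUTFILE='sample2_id.sav'.
* Repeat for the remaining samples, then stack the tagged files.
ADD FILES /FILE='sample1_id.sav' /FILE='sample2_id.sav'.
EXECUTE.
```

Each additional sample would need its own /FILE subcommand on the ADD FILES command.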
An exploratory analysis of the 37 ratings in the whole sample and in each of
the 5 samples can be achieved with the EXAMINE command. You need one for the
whole sample and another for the 5 samples.
EXAMINE VARIABLES=LABEL1 TO LABEL37 /PERCENTILES(25,50,75) HAVERAGE.
EXAMINE VARIABLES=LABEL1 TO LABEL37 BY SAMPLE /PERCENTILES(25,50,75) HAVERAGE.
Your ratings are not really interval level variables, so comparing means is
not totally kosher, but it is usual enough for you to risk it without many
qualms. You may compare pairs of samples, or one sample with the total,
through T-TEST. To compare each sample mean with the total, and assuming you
already know the value for the total, you may use T-TEST for all samples with
SPLIT FILE. Suppose the overall mean rating for one question like LABEL1 is
2.35:
SORT CASES BY sample.
SPLIT FILE BY sample.
T-TEST /TESTVAL=2.35 /VARIABLES=label1.
This would compare all sample means of LABEL1 to the given overall mean of
2.35. In one command you may include more than one variable, provided they
are to be compared to the same overall mean. Use another T-TEST command for
other variables if the overall mean is different.
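As a rough sketch of that workflow for two variables: first obtain the overall means with no split in effect, then run one T-TEST per test value with the split on. The value 3.10 for LABEL2 is a made-up placeholder; only the 2.35 for LABEL1 comes from the example above.

```spss
* Step 1: overall means, with no split in effect.
SPLIT FILE OFF.
DESCRIPTIVES VARIABLES=label1 label2 /STATISTICS=MEAN.
* Step 2: one-sample tests within each sample.
SORT CASES BY sample.
SPLIT FILE BY sample.
T-TEST /TESTVAL=2.35 /VARIABLES=label1.
* The test value 3.10 is a hypothetical overall mean for label2.
T-TEST /TESTVAL=3.10 /VARIABLES=label2.
SPLIT FILE OFF.
```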
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]
> On Behalf Of Frank J. Gallo
> Sent: Sunday, September 04, 2005 3:17 PM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: Re: Combining Variable Scores
> Hi Hector,
> Thank you very much for your input. I apologize for not
> giving a better explanation of my situation.
> Respondents rated (1 to 6) the violence severity of 37
> different behaviors.
> I collected 5 samples from the target population. Samples
> were collected at different time points, but within one year.
> I do not expect the distributions of the samples to be
> significantly different -- within the limits of sampling
> errors. However, this is an empirical question. I would like
> to combine the several data sets if (a) they do not show
> large statistical differences between associated
> distributions and (b) I can document other similarities. The
> population parameters are unknown. So, my first step is to
> perform an exploratory data analysis: calculate the mean,
> median, stdev, min, max, Q1 and Q3 statistics for each sample.
> Then calculate the same statistics after combining the
> samples. Then do some other procedures such as a graphical
> analysis, analysis of variance, etc. Any further thoughts are
> -----Original Message-----
> From: Hector Maletta [mailto:email@example.com]
> Sent: Sunday, September 04, 2005 12:28 PM
> To: 'Frank J. Gallo'; SPSSX-L@LISTSERV.UGA.EDU
> Subject: RE: Combining Variable Scores
> What do you mean by "combining" the scores? As you know,
> there are different ways to do that. One simple way is just
> obtaining the average or sum of variable scores, but this
> would give all variables the same weight. A more elaborate
> way is factor analysis or some variant of it: if all your
> variables reflect one underlying factor or trait, then the
> first factor extracted should account for a large portion of
> total variance in your 37 variables, and the scores for that
> first factor may be used as a single variable representing
> the main component of the common variance in your variables.
> Once you have your final score for the synthetic variable
> representing your 37 original variables, obtaining the summary
> measures you mention is quite easy with the FREQUENCIES or
> DESCRIPTIVES command. In your case FREQUENCIES is better because
> you want the quartiles too.
> To summarize:
> 1. Obtain a single variable representing your 37 scores.
> 1.1. Obtain it as a simple average.
> COMPUTE MEANSCOR=MEAN(LABEL1 TO LABEL37).
> If all your 37 variables use the same scale (say, 1
> to 5) this may be enough. If they have different ranges and
> units, you would do better to standardize them to zero mean and
> unit standard deviation. This can be done with the SAVE
> subcommand of the DESCRIPTIVES command, applied BEFORE the
> COMPUTE. SAVE will create 37 new variables named
> ZLABEL1 to ZLABEL37, which will be the standardized versions
> of your variables.
> DESCRIPTIVES LABEL1 TO LABEL37/SAVE.
> COMPUTE MEANSCOR=MEAN(ZLABEL1 TO ZLABEL37).
> 1.2. Obtain it by means of FACTOR ANALYSIS:
> FACTOR VARIABLES=LABEL1 TO LABEL37 /PRINT=ALL /SAVE=REG(ALL FASCOR).
> This would extract all factors with eigenvalues above
> 1, and would save the scores to the file under new variables
> named FASCOR1 to FASCORk (where k is the last factor
> extracted). In the output look at the VARIANCE EXPLAINED
> table. Judging from the contribution of the first factor to
> explaining all variance in the original variables, you may
> decide whether the contribution of the first factor is much
> larger than the second and later factors, or perhaps your
> variables are in fact measuring two or more different
> underlying factors of similar importance.
> 2. Once you have a single score, say FASCOR1 or MEANSCOR, you
> may know the main statistics by using FREQUENCIES:
> FREQUENCIES VARIABLES=FASCOR1 /FORMAT=NOTABLE /NTILES=4 /STATISTICS=ALL.
> This would not produce an actual frequency
> distribution (too many values for that), but will give you
> the quartiles and all the summary measures you want (and some more).
> > -----Original Message-----
> > From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]
> > On Behalf Of Frank J. Gallo
> > Sent: Sunday, September 04, 2005 11:48 AM
> > To: SPSSX-L@LISTSERV.UGA.EDU
> > Subject: Combining Variable Scores
> > Hi All,
> > Still green at writing syntax, and I am hoping that someone can
> > suggest some syntax for the following run:
> > -- I have 37 variables (label1 - label37)
> > -- sample: n = 50 cases
> > -- I would like to combine the variable scores and then compute the
> > mean, median, stdev, min, max, Q1 and Q3 statistics for the sample
> > (n=50).
> > Your help is greatly appreciated.
> > Frank
> > __________ NOD32 1.1208 (20050902) Information __________
> > This message was checked by NOD32 Antivirus System
> > http://www.nod32.com