They are two different operations. One combines files; the other computes a
new variable from existing variables in a given file. My first answer to
your question referred to the latter operation, which is better not called
"combining", to avoid confusion. In SPSS parlance, you MERGE files and you
COMPUTE new variables based on existing variables. Some new variables also
arise as by-products of certain statistical analyses, such as factor scores
from factor analysis, or standardized variables from DESCRIPTIVES.
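
Since your five samples are already in one file, no merging is needed. A
minimal sketch of what the syntax could look like, assuming your grouping
variable is classid (values 1 to 5), your ratings are label1 to label37
stored contiguously in the file, and using MEANSCOR as an illustrative
name for the new variable:

```spss
* The COMPUTE works case by case, so it is the same with or without samples.
COMPUTE meanscor = MEAN(label1 TO label37).
EXECUTE.

* Summary statistics for all samples together.
FREQUENCIES VARIABLES=meanscor /FORMAT=NOTABLE /NTILES=4 /STATISTICS=ALL.

* The same statistics for each sample separately.
SORT CASES BY classid.
SPLIT FILE BY classid.
FREQUENCIES VARIABLES=meanscor /FORMAT=NOTABLE /NTILES=4 /STATISTICS=ALL.
SPLIT FILE OFF.
```

Only the reporting step changes between the two cases; the computed
variable itself is the same.
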
Hector
> Hi Hector,
>
> Thanks for the input. The five samples are in the same SPSS
> file. I differentiate them by a variable "classid" which has
> values 1-5. How would the syntax change for both combining
> variables within a sample and combining variables of all samples?
>
> Thanks,
> Frank
>
> So you do not want to combine VARIABLES, you want to combine
> SAMPLES, conditional on their not being significantly different
> from each other. And they should not be significantly
> different on 37 different variables. So I totally
> misunderstood your question. Now:
>
> Adding together your various samples is done by means of the
> ADD FILES command, assuming these samples are in different
> SPSS files all with the same structure. The syntax is easy:
>
> ADD FILES /FILE='firstfilenameandpath' /FILE='secondfilenameandpath'
>   /..../FILE='lastfilenameandpath'.
> EXECUTE.
>
> Before doing this, make sure you have a variable in all
> the files identifying the sample. For instance, a variable
> called SAMPLE with values 1 to 5 to identify 5 different samples.
>
> An exploratory analysis of the 37 ratings in the whole sample
> and in each of the 5 samples can be achieved with the EXAMINE
> command. You need one for the whole sample and another for
> the 5 samples.
>
> EXAMINE VARIABLES=LABEL1 TO LABEL37 /PERCENTILES(25,50,75).
> EXAMINE VARIABLES=LABEL1 TO LABEL37 BY SAMPLE /PERCENTILES(25,50,75).
>
> Your ratings are not really interval level variables, so
> comparing means is not totally kosher, but it is usual enough
> for you to risk it without many qualms. You may compare pairs
> of samples, or one sample with the total, through T-TEST. To
> compare each sample mean with the total, and assuming you
> already know the value for the total, you may use T-TEST for
> all samples with SPLIT FILE. Suppose the overall mean rating
> for one question like LABEL1 is 2.35.
>
> SORT CASES BY sample.
> SPLIT FILE BY sample.
> T-TEST /TESTVAL=2.35 /VARIABLES=label1.
> SPLIT FILE OFF.
>
> This would compare each sample mean of LABEL1 to the given
> overall mean of 2.35. In one command you may include more
> than one variable, provided they are to be compared to the
> same overall mean. Use another T-TEST command for other
> variables if the overall mean is different.
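>
> If the overall mean is not yet known, it could be obtained first
> with no split active. A minimal sketch, assuming the variable
> name label1 used above:
>
> ```spss
> * Overall mean of label1 across all five samples together.
> SPLIT FILE OFF.
> DESCRIPTIVES VARIABLES=label1 /STATISTICS=MEAN.
> ```
>
> The mean reported there would then be the value given to TESTVAL.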
>
> Hector
>
> > Hi Hector,
> >
> > Thank you very much for your input. I apologize for not giving a
> > better explanation of my situation.
> >
> > Respondents rated (1 to 6) the violence severity of 37 different
> > behaviors.
> > I collected 5 samples from the target population. Samples were
> > collected at different time points, but within one year. I do not
> > expect the distributions of the samples to be significantly
> > different, within the limits of sampling error. However, this is an
> > empirical question. I would like to combine the several data sets if
> > (a) they do not show large statistical differences between
> > associated distributions and (b) I can document other similarities.
> > The population parameters are unknown. So, my first step is to
> > perform an exploratory data analysis: calculate the mean, median,
> > stdev, min, max, Q1 and Q3 statistics for each sample. Then calculate
> > the same statistics after combining the samples. Then do some other
> > procedures such as a graphical analysis, analysis of variance, etc.
> > Any further thoughts are appreciated.
> >
> > Thanks,
> > Frank
> >
> > Frank,
> > What do you mean by "combining" the scores? As you know, there are
> > different ways to do that. One simple way is just obtaining the
> > average or sum of variable scores, but this would give all variables
> > the same weight. A more elaborate way is factor analysis or some
> > variant of it: if all your variables reflect one underlying factor
> > or trait, then the first factor extracted should account for a large
> > portion of total variance in your 37 variables, and the scores for
> > that first factor may be used as a single variable representing the
> > main component of the common variance in your variables.
> >
> > Once you have your final score for the synthetic variable
> > representing your 37 original variables, obtaining the summary
> > measures you mention is quite easy with the FREQUENCIES or
> > DESCRIPTIVES command. In your case FREQUENCIES is better because you
> > want the quartiles too.
> >
> > To summarize:
> >
> > 1. Obtain a single variable representing your 37 scores.
> > 1.1. Obtain it as a simple average.
> > COMPUTE MEANSCOR=MEAN(LABEL1 TO LABEL37).
> > If all your 37 variables use the same scale (say, 1 to 5) this may
> > be enough. If they have different ranges and units, it may be better
> > to standardize them to have zero mean and unit standard deviation.
> > This can be done with the SAVE option in the DESCRIPTIVES command,
> > applied BEFORE the COMPUTE. The SAVE keyword will create 37 new
> > variables named ZLABEL1 to ZLABEL37, which will be the standardized
> > versions of your variables.
> > DESCRIPTIVES LABEL1 TO LABEL37/SAVE.
> > COMPUTE MEANSCOR=MEAN(ZLABEL1 TO ZLABEL37).
> > 1.2. Obtain it by means of FACTOR ANALYSIS:
> > FACTOR VARIABLES=LABEL1 TO LABEL37 /PRINT=ALL /SAVE=REG(ALL FASCOR).
> > This would extract all factors with eigenvalues above 1, and would
> > save the scores to the file under new variables named FASCOR1 to
> > FASCORk (where k is the last factor extracted). In the output look at
> > the VARIANCE EXPLAINED table. Judging from the contribution of the
> > first factor to explaining all variance in the original variables,
> > you may decide whether the contribution of the first factor is much
> > larger than that of the second and later factors, or whether your
> > variables are in fact measuring two or more different underlying
> > factors of similar importance.
> >
> > 2. Once you have a single score, say FASCOR1 or MEANSCOR, you may
> > obtain the main statistics by using FREQUENCIES:
> >
> > FREQUENCIES VARIABLES=FASCOR1 /FORMAT=NOTABLE /NTILES=4 /STATISTICS=ALL.
> >
> > This would not produce an actual frequency distribution (too many
> > values for that), but will give you the quartiles and all the
> > summary measures you want (and some more).
> >
> > Hector
> >
> >
> > > Still green at writing syntax, and I am hoping that someone can
> > > suggest some syntax for the following run:
> > >
> > >
> > >
> > >  I have 37 variables (label1 to label37)
> > >
> > >  sample: n = 50 cases
> > >
> > >  I would like to combine the variable scores and then compute the
> > > mean, median, stdev, min, max, Q1 and Q3 statistics for the sample
> > > (n=50).
> > >
> > >
> > >
> > > Your help is greatly appreciated.
> > >
> > > Frank
> > >
