Date: Fri, 16 Sep 2005 18:01:26 -0400
Reply-To: Richard Ristow <wrristow@mindspring.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Richard Ristow <wrristow@mindspring.com>
Subject: Suggestion: Group-contiguous vectors
Content-Type: text/plain; charset="us-ascii"; format=flowed
As we all know, the elements of a vector must be contiguous in the
file. (And I'll skip saying anything about SAS arrays, whose elements
can be completely arbitrary sets of variables.)
I expect that SPSS vectors work the way they do because implementation
is simpler. A SAS array probably takes a full array of pointers, one
pointing to each of the variables. An SPSS vector probably takes two
numbers: the location within the data record of the first element, and
the number of elements. Plus the element length, in case the vector
elements are strings longer than 8 characters. (So you shouldn't be
able to have a vector of strings of different lengths; and sure enough,
you can't. From the syntax manual: "A single vector must comprise all
numeric variables or all string variables. The string variables must
have the same length.")
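To make that guess concrete, here is a minimal Python sketch of the two representations. It is an illustration only: the class names ContiguousVector and PointerArray, the model of a record as a list of 8-byte slots, and the sample values are all hypothetical, and nothing here is taken from the actual SPSS or SAS internals. It covers numeric elements only, so the element length is left out.

```python
from dataclasses import dataclass
from typing import List

# One case, modeled as a flat list of 8-byte slots (slot 0 is SUBJECT).
# The values are made up for the illustration.
record = [101, 1, 2, 3, 4, 5, 6, 7, 8, 9]


@dataclass
class ContiguousVector:
    """Guessed SPSS-style vector: first slot plus element count."""
    start: int  # 0-based slot of the first element
    count: int  # number of elements

    def get(self, rec, i):
        # Subscripts are 1-based, as in SPSS syntax.
        if not 1 <= i <= self.count:
            raise IndexError(f"subscript {i} out of range 1..{self.count}")
        return rec[self.start + (i - 1)]


@dataclass
class PointerArray:
    """Guessed SAS-style array: one slot index per element, in any order."""
    slots: List[int]

    def get(self, rec, i):
        if not 1 <= i <= len(self.slots):
            raise IndexError(f"subscript {i} out of range 1..{len(self.slots)}")
        return rec[self.slots[i - 1]]


v = ContiguousVector(start=1, count=9)  # nine adjacent variables
a = PointerArray(slots=[7, 1, 4])       # arbitrary variables, arbitrary order
```

The contiguous form needs two numbers however long the vector is; the pointer form needs one entry per element, which is the price of allowing arbitrary sets of variables.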
But here's something almost as simple, that would be very useful: what
I'll call "group-contiguous" vectors.
A common case is a record that contains a repeating sequence of groups
of logically related variables, every group the same length. The
members of a group may well be of different types, but corresponding
members of different groups will be of the same type. Like this:
DISPLAY VARIABLES.

List of variables on the working file

Name      Pos  Level    Print Fmt  Write Fmt  Missing Values
SUBJECT     1  Scale    N3         N3
ALPH2001    2  Scale    F3         F3
BETA2001    3  Scale    F3         F3
GAMM2001    4  Scale    F3         F3
ALPH2002    5  Scale    F3         F3
BETA2002    6  Scale    F3         F3
GAMM2002    7  Scale    F3         F3
ALPH2003    8  Scale    F3         F3
BETA2003    9  Scale    F3         F3
GAMM2003   10  Scale    F3         F3
SPACER     11  Nominal  A39        A39
(In some systems, the triplet alpha-beta-gamma would be called a
"user-defined data type.")
So here's a suggested extension to VECTOR:
VECTOR ALPHA,BETA,GAMMA=ALPH2001 TO GAMM2003.
Then ALPHA, BETA, and GAMMA are interleaved vectors, like this:
Name      Vector element
ALPH2001  ALPHA(1)
BETA2001  BETA(1)
GAMM2001  GAMMA(1)
ALPH2002  ALPHA(2)
BETA2002  BETA(2)
GAMM2002  GAMMA(2)
ALPH2003  ALPHA(3)
BETA2003  BETA(3)
GAMM2003  GAMMA(3)
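The mapping in this table amounts to dealing the variable list out round-robin among the vector names. A small Python sketch of that rule, assuming the variable range must be a whole number of groups (the function name interleave is hypothetical, and the proposed VECTOR form is not existing SPSS syntax):

```python
# Variables spanned by ALPH2001 TO GAMM2003, in file order.
variables = [
    "ALPH2001", "BETA2001", "GAMM2001",
    "ALPH2002", "BETA2002", "GAMM2002",
    "ALPH2003", "BETA2003", "GAMM2003",
]


def interleave(vector_names, variables):
    """Deal the variables out round-robin, one per vector name in turn."""
    k = len(vector_names)
    if k == 0 or len(variables) % k != 0:
        raise ValueError("variable range is not a whole number of groups")
    # Vector j takes every k-th variable, starting at offset j.
    return {name: variables[j::k] for j, name in enumerate(vector_names)}


vectors = interleave(["ALPHA", "BETA", "GAMMA"], variables)
# vectors["ALPHA"][0] is ALPHA(1), vectors["ALPHA"][1] is ALPHA(2), and so on.
```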
All elements of ALPHA would have to share one type and length, and
likewise for BETA and for GAMMA; but the three vectors would not have
to match one another.
This should be almost as simple to implement: the vectors ALPHA, BETA
and GAMMA could still be represented internally as a starting point, a
number of elements, and the distance (in 8-byte units) between
consecutive elements.
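As a rough sketch of that representation in Python: the class name StridedVector and the helper group_contiguous are hypothetical, positions are the Pos values from the DISPLAY VARIABLES listing, and the sketch assumes every variable in the range occupies exactly one 8-byte unit (true for the numeric variables here, not for long strings).

```python
from dataclasses import dataclass


@dataclass
class StridedVector:
    start: int   # Pos of the first element (1-based, as in DISPLAY VARIABLES)
    count: int   # number of elements
    stride: int  # distance between consecutive elements, in 8-byte units

    def pos(self, i):
        """Return the record position of element i (1-based subscript)."""
        if not 1 <= i <= self.count:
            raise IndexError(f"subscript {i} out of range 1..{self.count}")
        return self.start + (i - 1) * self.stride


def group_contiguous(first_pos, last_pos, n_vectors):
    """Split the range first_pos..last_pos into n_vectors interleaved vectors."""
    n_vars = last_pos - first_pos + 1
    if n_vars % n_vectors != 0:
        raise ValueError("variable range is not a whole number of groups")
    count = n_vars // n_vectors
    # Vector j starts j positions into the first group; all share one stride.
    return [StridedVector(first_pos + j, count, n_vectors)
            for j in range(n_vectors)]


# ALPH2001 is at Pos 2 and GAMM2003 at Pos 10 in the listing.
alpha, beta, gamma = group_contiguous(2, 10, 3)
```

An ordinary contiguous vector is then just the special case with a stride of 1, so the extension adds one number per vector to the existing descriptor.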