=========================================================================
Date: Wed, 26 Jul 2006 09:40:44 +0200
Reply-To: a.smulders@beke.nl
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Antoon Smulders <a.smulders@beke.nl>
Subject: Re: Randomly Sort a variable
In-Reply-To: <FABE7BF4C8A9B647A083DCEB99A792CF2079D9@beke01.beke.local>
Content-Type: text/plain; charset="iso-8859-1"
Hi Marta and Vishal (and others)
I am not very familiar with the MATRIX command, but this can easily be done
without it. Create a file that contains only the variable to be randomly
sorted, apply the random-id-and-sort procedure that Vishal described, then
MATCH that file back against the original file.
For example (v1 is the variable to be "shuffled"):
GET FILE "s:\test.sav".
MATCH FILES FILE * /KEEP v1.
* The following is an arbitrary random function (any continuous one will do).
COMPUTE rn = rv.normal(1, 10).
SORT CASES BY rn.
MATCH FILES FILE * /RENAME v1 = RandomCopyV1 /KEEP RandomCopyV1.
MATCH FILES FILE * /FILE "s:\test.sav".
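
The same three steps (attach a random key, sort on it, paste the reordered column back by position) can be sketched outside SPSS to make the logic explicit. This is only an illustration in Python, not SPSS syntax; the names `shuffled_copy` and `attach_shuffled` and the sample rows are hypothetical.

```python
import random


def shuffled_copy(values, rng=None):
    """Return the values of one column in random order; the input is untouched."""
    if rng is None:
        rng = random.Random()
    # Step 1: pair each value with a random sort key (the role of rn).
    keyed = [(rng.gauss(1, 10), value) for value in values]
    # Step 2: sort on the key only (SORT CASES BY rn).
    keyed.sort(key=lambda pair: pair[0])
    # Step 3: keep just the reordered values (RENAME / KEEP).
    return [value for _, value in keyed]


def attach_shuffled(rows, column, new_column, rng=None):
    """Add new_column to each row by position, like MATCH FILES without BY."""
    shuffled = shuffled_copy([row[column] for row in rows], rng)
    return [dict(row, **{new_column: s}) for row, s in zip(rows, shuffled)]


# Hypothetical stand-in for the original file: an id plus the column v1.
rows = [{"id": i, "v1": v} for i, v in enumerate([2, 7, 5, 9, 4], start=1)]
result = attach_shuffled(rows, "v1", "RandomCopyV1", random.Random(2006))
for row in result:
    print(row)
```

The original column and the case order stay as they were; only the new column carries the permuted values.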
Antoon Smulders
-----Original message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On behalf of Marta
García-Granero
Sent: Tuesday, 25 July 2006 20:19
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: Randomly Sort a variable
Hi Vishal
VL> For a paper we are working on, I need to randomly sort data from one
VL> column/variable into another. So, if the data in the mother
VL> column/variable is 2, 7, 5, 9, 4, the data in the next column needs to be
VL> randomly sorted like say 4, 5, 9, 7, 2. One way to do this is to create a
VL> random id variable and sort based on that but the problem is that "Sort"
VL> will change the sequence of all cases not just the column of interest. I
VL> could copy and paste but this will take an inordinately long time for the
VL> 10,000 random columns we have planned. I would greatly appreciate any
VL> suggestions.
This task is easy with MATRIX.
* Tiny example dataset *.
DATA LIST FREE/var1 (F8).
BEGIN DATA.
2 7 5 9 4
END DATA.
MATRIX.
GET data /VAR=var1.
COMPUTE randoms=UNIFORM(NROW(data),1).
COMPUTE sdata=data.
COMPUTE sdata(GRADE(randoms))=data.
COMPUTE vname={'shuffled'}.
SAVE sdata /OUTFILE='C:\Temp\RandomlyShuffledData.sav'/NAMES=vname.
END MATRIX.
MATCH FILES /FILE=* /FILE='C:\Temp\RandomlyShuffledData.sav'.
LIST.
If you plan to do that for a lot of variables (10,000!), the wisest
thing would be to turn this code into a simple MACRO that loops
through the whole file. If you need help with that, please do not
hesitate to ask me.
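
To show what the MATRIX block does, and what looping it over many columns would amount to, here is a rough sketch in Python rather than SPSS syntax. It assumes GRADE returns 1-based ranks with the smallest value ranked 1; the function names and the `s_` prefix for the shuffled columns are hypothetical.

```python
import random


def grade(keys):
    """1-based rank of each key; the smallest key gets rank 1 (assumed GRADE behaviour)."""
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    ranks = [0] * len(keys)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


def shuffle_column(data, rng):
    """Mirror of: COMPUTE sdata(GRADE(randoms)) = data."""
    randoms = [rng.random() for _ in data]  # UNIFORM(NROW(data), 1)
    sdata = list(data)                      # COMPUTE sdata = data
    for value, rank in zip(data, grade(randoms)):
        sdata[rank - 1] = value             # element i goes to position rank(i)
    return sdata


def shuffle_all(columns, rng):
    """Shuffle every column independently; the originals are left untouched."""
    return {"s_" + name: shuffle_column(values, rng)
            for name, values in columns.items()}


columns = {"var1": [2, 7, 5, 9, 4], "var2": [10, 20, 30, 40, 50]}
shuffled = shuffle_all(columns, random.Random(25))
for name, values in shuffled.items():
    print(name, values)
```

Each column gets its own set of random numbers, so the shuffles are independent of one another, which is what a macro looping over the variables would also produce.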
--
Regards,
Dr. Marta García-Granero, PhD mailto:biostatistics@terra.es
Statistician
-----------------------------------------------------------------------
"It is unwise to use a statistical procedure whose use one does
not understand. SPSS syntax guide cannot supply this knowledge, and it
is certainly no substitute for the basic understanding of statistics
and statistical thinking that is essential for the wise choice of
methods and the correct interpretation of their results".