LISTSERV at the University of Georgia
Date:         Wed, 27 Sep 2006 14:38:26 -0500
Reply-To:     "Peck, Jon" <>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         "Peck, Jon" <>
Subject:      Re: Computing the median value of a group of variables
In-Reply-To:  A<>
Content-Type: text/plain; charset="UTF-8"

With SPSS 15, you have the ability, in essence, to add your own functions to SPSS transformations via the programmability functionality. There are helper functions that are part of the Bonus Pack for early adopters that will become generally available in November.

A problem like the casewise median of several variables can be solved very easily with this mechanism.

First, here is a little Python function that calculates a median. Its argument is a list of values. It first screens out missing values, then sorts the rest and returns the middle element, or the average of the two middle elements if the number of non-missing values is even. If every value is missing, it returns None.

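def median(lis):
    # Drop missing values (None), then sort what remains.
    lisnomv = [item for item in lis if item is not None]
    lisnomv.sort()
    s = len(lisnomv)
    if s == 0:
        return None
    # Odd s: both indexes hit the middle item. Even s: the two middle items.
    return (lisnomv[(s - 1) // 2] + lisnomv[s // 2]) / 2.0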
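As a quick check of the logic described above, here is a minimal standalone sketch that runs outside SPSS. It restates the function in Python 3 syntax (the parameter name `values` and the sample lists are illustrative, not from the original post) and exercises the three cases: an odd count, an even count, and all values missing.

```python
def median(values):
    """Median of the non-missing items in values; None if all are missing."""
    valid = sorted(v for v in values if v is not None)
    n = len(valid)
    if n == 0:
        return None
    # For odd n both indexes coincide; for even n they straddle the middle.
    return (valid[(n - 1) // 2] + valid[n // 2]) / 2.0


print(median([3, None, 1, 2]))   # odd count once the missing value is dropped
print(median([4, 1, 3, 2]))      # even count: average of the two middle values
print(median([None, None]))      # nothing valid, so None
```

Using `//` for the indexes and `/ 2.0` for the average keeps the behaviour the same under both Python 2 and Python 3.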

It would then be used like this, as an example.

begin program.
import spss, trans

<insert the median def here>

t = trans.Tfunction()
t.append(median, "resultvar", "f", [<your list of variables>])
<as many other functions as you like>
t.execute()
end program.

This will loop over the cases and create a new variable that is the median of the variables listed for each case.
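To illustrate what "looping over the cases" amounts to, here is a rough sketch in plain Python that does not use the spss or trans modules at all. It swaps in the standard library's `statistics.median` (Python 3.4+) in place of the hand-written function; `casewise_median` and `cases` are hypothetical names, and the rows are the three sample cases from the quoted message below, with None standing in for system-missing.

```python
import statistics


def casewise_median(row):
    """Median of the non-missing values in one case, or None if there are none."""
    valid = [v for v in row if v is not None]
    return statistics.median(valid) if valid else None


# One inner list per case; None marks a missing value.
cases = [
    [1, 1, 3, 5, 1, 6, 3, None, 9, 5],
    [2, 3, 1, 5, 7, 4, 9, 7, 8, 3],
    [4, 5, 3, 6, None, 8, 1, 4, 3, 9],
]

# Equivalent in spirit to the new variable built case by case.
resultvar = [casewise_median(row) for row in cases]
print(resultvar)  # [3, 4.5, 4]
```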

Regards, Jon Peck SPSS

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Marta García-Granero
Sent: Wednesday, September 27, 2006 1:49 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: [SPSSX-L] Computing the median value of a group of variables

Hi again:

OK, dinner waited for 10 minutes. Try this modified code (I checked it by flipping the dataset and running FREQUENCIES with the median requested).

* Sample dataset *.
PRESERVE.
* Just to avoid the annoying warning concerning those missing data *.
SET ERRORS=NONE.
DATA LIST LIST/v1 TO v10 (10 F8).
BEGIN DATA
1 1 3 5 1 6 3 . 9 5
2 3 1 5 7 4 9 7 8 3
4 5 3 6 . 8 1 4 3 9
END DATA.
RESTORE.
COMPUTE id=$casenum.

* Important step! *.
COUNT nmiss = v1 TO v10 (SYSMIS).

MATRIX.
* Replace by your 100 variables names *.
GET data /VAR=V1 TO v10 /MISSING=ACCEPT /SYSMIS=1E6.
GET nmiss /VAR=nmiss.
COMPUTE n=NROW(data).
COMPUTE k=NCOL(data).
COMPUTE validn=k-nmiss.
COMPUTE ranked=MAKE(n,k,0).
COMPUTE sorted=MAKE(n,k,0).
COMPUTE medians=MAKE(n,1,0).
LOOP i=1 TO n.
- COMPUTE ranked(i,:)=GRADE(data(i,:)).
- COMPUTE sorted(i,ranked(i,:))=data(i,:).
- DO IF TRUNC(validn(i)/2) EQ (validn(i)/2). /* Median for even sample sizes *.
-  COMPUTE medians(i)=(sorted(i,validn(i)/2)+sorted(i,(1+validn(i)/2)))/2.
- ELSE. /* Median for odd samples *.
-  COMPUTE medians(i)=sorted(i,(validn(i)+1)/2).
- END IF.
END LOOP.
COMPUTE id={T(1:n)}.
COMPUTE namevec={'Medians','id'}.
SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
END MATRIX.

MATCH FILES /FILE=*
 /FILE='C:\Temp\Medians.sav'
 /BY id.
EXE.
DELETE VARIABLES id nmiss.
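For readers who find Python easier to follow than MATRIX syntax, the idea behind the loop above can be sketched roughly as follows. This is only an illustration of the technique (replace missing values with a large sentinel so they sort last, then index by the count of valid values), not a translation of SPSS internals. The names `SENTINEL` and `sentinel_median` are hypothetical, and the `validn == 0` guard is an added assumption that the MATRIX code does not include.

```python
SENTINEL = 1e6  # same placeholder value as /SYSMIS=1E6 in the GET statement


def sentinel_median(row):
    """Median of one case, pushing missing values to the end with a sentinel."""
    nmiss = sum(1 for v in row if v is None)            # like COUNT nmiss
    filled = [SENTINEL if v is None else v for v in row]
    validn = len(row) - nmiss                           # like validn = k - nmiss
    if validn == 0:
        return None  # assumption: no valid data means no median
    ordered = sorted(filled)                            # sentinels end up last
    if validn % 2 == 0:
        # 1-based positions validn/2 and validn/2 + 1 in MATRIX terms
        return (ordered[validn // 2 - 1] + ordered[validn // 2]) / 2.0
    # 1-based position (validn + 1)/2 in MATRIX terms
    return ordered[(validn + 1) // 2 - 1]


rows = [
    [1, 1, 3, 5, 1, 6, 3, None, 9, 5],
    [2, 3, 1, 5, 7, 4, 9, 7, 8, 3],
    [4, 5, 3, 6, None, 8, 1, 4, 3, 9],
]
print([sentinel_median(r) for r in rows])  # [3, 4.5, 4]
```

The approach only works when every real value is smaller than the sentinel, so the placeholder would need to be raised for data that can exceed 1E6.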

Wednesday, September 27, 2006, 8:13:05 PM, You wrote:

JT> Hi Marta,

JT> Many thanks for your help - I was relieved to find there was a way
JT> around this that didn't involve flipping the dataset.

JT> I tried your code, replacing the line:

JT> GET data /VAR=V1 TO v10.

JT> with:

JT> GET data /VAR=b1_001 TO b1_100. (as my variables are labelled)

JT> but encountered errors when I ran the code (see below). Do I need
JT> to further modify your code? I'm afraid I'm a novice and generally
JT> use the drop down menus, etc rather than coding by hand so I'm
JT> probably missing something very obvious!

JT> Best wishes,

JT> Jennifer

JT> Errors from SPSS output (first)
JT> Run MATRIX procedure:
>>Error encountered in source line # 43

>>Error # 12555
>>During execution of the GET statement, missing value has been
>>encountered, but no MISSING subcommand is specified.
>>This command not executed.

JT> On 9/27/06, Marta García-Granero <> wrote:
JT> Hi Jennifer

JT> No need to tamper with your dataset using FLIPs, AGGREGATEs and other
JT> nasty transformations.

JT> I did not remember I had answered this same question some time ago.
JT> You can easily adapt this MATRIX code to your needs (even turning it
JT> into a MACRO with the list of variable names and output variable name
JT> as arguments):

JT> * Sample dataset (only 10 variables instead of 100) *.
JT> DATA LIST LIST/v1 TO v10 (10 F8).
JT> BEGIN DATA
JT> 1 1 3 5 1 6 3 7 9 5
JT> 2 3 1 5 7 4 9 7 8 3
JT> 4 5 3 6 7 8 1 4 3 9
JT> END DATA.

JT> * This variable is needed for correct matching later *.
JT> COMPUTE id=$casenum.

JT> MATRIX.
JT> * Replace "V1 TO V10" by your 100 variables names *.
JT> GET data /VAR=V1 TO v10.
JT> COMPUTE n=NROW(data).
JT> COMPUTE k=NCOL(data).
JT> COMPUTE ranked=MAKE(n,k,0).
JT> COMPUTE sorted=MAKE(n,k,0).
JT> COMPUTE medians=MAKE(n,1,0).
JT> LOOP i=1 TO n.
JT> - COMPUTE ranked(i,:)=GRADE(data(i,:)).
JT> - COMPUTE sorted(i,ranked(i,:))=data(i,:).
JT> - COMPUTE medians(i)=(sorted(i,k/2)+sorted(i,(1+k/2)))/2.
JT> END LOOP.
JT> COMPUTE id={T(1:n)}.
JT> COMPUTE namevec={'Medians','id'}.
JT> SAVE {medians,id} /OUTFILE='C:\Temp\Medians.sav' /NAMES=namevec.
JT> PRINT /TITLE='Medians have been computed and saved to C:\Temp\Medians.sav'.
JT> END MATRIX.

JT> MATCH FILES /FILE=*
JT>  /FILE='C:\Temp\Medians.sav'
JT>  /BY id.
JT> EXE. /* This execute is needed for next command *.
JT> DELETE VARIABLES id.

JT>> Could anyone tell me if the compute function can be used to work
JT>> out the median value of a group of variables? I can't seem to find
JT>> the correct command in the 'compute variables' window. I have a
JT>> datafile with just 42 cases but there are 4 sets of 100 variables
JT>> that represent consecutive reaction time responses.

--
Regards,
Dr. Marta García-Granero, PhD
Statistician

--- "It is unwise to use a statistical procedure whose use one does not understand. SPSS syntax guide cannot supply this knowledge, and it is certainly no substitute for the basic understanding of statistics and statistical thinking that is essential for the wise choice of methods and the correct interpretation of their results".

(Adapted from WinPepi manual - I'm sure Joe Abramson will not mind)
