Date: Thu, 11 Feb 2010 17:17:37 +0100
Reply-To: Derek Willemsen <DerekWillemsen@invicta.nl>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Derek Willemsen <DerekWillemsen@invicta.nl>
Subject: Re: Speeding up aggregate
In-Reply-To: A<00b201caab32$58eb98b0$4905fea9@HPA350N>
Hi Mike,
Thanks for your reply. I've tested it, and it still gets slow after
9-10 million records. The first 9 million were processed in a few
seconds, so I was hopeful, but after a while it slowed down to a couple
of hundred records per second (instead of thousands or millions).
Greetings,
Derek
________________________________
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Mike Palij
Sent: Thursday, 11 February 2010 16:53
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: Speeding up aggregate
Does it still take that long if you use a file that only has the four
break variables and perhaps a case ID?
-Mike Palij
New York University
mp26@nyu.edu
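A minimal sketch of the test suggested above, in SPSS syntax. It assumes
the file has an identifier variable, here given the hypothetical name
CASEID (the thread does not name one). It drops every other variable from
the active dataset and then reruns the same AGGREGATE, so a large
difference in run time would suggest that record width, rather than the
number of cases alone, contributes to the slowdown.

```spss
* Sketch only. CASEID is a hypothetical ID variable, not named in the thread.
* This changes only the active dataset, not the saved file on disk.
MATCH FILES FILE=*
  /KEEP=CASEID VAR1 VAR2 VAR3 VAR4.
EXECUTE.

* Same aggregation as in the original message, now on the narrow file.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=VAR1 VAR2 VAR3 VAR4
  /N_BREAK=N.
```

If no ID variable exists, CASEID can simply be left out of the KEEP list.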
----- Original Message -----
From: Derek Willemsen <mailto:DerekWillemsen@invicta.nl>
To: SPSSX-L@LISTSERV.UGA.EDU
Sent: Thursday, February 11, 2010 10:31 AM
Subject: Speeding up aggregate
Dear all,
I have a dataset which contains 16 million records. I need to
count how many records there are for each combination of 4 break
variables, so I use a simple aggregate with ADDVARIABLES mode.
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=VAR1 VAR2 VAR3 VAR4
/N_BREAK=N.
The first couple of million records go fast, but after 11 million
records the aggregation gets really slow. It takes ages to finish
the last 5 million records. Normally it takes about 2.5 hours to finish
the operation.
Is there a way to speed this process up?
(I have 2 GB of RAM.)
Thanks in advance!
Derek Willemsen