Date: Wed, 1 Feb 2006 15:31:35 -0800
Reply-To: Yifan Lu <ylu@ibiweb.org>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Yifan Lu <ylu@ibiweb.org>
Subject: count problem
In-Reply-To: <7.0.0.16.2.20060201160307.024d4810@mindspring.com>
Content-Type: text/plain; charset="us-ascii"
Hi,
I have a very large time-series data file with 20 million records.
There are three variables that I am interested in: PERSON, COMPANY and
ACTIVITY. Because it is time-series data, one PERSON may have several
records (rows), one for each time point. All three variables are string
variables. The data set structure looks like this:
COMPANY  PERSON  ACTIVITY  TIME
1        01      n1        2000
1        01      n1        2001
1        02      n1        2001
2        03      m1        2000
2        03      m2        2001
2        03      m2        2002
2        04      m1        2000
3        05      n1        2000
....................................
I would like to know how many PERSONs there are in each COMPANY and how
many ACTIVITYs each PERSON has. Is there an easy way to do this?
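
To make the desired result concrete, here is a minimal sketch in Python
(not SPSS syntax) that computes both counts on the sample rows above. It
assumes that "how many" means the number of distinct values, and that a
PERSON code is only meaningful within its COMPANY; the function name
distinct_counts is made up for this illustration.

```python
from collections import defaultdict

# Sample rows copied from the listing above: (COMPANY, PERSON, ACTIVITY, TIME).
# All values are kept as strings, as in the original data.
rows = [
    ("1", "01", "n1", "2000"),
    ("1", "01", "n1", "2001"),
    ("1", "02", "n1", "2001"),
    ("2", "03", "m1", "2000"),
    ("2", "03", "m2", "2001"),
    ("2", "03", "m2", "2002"),
    ("2", "04", "m1", "2000"),
    ("3", "05", "n1", "2000"),
]


def distinct_counts(records):
    """Return (persons per company, activities per person) as dicts.

    Assumption: repeated rows for the same PERSON or ACTIVITY count once.
    """
    persons_by_company = defaultdict(set)
    activities_by_person = defaultdict(set)
    for company, person, activity, _time in records:
        persons_by_company[company].add(person)
        # Key on (COMPANY, PERSON) in case PERSON codes repeat across companies.
        activities_by_person[(company, person)].add(activity)
    persons_per_company = {c: len(p) for c, p in persons_by_company.items()}
    activities_per_person = {k: len(a) for k, a in activities_by_person.items()}
    return persons_per_company, activities_per_person


persons_per_company, activities_per_person = distinct_counts(rows)
for company, n in sorted(persons_per_company.items()):
    print("COMPANY", company, "has", n, "PERSON(s)")
for (company, person), n in sorted(activities_per_person.items()):
    print("PERSON", person, "in COMPANY", company, "has", n, "ACTIVITY(s)")
```

On the sample rows, PERSON 03 is the only one with more than one
ACTIVITY (m1 and m2), and COMPANY 3 is the only one with a single PERSON.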
Any suggestions are appreciated!
Yifan