Date: Thu, 25 Sep 2003 09:34:56 -0500
Reply-To: "Marks, Jim" <Jim.Marks@lodgenet.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: "Marks, Jim" <Jim.Marks@lodgenet.com>
Subject: Re: should be real simple... counting unique cases
James:
You did nothing wrong.
Yes, the syntax from DATA LIST to END DATA creates a new working file. This is done to show the procedure in operation, and as a courtesy to list members who don't have the real data file. (This is usually required to test procedures, anyway.)
Running this portion of the syntax has the same effect as typing in the data, as long as you defined vara as a string variable. When you ran the syntax, you created a new working data file identical to the one you typed in. (Be aware that running this syntax replaces the working file without saving it and without prompting you to save!)
The similarity in the values reflects the data you provided. Your string data runs from "1" to "9", so the codes AUTORECODE assigns (1 to 9) look the same as the strings. Notice that vara is not identical to varnbr-- vara is a string, and can't be used in most statistical procedures. (varcnt is identical to varnbr-- the rank of the numbers 1 - 9 is 1 - 9. The decimal points are formatting, not stored values.)
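If you want to confirm that the decimals are only display formatting, you could change the print format of varcnt. This is a minimal sketch, assuming the working file still holds varcnt from your run; F2.0 is just one reasonable choice of format:
FORMATS varcnt (F2.0).
LIST VARIABLES=vara varnbr varcnt.
The stored values do not change; only the way they are displayed does.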
If you run the following-- copy it into a syntax window, highlight everything, and press Ctrl+R-- you will see the difference:
DATA LIST FREE /vara (A1).
BEGIN DATA
1 1 1 1 2 2 3 3 4 4 5 5 5 6 6 6
7 7 7 7 7 8 9 9 a a a 0 0 0
END DATA.
AUTORECODE vara /INTO varnbr.
DESCRIPTIVES varnbr /STAT=max.
Now there are 11 unique values, and the (string) values of vara do not match the (numeric) values of varnbr.
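One way to inspect the mapping, offered as a sketch: AUTORECODE keeps the original string values as value labels on the new variable, so a frequency table of varnbr should show each numeric code next to the string it came from:
FREQUENCIES VARIABLES=varnbr.
AUTORECODE assigns codes in ascending sort order, so I would expect "0" to get code 1, "1" to get code 2, and "a" to get code 11, but check the table on your own system.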
I included the RANK procedure as information that is separate from your question but possibly useful. Since your sample data was labeled a string, but consisted of numbers, I wanted to give you an alternative for that situation. (Actually, AUTORECODE will work on numeric data as well as string data.)
Copy the following 3 lines into a syntax window, and run them on the same dataset:
RANK vara /RANK INTO varcnt1.
RANK varnbr /RANK INTO varcnt2.
RANK varnbr /RANK INTO varcnt3 /TIES=CONDENSE.
Line 1 shows that RANK will not work on string variables.
Line 2 shows the uncondensed ranks, which don't match the values of varnbr.
Line 3 shows the condensed ranks (ties get the same value), which match the values of varnbr, but not those of vara.
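To compare the results side by side, something like the following should work. It leaves out varcnt1 on the assumption that the first RANK command failed and never created that variable:
LIST VARIABLES=vara varnbr varcnt2 varcnt3.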
The on-line syntax guide is an excellent resource for learning syntax. It has clear explanations and examples for most procedures. If you haven't done so, you should install it with SPSS.
HTH
Jim Marks
Senior Market Analyst
LodgeNet Entertainment Corporation
-----Original Message-----
From: Moffitt, James [mailto:james.moffitt@thomson.com]
Sent: Wednesday, September 24, 2003 9:27 AM
To: Marks, Jim; SPSSX-L@LISTSERV.UGA.EDU
Subject: RE: should be real simple... counting unique cases
Jim:
I know I'll appear to be a real imbecile, but I'm a syntax beginner and I
simply have to swallow my pride and ask the simplest of questions: how,
specifically, does one use the code you posted to test your routine? Does
the section that begins DATA LIST FREE /vara (A1) and ends with END DATA
create a file with a variable named vara containing the 24 records you've
listed or must we create such a file before we attempt to paste your code
into a syntax window and run it? I opened a new SPSS file, created a
variable named vara, entered the appropriate values in the first 24 rows,
opened a new syntax window, and pasted in your code so it appeared like
this:
DATA LIST FREE /vara (A1).
BEGIN DATA
1 1 1 1 2 2 3 3 4 4 5 5 5 6 6 6
7 7 7 7 7 8 9 9
END DATA.
AUTORECODE vara /INTO varnbr.
RANK varnbr /RANK INTO varcnt /TIES=CONDENSE.
DESCRIPTIVES varnbr /STAT=max.
I then placed my cursor in the syntax code, pressed ctrl+A and pressed
ctrl+R.
I got 3 columns labeled vara, varnbr, and varcnt. All three contained the
same value for each record with the exception that the value in varcnt was
displayed as the integer followed by a decimal and 3 zeros. What did I do
wrong? Thanks in advance.
-----Original Message-----
From: Marks, Jim [mailto:Jim.Marks@lodgenet.com]
Sent: Wednesday, September 24, 2003 8:16 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: should be real simple... counting unique cases
Nico:
There are many ways to skin this cat. I like AUTORECODE:
DATA LIST FREE /vara (A1).
BEGIN DATA
1 1 1 1 2 2 3 3 4 4 5 5 5 6 6 6
7 7 7 7 7 8 9 9
END DATA.
AUTORECODE vara /INTO varnbr.
DESCRIPTIVES varnbr /STAT=max.
The maximum value of varnbr is the number of unique values.
For numeric variables, you can use
RANK varnbr /RANK INTO varcnt /TIES=CONDENSE.
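Another option, offered as a sketch rather than a tested recipe: AGGREGATE with vara as the break variable leaves one case per unique value, so the number of cases in the result is the number of unique values. Note that OUTFILE=* replaces the working file, so save your data first. The name nvals is one I made up; it holds the number of cases for each value.
AGGREGATE OUTFILE=* /BREAK=vara /nvals=N.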
Cheers
Jim Marks
Senior Market Analyst
LodgeNet Entertainment Corporation
-----Original Message-----
From: Nico Peruzzi, Ph.D. [mailto:nperuzzi@yahoo.com]
Sent: Tuesday, September 23, 2003 5:28 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: should be real simple... counting unique cases
SPSSers,
I know I'm missing something real simple here, but how does
one count the number of unique cases in a variable as
follows:
String variable A:
1
1
1
2
2
2
3
3
4
4
5
6
7
7
7
7
8
8
9
9
I want to know how many unique numbers are in this
variable.
Thanks in advance, Nico