Date: Wed, 11 Apr 2007 14:37:23 -0500
Reply-To: "Beadle, ViAnn" <viann@spss.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: "Beadle, ViAnn" <viann@spss.com>
Subject: Re: Match merging data files by a variable with 10 digits
In-Reply-To: A<7.0.1.0.2.20070411131816.038a64a0@mindspring.com>
Content-Type: text/plain; charset="us-ascii"
And fourth, if you are trying to do a table lookup rather than a file-to-file match, the file can have duplicate keys but the table cannot.
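As a rough illustration of that rule, here is a minimal Python sketch (not SPSS syntax). The names `build_table` and `lookup` and the sample rows are hypothetical, made up for this example: the table must have one row per key, while the file may repeat a key freely.

```python
def build_table(rows):
    """Build a lookup table; a repeated key is an error."""
    table = {}
    for key, value in rows:
        if key in table:
            raise ValueError(f"duplicate key in table: {key}")
        table[key] = value
    return table


def lookup(file_rows, table):
    """Attach the table value to every file row; file keys may repeat."""
    return [(key, data, table.get(key)) for key, data in file_rows]


# Hypothetical data: serial number -> region in the table,
# and several file rows that share a serial number.
table = build_table([(2333008010, "North"), (2333008011, "South")])
file_rows = [(2333008010, "a"), (2333008010, "b"), (2333008011, "c")]
matched = lookup(file_rows, table)
print(matched)
```

Giving `build_table` two rows with the same key raises an error, which is the analogue of the duplicate-key complaint for the table side.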
-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Richard Ristow
Sent: Wednesday, April 11, 2007 1:31 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: Match merging data files by a variable with 10 digits
Confirming, with comments, advice you've had from others -
At 04:47 AM 4/11/2007, Nai Li wrote:
>I want to match two dataset by a numerical variable named SerialNo.
>This variable has more than 10 digits (e.g. 2333008010, 2333008011
>etc). Therefore, I manually changed data format to F10.2. However,
>when I run the following Command, I get the error message as below. It
>seems that all serialNo have been truncate as 2.33E+08, so even the
>serialno has different last digit, it is still treated as the
>duplicate case.
First, as Art Kendall said, F10.2 won't display a 10-digit integer; it
will display at most a 7-digit integer. The '10' includes the full
width of the field; 3 spaces go for the decimal point and two
post-decimal digits. Use F10, as Art said - or, I might use COMMA13
myself. (Actually, I'd use F11 or COMMA14; I recommend allowing one
more place than you think you need.) Or, if you could have post-decimal
digits, F13.2 (or F14.2).
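The width arithmetic above can be sketched in a few lines of Python (an illustration only, not SPSS syntax). The helper names `max_integer_digits` and `comma_width` are hypothetical, and the sketch assumes no position is needed for a minus sign.

```python
def max_integer_digits(width, decimals):
    """Integer digits that fit in a field of the given total width."""
    # The decimal point uses one position only when decimals are shown.
    reserved = decimals + 1 if decimals > 0 else 0
    return max(width - reserved, 0)


def comma_width(integer_digits, decimals=0):
    """Total width needed when thousands separators are displayed."""
    separators = (integer_digits - 1) // 3
    reserved = decimals + 1 if decimals > 0 else 0
    return integer_digits + separators + reserved


print(max_integer_digits(10, 2))  # width 10, 2 decimals
print(max_integer_digits(10, 0))  # width 10, no decimals
print(comma_width(10))            # 10 digits plus separators
```

With these assumptions the three calls give 7, 10 and 13, matching the F10.2, F10 and COMMA13 figures above.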
Second, as John S. Lemon and Gene Maguin have said, you're seeing a
real problem with duplicate keys. SPSS is *not* "truncating as
2.33E+08"; SPSS always matches using the full number, as it's stored
internally. To repeat an often-made but important point: what's printed
is a *display form* of the number. SPSS uses a floating-point binary
representation with 53 bits of precision, and compares the numbers in
that form.
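The difference between the stored value and its display form can be seen in any language that uses 64-bit floating point. Here is a minimal Python sketch (not SPSS), using the two serial numbers from the question. It assumes the usual IEEE 754 double, which is what Python's `float` is on common platforms.

```python
import sys

a = 2333008010.0
b = 2333008011.0

# A short scientific display form hides the trailing digits...
short_a = f"{a:.2E}"
short_b = f"{b:.2E}"
print(short_a, short_b)

# ...but the stored values are still different numbers.
print(a == b)

# Precision of the stored form, in bits.
print(sys.float_info.mant_dig)

# Every integer up to 2**53 is stored exactly, so a 10-digit
# serial number loses nothing.
print(float(2**53) == 2**53)
```

The two display strings come out identical while the comparison of the stored values is still false, which is the point being made above.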
Third, maybe the duplicate problem isn't real, but it has nothing to do
with the display format. If you've made a mistake *reading* the serial
numbers, then SPSS's internal form may be wrong, or incomplete. You'll
need to set a format that displays the full number, and check whether
the values were read correctly.
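To make the "mistake reading" case concrete, here is a hedged Python sketch (not SPSS). The fixed-width sample lines and the helper names `read_keys` and `duplicates` are invented for illustration. Reading the key with a column one character too narrow stores genuinely identical values, and no display format can undo that.

```python
from collections import Counter

# Hypothetical fixed-width data: a 10-digit serial number, then a code.
lines = ["2333008010 A", "2333008011 B"]


def read_keys(lines, width):
    """Read the first `width` characters of each line as a number."""
    return [float(line[:width]) for line in lines]


def duplicates(keys):
    """Return the key values that occur more than once."""
    return [key for key, count in Counter(keys).items() if count > 1]


full = read_keys(lines, 10)   # correct width: keys stay distinct
short = read_keys(lines, 9)   # too narrow: last digit is never read

print(duplicates(full))
print(duplicates(short))
```

In this made-up case the full-width read reports no duplicates, while the narrow read reports one duplicated key.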
-Good luck,
Richard