Date: Thu, 24 Feb 2005 00:29:14 -0500
Reply-To: Richard Ristow <wrristow@mindspring.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Richard Ristow <wrristow@mindspring.com>
Subject: Re: Data Processing
In-Reply-To: <20050223135212.71800.qmail@web86903.mail.ukl.yahoo.com>
At 08:52 AM 2/23/2005, Josh Johnson wrote:
>V1 is a string variable representing telephone numbers. What I need to
>do is:
>a. remove any non-numeric characters.
>then
>b. add zero to any case which its first character is not zero.
>then
>c. remove any cases with a length different than 10 or 11 characters.
>and last
>d. keep only cases which begin with '03' or '04'.
See the following, which is tested. The listings are SPSS draft output.
- It uses your revised test data.
- V1 is your input; DESIRED is the output from your example; EDITED is
the computed output.
- Variables EDIT_A, EDIT_B, EDIT_C and EDIT_D are for debugging; they
have no effect on the computation.
- Cases that fail test c. or d. are not deleted; EDITED is set to '.',
as in your DESIRED column.
.............
List
Notes
|-------------------------|--------------------|
|Output Created |24 Feb 05 00:23:57 |
|-------------------------|--------------------|
REC  V1              DESIRED
  1  35#5687/589     0355687589
  2  4245896354      04245896354
  3  12458/7#4369    .
  4  3145215863      03145215863
  5  04525687458     04525687458
  6  5245896354      .
  7  32/4#5875436    03245875436
  8  03145215863     03145215863
  9  57856987542536  .
 10  14452987642536  .
Number of cases read: 10 Number of cases listed: 10
STRING EDITED EDIT_A EDIT_B EDIT_C EDIT_D (A15).
COMPUTE EDITED=V1.
* First, remove leading blanks from the input. (You don't .
* say, but I think this is implied, or makes sense.) .
COMPUTE EDITED = LTRIM(EDITED).
* a. remove any non-numeric characters. .
* This is the pain in the neck, in SPSS. There's no .
* function to search for a character NOT on a list. .
COMPUTE #END_PT = LENGTH(RTRIM(EDITED)).
COMPUTE #INDEX = 1.
LOOP #I = 1 TO 15.
- DO IF INDEX ('0123456789',SUBSTR(EDITED,#INDEX,1)) = 0.
. COMPUTE SUBSTR(EDITED,#INDEX) = SUBSTR(EDITED,#INDEX+1).
. COMPUTE #END_PT = #END_PT - 1.
- ELSE.
. COMPUTE #INDEX = #INDEX + 1.
- END IF.
END LOOP IF #INDEX GT #END_PT.
COMPUTE EDIT_A = EDITED.
* b. add zero to any case which its first character is not .
* zero. .
IF (SUBSTR(EDITED,1,1) NE '0') EDITED = CONCAT('0',EDITED).
COMPUTE EDIT_B = EDITED.
* c. remove any cases with a length different than 10 or 11.
* characters. .
IF (NOT ANY(LENGTH(RTRIM(EDITED)),10,11)) EDITED = '.'.
COMPUTE EDIT_C = EDITED.
* d. keep only cases which begin with '03' or '04'. .
IF (NOT ANY(SUBSTR(EDITED,1,2),'03','04')) EDITED = '.'.
COMPUTE EDIT_D = EDITED.
LIST REC V1 DESIRED EDITED.
List
Notes
|-------------------------|--------------------|
|Output Created |24 Feb 05 00:23:59 |
|-------------------------|--------------------|
REC  V1              DESIRED      EDITED
  1  35#5687/589     0355687589   0355687589
  2  4245896354      04245896354  04245896354
  3  12458/7#4369    .            .
  4  3145215863      03145215863  03145215863
  5  04525687458     04525687458  04525687458
  6  5245896354      .            .
  7  32/4#5875436    03245875436  03245875436
  8  03145215863     03145215863  03145215863
  9  57856987542536  .            .
 10  14452987642536  .            .
Number of cases read: 10 Number of cases listed: 10
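
As a cross-check outside SPSS, rules a. through d. can be sketched in Python and run against the same ten test values. This is only an illustration, not part of the tested syntax above. The function names (strip_non_digits, strip_non_digits_shift, edit_phone) are made up for this sketch, and the width of 15 is an assumption taken from the A15 declaration. The second function mimics the shift-left LOOP on a blank-padded field, to show that it gives the same result as a plain "delete everything that is not a digit".

```python
import re

DIGITS = "0123456789"


def strip_non_digits(value: str) -> str:
    """Step a, done directly: drop every character that is not 0-9."""
    return re.sub(r"[^0-9]", "", value)


def strip_non_digits_shift(value: str, width: int = 15) -> str:
    """Step a, mimicking the posted LOOP on a fixed-width, blank-padded field."""
    buf = list(value.lstrip().ljust(width)[:width])
    end_pt = len("".join(buf).rstrip())
    index = 0  # 0-based, so SPSS's "#INDEX GT #END_PT" becomes index >= end_pt
    for _ in range(width):
        if buf[index] not in DIGITS:
            # Shift the tail one place left and pad the end with a blank.
            buf[index:] = buf[index + 1:] + [" "]
            end_pt -= 1
        else:
            index += 1
        if index >= end_pt:  # tested after the body, like END LOOP IF
            break
    return "".join(buf).rstrip()


def edit_phone(v1: str) -> str:
    """Apply rules a-d to one value; return '.' for a rejected case."""
    edited = strip_non_digits(v1)             # a. digits only
    if not edited.startswith("0"):            # b. prefix a zero if missing
        edited = "0" + edited
    if len(edited) not in (10, 11):           # c. length must be 10 or 11
        return "."
    if edited[:2] not in ("03", "04"):        # d. must begin with 03 or 04
        return "."
    return edited


# (V1, DESIRED) pairs copied from the listing above.
CASES = [
    ("35#5687/589", "0355687589"),
    ("4245896354", "04245896354"),
    ("12458/7#4369", "."),
    ("3145215863", "03145215863"),
    ("04525687458", "04525687458"),
    ("5245896354", "."),
    ("32/4#5875436", "03245875436"),
    ("03145215863", "03145215863"),
    ("57856987542536", "."),
    ("14452987642536", "."),
]

if __name__ == "__main__":
    for v1, desired in CASES:
        print(v1, desired, edit_phone(v1))
```

The pattern [^0-9] is used rather than \D so that only the ten ASCII digits are kept, which matches the '0123456789' search string in the SPSS test.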