On Fri, 10 Sep 2010 08:56:33 -0400, Marcos Sanches <msanches35@gmail.com> wrote:
>Hello all,
>
>I am using the code below to read a pipe-delimited file, and here is
>an example of a data line that is not being read correctly.
>
>"XXX"|"MARTIN DARVIN "DAN" DXXXX-2248"|"0"|"1"|"Client 2"|"GG"...
>
>SPSS won't read the whole piece "MARLY DARVIN "DAN" DXXXX-9778" into a single
>variable, as it should. It instead treats the double quote in the
>middle of the string as a delimiter, splits the string there, and messes up
>everything further along the line.
>
>My questions -
>
>Is there a way to fix this so that SPSS will only cut the string at the
>pipes?
>Is this a problem with SPSS, or is it a problem with the data file, which
>should not have double quotes other than the qualifiers?
>
>Note - I considered replacing the double quotes with blanks, so that the file
>would no longer have qualifiers, which I think would solve the problem, but
>the file is huge and this would be a time-consuming task.
>
>The reading syntax follows:
>
>GET DATA /TYPE = TXT
> /FILE = 'P:\MRCE\01 Teams\Amy_Charles\GXI\Data\Preliminary\client2\wn_extract_enhanced.csv'
> /DELCASE = LINE
> /DELIMITERS = "|"
> /QUALIFIER = '"'
> /ARRANGEMENT = DELIMITED
> /FIRSTCASE = 2
> /IMPORTCASE = ALL
> /VARIABLES =
> X1 A37
>X2 F4.2
>...
>...
>Thanks a lot!
>
>Marcos
>
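Here's one way to skin the cat: read each record as one long string and
split it on the pipes yourself, so the quotes never act as qualifiers.
For your real file, nuke the BEGIN DATA ... END DATA block and point
DATA LIST at the external file instead.
Change the vector length from 6 to your variable count and the A40 to your
longest embedded string; you can adjust the individual string variable
lengths afterwards. X (columns 1-255 here) has to be wide enough to hold
your longest record.
If I remember right, a LOOP without an index stops after MXLOOPS passes
(default 40), so with more than about 40 fields you would also need
SET MXLOOPS.
SUBSTR may need to be changed to CHAR.SUBSTR, or maybe not.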
* Another General Parser *.
* NON PiThong version ;-).
DATA LIST / X 1-255 (A).
BEGIN DATA
"XXX"|"MARTIN DARVIN "DAN" DXXXX-2248"|"0"|"1"|"Client 2"|"GG"
END DATA.
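* One target variable per field: PARSED1 to PARSED6.
VECTOR PARSED(6, A40).
COMPUTE #0=0.
LOOP.
* #1 is the position of the next pipe, or 0 when none is left.
COMPUTE #1=INDEX(X,'|').
COMPUTE #0=#0+1.
* Store the text before the pipe, then chop it off the front of X.
IF #1>0 PARSED(#0)=SUBSTR(X,1,#1-1).
COMPUTE X=SUBSTR(X,#1+1).
END LOOP IF #1=0.
* Whatever remains in X is the last field.
COMPUTE PARSED(#0)=X.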
MATCH FILES / FILE * / DROP X.
LIST.
PARSED1: "XXX"
PARSED2: "MARTIN DARVIN "DAN" DXXXX-2248"
PARSED3: "0"
PARSED4: "1"
PARSED5: "Client 2"
PARSED6: "GG"
Number of cases read: 1 Number of cases listed: 1
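
For comparison, here is a minimal Python sketch of the same idea: split each
record on the pipes only, and never treat a quote as a delimiter. It is not
part of the SPSS job above. The function name split_on_pipes is made up for
illustration, and the sketch assumes that no field contains a pipe of its
own. Unlike the SPSS version, it also trims one outer pair of quotes from
each field and leaves the embedded quotes alone.

```python
def split_on_pipes(line, delimiter="|", qualifier='"'):
    """Split a record on the delimiter only, then trim one outer qualifier pair.

    Hypothetical helper for illustration; assumes the delimiter never
    appears inside a field value.
    """
    fields = []
    for raw in line.rstrip("\r\n").split(delimiter):
        # Remove only the outermost pair of qualifiers; embedded quotes stay.
        if len(raw) >= 2 and raw.startswith(qualifier) and raw.endswith(qualifier):
            raw = raw[1:-1]
        fields.append(raw)
    return fields


# The sample record from the original question.
record = '"XXX"|"MARTIN DARVIN "DAN" DXXXX-2248"|"0"|"1"|"Client 2"|"GG"'
fields = split_on_pipes(record)
for number, value in enumerate(fields, start=1):
    print(f"PARSED{number}: {value}")
```

The same function could be applied line by line to the real file to write a
cleaned copy before importing it, if a pre-processing pass is acceptable.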