LISTSERV at the University of Georgia
Date:         Tue, 14 Sep 2010 10:47:35 -0400
Reply-To:     Art@DrKendall.org
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Art Kendall <Art@DrKendall.org>
Organization: Social Research Consultants
Subject:      Re: Problem Reading File
In-Reply-To:  <201009121316.o8CAnI3w030609@willow.cc.uga.edu>
Content-type: text/html; charset=ISO-8859-1

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type"> </head> <body bgcolor="#ffffff" text="#000000"> If you want to change all double quotes to blanks,<br> think about something like this:<br> Read each line as one long string.<br> Use the REPLACE function.<br> Write the new string to a .txt file.<br> Read the new .txt file as pipe delimited.<br> <blockquote type="cite"><!--File syn_transformation_expressions_string_functions.htm--><!--Generated with Help Tools v. 4.5.2x --> <p class="body"><span class="glossary"><span class="glossary-term">REPLACE. </span><span class="glossary-definition">REPLACE(a1, a2, a3[, a4]). String. In a1, instances of a2 are replaced with a3. The optional argument a4 specifies the number of occurrences to replace; if a4 is omitted, all occurrences are replaced. Arguments a1, a2, and a3 must resolve to string values (literal strings enclosed in quotes or string variables), and the optional argument a4 must resolve to a non-negative integer.
For example, REPLACE("abcabc", "a", "x") returns a value of "xbcxbc" and REPLACE("abcabc", "a", "x", 1) returns a value of "xbcabc".</span></span> </p> </blockquote> <br> <br> Art Kendall<br> Social Research Consultants<br> <br> On 9/12/2010 9:16 AM, David Marso wrote: <blockquote cite="mid:201009121316.o8CAnI3w030609@willow.cc.uga.edu" type="cite"> <pre wrap="">On Fri, 10 Sep 2010 08:56:33 -0400, Marcos Sanches <a class="moz-txt-link-rfc2396E" href="mailto:msanches35@gmail.com">&lt;msanches35@gmail.com&gt;</a> wrote:

</pre> <blockquote type="cite"> <pre wrap="">Hello all,

I am using the code below to read a pipe-delimited file. Here is an example of a data line that is not being read correctly.

"XXX"|"MARTIN DARVIN "DAN" DXXXX-2248"|"0"|"1"|"Client 2"|"GG"...

SPSS won't read the whole piece "MARLY DARVIN "DAN" DXXXX-9778" into a single variable, as it should. It instead treats the double quote in the middle of the string as a delimiter, splits the string there, and messes up everything further along the line.

My questions -

Is there a way to fix this so that SPSS will only cut the string at the pipes? Is this a problem with SPSS, or is it a problem with the data file, which should not have double quotes other than the qualifiers?

Note - I considered replacing the double quotes with blanks so that the file would no longer have qualifiers, which I think would solve the problem, but the file is huge and this would be a time-consuming task.

The reading syntax follows:

GET DATA
  /TYPE = TXT
  /FILE = 'P:\MRCE\01 Teams\Amy_Charles\GXI\Data\Preliminary\client2\wn_extract_enhanced.csv'
  /DELCASE = LINE
  /DELIMITERS = "|"
  /QUALIFIER = '"'
  /ARRANGEMENT = DELIMITED
  /FIRSTCASE = 2
  /IMPORTCASE = ALL
  /VARIABLES =
  X1 A37
  X2 F4.2
  ...
  ...

Thanks a lot!

Marcos

</pre> </blockquote> <pre wrap="">Here's one way to skin the cat. You'll want to nuke the BEGIN DATA ... END DATA block and read the external file instead. You'll have to modify the string variable lengths later. You will also want to change the vector length from 6 to your variable count and the A40 to the length of your longest embedded string. SUBSTR may need to be changed to CHAR.SUBSTR -or maybe not-???

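* Another General Parser *.
* NON PiThong version ;-).
DATA LIST / X 1-255 (A).
BEGIN DATA
"XXX"|"MARTIN DARVIN "DAN" DXXXX-2248"|"0"|"1"|"Client 2"|"GG"
END DATA.

VECTOR PARSED(6, A40).
* #0 counts the fields found so far; #1 holds the position of the next pipe.
COMPUTE #0=0.
LOOP.
COMPUTE #1=INDEX(X,'|').
COMPUTE #0=#0+1.
* Copy the text before the pipe, then drop it (and the pipe) from X.
IF #1&gt;0 PARSED(#0)=SUBSTR(X,1,#1-1).
COMPUTE X=SUBSTR(X,#1+1).
END LOOP IF #1=0.
* Whatever is left after the last pipe is the final field.
COMPUTE PARSED(#0)=X.
MATCH FILES / FILE * / DROP X.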

LIST.

PARSED1: "XXX"
PARSED2: "MARTIN DARVIN "DAN" DXXXX-2248"
PARSED3: "0"
PARSED4: "1"
PARSED5: "Client 2"
PARSED6: "GG"

Number of cases read: 1 Number of cases listed: 1

===================== To manage your subscription to SPSSX-L, send a message to <a class="moz-txt-link-abbreviated" href="mailto:LISTSERV@LISTSERV.UGA.EDU">LISTSERV@LISTSERV.UGA.EDU</a> (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

</pre> </blockquote> </body> </html>
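
The four steps Art Kendall outlines in his reply (read each line as one long string, apply REPLACE, write the result to a .txt file, reread it as pipe delimited) might look roughly like the sketch below. It is not taken from the thread and has not been run against the original data. The output file name cleaned.txt, the variable names RAWLINE and V1 to V6, and the A40 widths are assumptions chosen only for illustration, and the inline BEGIN DATA block stands in for the external file, whose full variable list is not given.

```spss
* Sketch only: cleaned.txt, RAWLINE and V1 to V6 are hypothetical names.
* Step 1: read each line as one long string.
DATA LIST / RAWLINE 1-255 (A).
BEGIN DATA
"XXX"|"MARTIN DARVIN "DAN" DXXXX-2248"|"0"|"1"|"Client 2"|"GG"
END DATA.

* Step 2: replace every double quote with a blank.
COMPUTE RAWLINE = REPLACE(RAWLINE, '"', ' ').

* Step 3: write the cleaned string to a new text file.
WRITE OUTFILE = 'cleaned.txt' / RAWLINE.
EXECUTE.

* Step 4: read the new file as pipe delimited, with no QUALIFIER subcommand.
GET DATA
  /TYPE = TXT
  /FILE = 'cleaned.txt'
  /DELCASE = LINE
  /DELIMITERS = "|"
  /ARRANGEMENT = DELIMITED
  /FIRSTCASE = 1
  /IMPORTCASE = ALL
  /VARIABLES = V1 A40 V2 A40 V3 A40 V4 A40 V5 A40 V6 A40.
LIST.
```

For a real file, the DATA LIST command would name the external file instead of using inline data, the 1-255 column range would need to cover the longest line, and FIRSTCASE would go back to 2 if the file has a header row. Because each quote becomes a blank rather than disappearing, string fields may come back with a leading blank that needs trimming afterwards.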
