LISTSERV at the University of Georgia
Date:         Tue, 13 May 2003 16:39:06 -0400
Reply-To:     Richard Ristow <wrristow@mindspring.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Richard Ristow <wrristow@mindspring.com>
Subject:      Re: Altering data files while importing them
Comments: To: lars.h.johansson@vattenfall.com
In-Reply-To:  <665B6A968E01DE4F9118A157395AA5663C5234@mxmb01.eur.corp.vattenfall.com>
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 09:39 AM 5/13/2003 +0200, L Johansson wrote:

>I wonder if it is possible to write a script that will take care of
>the entire process of opening a new data file and formatting it. I.e.,
>what I'd like to achieve is for SPSS to 'automatically' give variables
>with certain names a pre-defined type and length. For instance, the
>variable 'Sex' should always be imported as string accepting six
>characters.
>
>However, the data files in tab separated ASCII format that I have to
>import are far from identical. The variables they contain vary from
>case to case [i.e., it seems, from file to file, not "case to case" in
>the SPSS sense]. [I want a script to] define and alter variables that
>exist in each individual file and ignore instructions that concern
>variables that are non-existent.
>
>I hope what I've written makes sense.

I've done things like this myself, but for import from ACCESS rather than from tab-delimited files.

What you're trying to alter is the *DICTIONARY* information in the data -- the properties of the variables, like length and format -- rather than the DATA. (For the latter, you'd add transformation commands after the import, as I'm sure you know.)

I'm not a "scripter" myself, so I'm not sure what tools scripting offers; but there are other ways. In general, you

a.) Read the "dictionary" information you have into an SPSS file, or some other database system. In this file, each variable in your data becomes an SPSS 'case', and the attributes -- name, variable label, format, etc. -- become SPSS variables.

In your case, it sounds like the variable names are in the files, maybe as the first rows, and can be read in.

b.) Compute any attributes that you can and that the file doesn't give you. In your case, for example, if the variable is named "SEX", you can assign it format A6.

c.) From this file, write SPSS code to declare the variables. In general, each variable should have
. A NUMERIC or STRING statement, giving its type and format; this also defines the length of string variables
. A VARIABLE LABEL statement, if meaningful
. Possibly, VARIABLE WIDTH and ALIGNMENT statements.

For example, my own code writes a lot of variable definitions like this:

VARIABLE LABELS COUNTY "FN - COUNTY CODE".
- FORMATS COUNTY (A2).
- VARIABLE WIDTH COUNTY (5).
- VARIABLE ALIGNMENT COUNTY (RIGHT).

(Note that there's NOT a "STRING" statement. In my case, the data type is set by the ODBC code that reads the variables from ACCESS.)
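
Steps a.) through c.) can be sketched roughly as follows. This is only an illustration of the logic, written in Python rather than in SPSS syntax, and it is not the code described above. The rule table, the default format F8.2, and all function names are hypothetical; only the SEX to A6 rule comes from the original question. Rules for variables that are not in the file are simply never used.

```python
import csv
import io

# Hypothetical rule table: variable name (upper case) -> SPSS format.
# Only the SEX -> A6 rule comes from the thread; the others are invented.
FORMAT_RULES = {"SEX": "A6", "COUNTY": "A2", "REGION": "A10"}
DEFAULT_FORMAT = "F8.2"  # assumed fallback for variables with no rule


def read_variable_names(text):
    """Return the variable names from the first row of tab-separated text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [name.strip() for name in next(reader)]


def build_dictionary(names):
    """Steps a and b: one record per variable, with a computed format."""
    return [
        {"name": n, "format": FORMAT_RULES.get(n.upper(), DEFAULT_FORMAT)}
        for n in names
    ]


def declaration_lines(dictionary):
    """Step c: one STRING or NUMERIC statement per variable in the file."""
    lines = []
    for var in dictionary:
        keyword = "STRING" if var["format"].startswith("A") else "NUMERIC"
        lines.append(f"{keyword} {var['name']} ({var['format']}).")
    return lines


# A made-up two-line file: header row plus one data row.
sample = "Sex\tAge\tCounty\nMale\t34\t01\n"
for line in declaration_lines(build_dictionary(read_variable_names(sample))):
    print(line)
```

With the sample above, the REGION rule is ignored because the file has no such variable, which is the "ignore instructions that concern non-existent variables" behaviour asked for.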

It's not child's play, but you can write SPSS syntax that will read the dictionary, write the SPSS code for it, and write the DATA LIST command to read the data, and run that against each file you have. I'm not sure there's anything in scripting that would make it easier.
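
As a rough sketch of the last piece, the DATA LIST command for one file could be generated from a list of (name, format) pairs like this. Again this is Python used only to show the idea, not SPSS syntax and not the code described above. The function name and the file name are hypothetical, and the LIST (TAB) and SKIP=1 keywords are an assumption about how a tab-delimited file with one header row would be read; check them against the Command Syntax Reference for your SPSS version.

```python
def data_list_command(path, variables):
    """Build a DATA LIST command for a tab-delimited file with a header row.

    `variables` is a list of (name, format) pairs in file order.
    The LIST (TAB) SKIP=1 form is an assumption, not taken from the thread.
    """
    if not variables:
        raise ValueError("at least one variable is required")
    lines = [f'DATA LIST FILE="{path}" LIST (TAB) SKIP=1']
    for i, (name, fmt) in enumerate(variables):
        # The first variable follows the slash; later ones are indented.
        prefix = "  /" if i == 0 else "   "
        lines.append(f"{prefix}{name} ({fmt})")
    lines[-1] += "."  # the command terminator goes on the last line
    return "\n".join(lines)


# Hypothetical file name and variable list.
print(data_list_command("survey01.txt", [("Sex", "A6"), ("Age", "F8.2")]))
```

Because the formats are given in the DATA LIST itself, a file read this way would not also need separate STRING or NUMERIC statements for the same variables.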

