LISTSERV at the University of Georgia
Date:         Mon, 13 Sep 2010 17:23:31 -0400
Reply-To:     Richard Ristow <wrristow@mindspring.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Richard Ristow <wrristow@mindspring.com>
Subject:      Re: Merge problems
Comments: To: Mike Pritchard <5circles@gmail.com>
In-Reply-To:  <040101cb5352$0614ef10$123ecd30$@com>
Content-Type: text/html; charset="us-ascii"

<html> <body> At 10:43 AM 9/13/2010, Mike Pritchard wrote:<br><br> <blockquote type=cite class=cite cite="">There is a subset of variables in the latest working file (that has been modified through coding/recoding and labeling) that is also in the other file.&nbsp; The other file is an earlier version with a few additional variables that were dropped inadvertently from the working file when some operations - primarily SAVE with a different order - were done.&nbsp; So I needed to recover these variables.</blockquote><br> To start with (and it's not what you asked), you have no 'BY' clause in either of your <tt><font size=2>MATCH FILES</font></tt> commands. From the <i>Command Syntax Reference</i>,<br><br> <blockquote type=cite class=cite cite=""><font size=1>.. </font><font face="Courier New, Courier" size=1>If </font><font face="Courier, Courier" size=1>BY </font><font face="Courier New, Courier" size=1>is not used, the program performs a parallel (sequential) match, combining the first case from each file, then the second case from each file, and so on, without regard to any identifying values that may be present.</font></blockquote><br> So if either file has even one extra case, or one missing case, your result can give some cases values that belong to other cases altogether. Is there a set of variables that forms a record key within your files? If so, use it in a 'BY' clause.<br> &nbsp;<br> But as to what you asked about, your syntax<br><br> <tt><font size=2>MATCH FILES <br> &nbsp;&nbsp; /FILE=*<br> &nbsp;&nbsp; /FILE='DataSet10'.<br><br> </font></tt>works because (<i>CSR</i> again)<br> <blockquote type=cite class=cite cite=""> <font face="Courier New, Courier" size=1>If the same variable name is used in more than one input file, data are taken from the file specified first. Dictionary information is taken from the first file containing value labels, missing values, or a variable label for the common variable. 
If the first file has no such information, </font><font face="Courier, Courier" size=1>MATCH FILES </font><font face="Courier New, Courier" size=1>checks the second file, and so on, seeking dictionary information.</font></blockquote><br> So, for all the variables that appear in both files, you get the value from the active file ('<tt><font size=2>FILE=*</font></tt>'). Fine, if that's what you want, but make sure it <i>is</i> what you want.<br><br> (And, by the way, this syntax will blow up if any variables from the two files have the same name but are type-incompatible: that is, one numeric and one string, or two strings of different lengths. But your files don't have that problem.)<br><br> Now, as you write, the GUI generates syntax,<br> <blockquote type=cite class=cite cite=""><tt><font size=2>MATCH FILES /FILE=*<br> &nbsp; /FILE='DataSet10'<br> &nbsp; /RENAME (var1 var2 ... var1442 = d0 d1 ... d1442)<br> &nbsp; /DROP = d0 d1 ... d1442.</font></tt></blockquote>That's because the GUI's code-generating logic assumes that all variables (except key variables) are meant to be different between the files, so that any name occurring in both is a conflict. So the GUI generates this awkward code to get rid of all variables in <tt><font size=2>DataSet10</font></tt> that also occur in the active file.<br><br> <blockquote type=cite class=cite cite="">If I run the merge from the GUI I get a bunch of errors about temporary variables. The errors are all about undefined variable names.</blockquote><br> You'd have to give us a few examples of which variable names are 'undefined'. 1,442 variables is a very long RENAME list, but there's no documented limit on the number of variables in a RENAME. As you've written the list, <tt><font size=2>var1</font></tt> through <tt><font size=2>var1442</font></tt> is 1,442 source variables, while <tt><font size=2>d0</font></tt> through <tt><font size=2>d1442</font></tt> is 1,44<u>3</u> target variables; might that be true of the generated code? Although the GUI's code-generator should be smart enough not to let that happen.<br><br> Anyway, go ahead and use your simple syntax. 
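<br><br> If you do stay with the parallel match, it may be worth checking that it really paired up the right cases. A rough sketch (untested, and assuming both files carry a case identifier, here given the hypothetical name <tt><font size=2>id</font></tt>; <tt><font size=2>id_old</font></tt> and <tt><font size=2>id_differs</font></tt> are likewise made-up names):<br><br> <tt><font size=2>* Hypothetical key variable id; substitute your own identifier.<br> * RENAME applies to the file named just before it, here DataSet10.<br> MATCH FILES<br> &nbsp; /FILE=*<br> &nbsp; /FILE='DataSet10'<br> &nbsp; /RENAME (id = id_old).<br> * Flag cases where the two files disagree about the identifier.<br> COMPUTE id_differs = (id NE id_old).<br> FREQUENCIES VARIABLES=id_differs.<br><br> </font></tt>Any case where <tt><font size=2>id_differs</font></tt> is 1, or is missing, suggests the two files were not lined up case for case, and the merged values for the recovered variables shouldn't be trusted.<br><br>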
However, if I were doing this, I'd load the old file, keeping only the key variables and the 'lost' variables I wanted to recover; sort both files by the key variables; and use something like (untested),<br><br> <tt><font size=2>MATCH FILES<br> &nbsp; /FILE=&lt;newfile&gt;<br> &nbsp; /FILE=&lt;oldfile&gt;<br> &nbsp; /BY&nbsp;&nbsp; &lt;keyvars&gt;.<br><br> </font></tt><blockquote type=cite class=cite cite="">The other file is an earlier version with a few variables that were dropped inadvertently from the working file when some operations - primarily SAVE with a different order - were done.</blockquote><br> One moral: always end a KEEP list with the keyword 'ALL', unless you're trying to drop some variables. That way, any variables you forget to name will still be there, at the end of the file.<br> </body> </html>
