I have a question about using INPUT PROGRAM to create a subset of one file
that keeps only the rows whose ID also appears in another file. Yes, I know
MATCH FILES with the TABLE subcommand can do it, but due to some restrictions
I need to do it via INPUT PROGRAM ... DATA LIST .... Any hints are
appreciated. Thanks in advance.
Here is my question:
--------------------
I have 2 text files: one is a list of the IDs that I will use in the
analysis, the other is a transaction file for my entire database. Both files
are already sorted by ID. The text files look like the following (the
original files are extremely big; these are just to illustrate the idea):
ID_list.txt
--------------
ID
001
002
005
Transaction.txt
---------------------
ID DollAmt
001 14
001 20
001 15
002 20
002 15
003 40
004 10
005 17
005 17
006 18
What I want to create using INPUT PROGRAM and END INPUT PROGRAM:
---------------------------------------------------------------------------------------------
ID DollAmt
001 14
001 20
001 15
002 20
002 15
005 17
005 17
What I want to do is create a subset of Transaction.txt that contains only
the rows whose IDs appear in ID_list.txt. However, I would like to do it
without first reading either file and saving it as an SPSS-format data file
(because of memory and disk space issues). That is, I would like to use
INPUT PROGRAM and END INPUT PROGRAM etc. to do the task instead of the
MATCH FILES … TABLE … procedure. Could you please shed some light on this?
Many thanks in advance.
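
To make the logic I am after concrete, here is a minimal sketch in Python
(not SPSS syntax) of a one-pass filter over two sorted inputs. It is only an
illustration under stated assumptions: the function name filter_sorted is
hypothetical, each file is assumed to start with a header line as in the
samples above, and the IDs are assumed to be zero-padded to a fixed width so
that plain string comparison agrees with the sort order.

```python
import io


def filter_sorted(id_lines, txn_lines):
    """Yield transaction rows whose ID appears in the ID list.

    Both arguments are iterables of text lines, already sorted by ID.
    Assumption: IDs are fixed-width, zero-padded strings, so comparing
    them as strings matches the sort order of the files.
    """
    ids = iter(id_lines)
    txns = iter(txn_lines)

    next(ids, None)              # skip the "ID" header line
    header = next(txns, None)    # keep the "ID DollAmt" header line
    if header is not None:
        yield header.rstrip("\n")

    current = next(ids, None)    # the ID we are currently looking for
    for row in txns:
        row = row.rstrip("\n")
        if not row.strip():
            continue             # ignore blank lines
        key = row.split()[0]
        # Advance the ID list until it catches up with this transaction.
        while current is not None and current.strip() < key:
            current = next(ids, None)
        if current is None:
            break                # ID list exhausted, nothing more can match
        if current.strip() == key:
            yield row


if __name__ == "__main__":
    # Small in-memory stand-ins for ID_list.txt and Transaction.txt.
    id_file = io.StringIO("ID\n001\n002\n005\n")
    txn_file = io.StringIO(
        "ID DollAmt\n001 14\n001 20\n002 20\n003 40\n005 17\n006 18\n"
    )
    for line in filter_sorted(id_file, txn_file):
        print(line)
```

Only one line from each file is held at a time, so neither file has to be
loaded or saved in full. This is the behaviour I hope to reproduce inside
INPUT PROGRAM.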