Date: Thu, 23 Aug 2007 08:47:28 -0400
Reply-To: Richard Reeves <reeves@STUDENTCLEARINGHOUSE.ORG>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Richard Reeves <reeves@STUDENTCLEARINGHOUSE.ORG>
Subject: Re: which to use, Proc SQL or Proc Compare
Content-Type: text/plain; charset="US-ASCII"
Hi Johanna,
I don't know how big your data are (rows) but if processing isn't an
issue you could merge the base and week tables on id and another
variable and create an indicator if the field changed and run this 20
times. In the example below you would merge on id and hours if a match
exists then 1 otherwise 0 then merge on id and code if a match exists 1
otherwise 0. You can create a table of id with the indicators and use
that for change analysis. If you are in a controlled environment work
with the buffersize and _method commands to understand how the merge is
working.
I can't really think of a better solution but there are better people on
this list than me. But if you have less than 1 million records it
probably is a reasonable option.
Rich
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@listserv.vt.edu] On Behalf Of
Johanna
Sent: Wednesday, August 22, 2007 3:17 PM
To: SAS-L@LISTSERV.VT.EDU
Subject: which to use, Proc SQL or Proc Compare
below is a small sample of a process that runs a comparison to detect
changes to some data, every week.
the question I had was... which is better to track changes, Proc Sql or
Proc
Compare, when you consider that I'd like to know what part of the data
changed
below, Mary Smith's hours changed and Bob Murdoch is a new person but
Proc
Compare won't tell me why the person is listed, just that something
changed
in the real world, I have a dataset with about 20 variables and any one
of
them could change week over week, without having to analyze each
separately,
is there an elegant way to detect "What's new this week?"
the only condition I don't need to track is if the person does not exist
the
next week, ie. dropped, but everything else I would need to be aware of,
ie.
"Mary Smith's HOURS changed" and "Bob Murdoch is a NEW item", also the
code
could have changed but not in my sample below.
data base;input id hours code $3. name $20.;
cards;
001 40 ABB Tom Jones
002 20 DBF Mary Smith
;
run;
data week;input id hours code $3. name $20.;
cards;
001 40 ABB Tom Jones
002 40 DBF Mary Smith
003 40 ABB Bob Murdoch
;
run;
proc compare base=base compare=week out=result(where=(_type_="COMPARE"))
outnoequal outbase outcomp outdif noprint;id id;
run;
data _null_;set ;put(_all_)(=);run;