Date: Wed, 29 Aug 2001 09:55:12 -0400
Reply-To: Brad.Goldman@AUTOTRADER.COM
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Brad Goldman <Brad.Goldman@AUTOTRADER.COM>
Subject: tcpip parsing solved
Content-Type: text/plain; charset="iso-8859-1"
I got several tips and solutions, from Karsten Self, Mike Rhoads, Bas
Pruijn, and David Ward. I chose to use Mike Rhoads's approach. There are
bells and whistles and optimization I still need to do, but the basic task has
been accomplished. Note Mike's use of multiple SET statements with
FIRSTOBS=2 to compare the "current" observation with the "next" observation;
I thought that was particularly neat.
From Mike's emails to me:

I would start by sorting by LoTcpIp, and within that by descending
HiTcpIp. After that I think it's just a matter of a DATA step based on a
"look-ahead" read to the next record, where your decision algorithm is
something like:

Hold on to the current record if and only if LoTcpIp from the next record is
greater than the HiTcpIp from the current record.

"Merging" a SAS data set with a copy of itself: Use 2 SET statements,
but with FIRSTOBS = 2 and with all variables either dropped or renamed (e.g.
NextVal) on one of them.

(He included sample code which I barely modified.)
Thanks again to all who helped,
Brad Goldman
**************************;
*Convert tcpips to a decimal form with leading zeros;
*Should eventually be hexadecimal;
data convert (drop=one two three four);
   set tcp.whois;
   one=translate(right(input(scan(beg,1,'.'),$3.)),'0',' ');
   two=translate(right(input(scan(beg,2,'.'),$3.)),'0',' ');
   three=translate(right(input(scan(beg,3,'.'),$3.)),'0',' ');
   four=translate(right(input(scan(beg,4,'.'),$3.)),'0',' ');
   begconv=input((one||two||three||four),12.);
   one=translate(right(input(scan(end,1,'.'),$3.)),'0',' ');
   two=translate(right(input(scan(end,2,'.'),$3.)),'0',' ');
   three=translate(right(input(scan(end,3,'.'),$3.)),'0',' ');
   four=translate(right(input(scan(end,4,'.'),$3.)),'0',' ');
   endconv=input((one||two||three||four),12.);
run;
proc sort data=convert;
   by begconv descending endconv;
run;
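**************************;
*A shorter variant of the conversion, offered only as a sketch;
*The names padcheck, ip, part, ipconv, ip256 and hexconv are made up here;
*It assumes each address has exactly four numeric parts separated by dots;
data padcheck (drop=i part);
   length ip $ 15 hexconv $ 8;
   ip = '10.2.30.254';
   ipconv = 0;
   ip256 = 0;
   do i = 1 to 4;
      part = input(scan(ip, i, '.'), 3.);
      *Same value as reading the zero-padded 12-digit string;
      ipconv = ipconv * 1000 + part;
      *Base 256 gives the usual 32-bit value of the address;
      ip256 = ip256 * 256 + part;
   end;
   *HEX8. writes that value as eight hexadecimal digits;
   hexconv = put(ip256, hex8.);
   *For this address ipconv is 10002030254 and hexconv is 0A021EFE;
   put ip= ipconv= hexconv=;
run;
**************************;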
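data Final;
   * END= goes on the first SET so Eof1 is true only on the last record;
   set convert end=Eof1;
   * Look-ahead read, skipped on the last record so the step does not stop early;
   if not Eof1 then
      set convert
         (firstobs=2
          keep=begconv endconv
          rename=(begconv=Nextbegconv endconv=Nextendconv));
   length status $ 8;
   * Always keep last record;
   if (Nextbegconv > endconv) or Eof1
      then status = 'KEEP';
      else status = 'DISCARD';
run;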
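**************************;
*A small made-up check of the look-ahead step, as a sketch only;
*The data sets toy and toyfinal and their values are not from tcp.whois;
*This sketch puts END= on the first SET, which is a choice made here;
data toy;
   input begconv endconv;
   datalines;
10 50
20 30
60 90
60 70
95 99
;
run;
proc sort data=toy;
   by begconv descending endconv;
run;
data toyfinal;
   set toy end=Eof1;
   if not Eof1 then
      set toy
         (firstobs=2
          keep=begconv
          rename=(begconv=Nextbegconv));
   length status $ 8;
   if (Nextbegconv > endconv) or Eof1
      then status = 'KEEP';
      else status = 'DISCARD';
run;
*With this rule an outer range such as 10 to 50 is discarded;
*and the narrower range that follows it, 20 to 30, is kept;
proc print data=toyfinal;
run;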