LISTSERV at the University of Georgia
Date:         Mon, 11 May 2009 21:03:31 +0100
Reply-To:     karma <dorjetarap@GOOGLEMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         karma <dorjetarap@GOOGLEMAIL.COM>
Subject:      Re: Report mining techniques question.
Comments: To: Stephen Dybas <skd02@health.state.ny.us>
In-Reply-To:  <200905111842.n4BAmtFW005047@malibu.cc.uga.edu>
Content-Type: text/plain; charset=ISO-8859-1

This is maybe slightly crude, but it works for this case. I had to cheat slightly and make sure there were at least 2 spaces between the address and the dob on each line, because the address contains single spaces that we want to keep.

filename test "c:\documents and settings\kdt\desktop\test.txt" ;

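data want (drop=_:) ;
  infile test missover ;
  input _start $ ;                  /* first word of the line */
  if _start eq "Page" then do ;
    place = scan(_infile_,5) ;      /* 5th word of the "Page" line */
    input / ;                       /* skip column header and blank line */
    do _n_=1 to 5 ;                 /* 2 records, 1 blank line, 2 records */
      input (County LName FName)($) SSN CIN $8. Address & :$13. dob ;
      if _n_ ne 3 then output ;     /* line 3 is the blank between pairs */
    end ;
  end ;
run ;

proc print ; run ;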

output:

Obs  place     County  LName  FName          SSN  CIN      Address        dob

  1  Albany    01      Davis  Richard   98120987  M639472  1 Main St.     19650103
  2  Albany    01      Davis  Bill      98129987  M639518  1 Mane St      19650103
  3  Albany    01      Spade  Sam      157829049  M937272  5 Short St.    19860328
  4  Albany    41      Spade  Ace      157829149  M937284  5 Shorter St.  19860328
  5  Allegany  03      Time   Justin   848409393  M848383  3 Times Pl.    19760228
  6  Allegany  03      Time   Justin   848409393  M848283  3 Times Place  19790228
  7  Allegany  03      Smith  Bill      85737827  M772723  4 Forty Ave    19840704
  8  Allegany  03      Smyth  William   85737827  M772721  4 Fort Ave     19840704
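
To illustrate the two-space requirement mentioned above, here is a small self-contained sketch. It uses inline datalines rather than the real report, and the dataset name demo is made up for the example. The & modifier lets Address contain single blanks, so the field only ends where two or more blanks appear.

```sas
/* Sketch only: "demo" and the inline records are illustrative. */
data demo ;
  infile datalines missover ;
  /* & : a single blank is part of the value,        */
  /*     two or more blanks end the Address field    */
  input CIN $ Address & :$13. dob ;
datalines ;
M639472 1 Main St.  19650103
M937284 5 Shorter St.  19860328
;

proc print data=demo ; run ;
```

With only one blank before the date, the date should get pulled into Address (and cut off at 13 characters) and dob should come back missing, which is why the extra space was added to the test file.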

2009/5/11 Stephen Dybas <skd02@health.state.ny.us>:
> Hello SAS-Ls,
>
> I hope everyone had a nice weekend, especially mothers!
>
> I am including a small mock up report to show what I am trying to
> accomplish. Thanks to everyone that has already replied with some leads on
> using the _infile_ automatic variable.
>
> A couple of other things that I need help with include saving the county
> name for inclusion as part of an output statement along with the data that
> follows in the report. I would like to read in the two record pairs that
> follow the county, as #1 and #2 perhaps, as they really belong together in
> one observation with different variable names.
>
> I am hoping someone can describe an approach to tackling this input
> problem.
>
> This is the beginning of the report
> that I have to discard because it is just
> a bunch a title statements that do not contain
> and usable data. The data lines appear below
> All the data is fictional
> The actual data will start next
>
> Page 1 for Albany Albany
> County LName FName SSN CIN Address DOB
>
> 01 Davis Richard 098120987 M639472 1 Main St. 19650103
> 01 Davis Bill 098129987 M639518 1 Mane St 19650103
>
> 01 Spade Sam 157829049 M937272 5 Short St. 19860328
> 41 Spade Ace 157829149 M937284 5 Shorter St. 19860328
>
> Page 1 for Allegany Allegany
> County LName FName SSN CIN Address DOB
>
> 03 Time Justin 848409393 M848383 3 Times Pl. 19760228
> 03 Time Justin 848409393 M848283 3 Times Place 19790228
>
> 03 Smith Bill 085737827 M772723 4 Forty Ave 19840704
> 03 Smyth William 085737827 M772721 4 Fort Ave 19840704
>
> End of the report
> I need to mine the county data and what
> appears to look like the records of the report
>
> I am getting back into SAS so my skills are a little rusty. Before this, I
> always worked with flat input files, never parsing report files.
>
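
On the request to keep each pair of records together in one observation, one possible variation on the step at the top is sketched below. It is not tested against the real report. The dataset name pairs and the 1/2 suffixed variable names are made up for illustration. It assumes exactly two pairs per page, as in the mock-up, and at least 2 spaces before the dob, and it reads a cut-down copy of the report from inline datalines to stay self-contained.

```sas
/* Sketch only: dataset and variable names are illustrative. */
data pairs (drop=_:) ;
  infile datalines missover ;
  input _start $ ;
  if _start eq "Page" then do ;
    place = scan(_infile_,5) ;
    input / ;                       /* skip column header and blank line */
    do _n_=1 to 2 ;                 /* assumes two pairs per page */
      /* first record of the pair, then "/" moves on to the second */
      input (County1 LName1 FName1)($) SSN1 CIN1 $8. Address1 & :$13. dob1
          / (County2 LName2 FName2)($) SSN2 CIN2 $8. Address2 & :$13. dob2 ;
      output ;
      if _n_ eq 1 then input ;      /* skip the blank line between pairs */
    end ;
  end ;
datalines ;
Page 1 for Albany Albany
County LName FName SSN CIN Address DOB

01 Davis Richard 098120987 M639472 1 Main St.  19650103
01 Davis Bill 098129987 M639518 1 Mane St  19650103

01 Spade Sam 157829049 M937272 5 Short St.  19860328
41 Spade Ace 157829149 M937284 5 Shorter St.  19860328
;

proc print data=pairs ; run ;
```

Each output observation should then carry both records of a pair side by side, which makes it easy to compare, say, SSN1 with SSN2 or dob1 with dob2 in a later step. If the SSN leading zeros matter, reading SSN1 and SSN2 as character instead of numeric would keep them.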

