LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (November 2009, week 5)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Mon, 30 Nov 2009 21:47:47 -0400
Reply-To:   Muthia Kachirayan <muthia.kachirayan@GMAIL.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Muthia Kachirayan <muthia.kachirayan@GMAIL.COM>
Subject:   Re: Hashing with SAS9.2 using multiple hash keys and a date range (Revisited)
In-Reply-To:   <8633cace-5d2b-43e3-9da8-7840df842060@u8g2000prd.googlegroups.com>
Content-Type:   text/plain; charset=ISO-8859-1

Rob,

You said that you have tried with Proc Format approach. There were also hash solutions by Hash Gurus when you have posted last. Both of their solutions will work even now for you under 9.1. I give one another hash solution which you can try before you get 9.2.

*** For 9.1 ***;

data need; if _n_ = 1 then do; if 0 then set clientDiscount; declare hash h(); h.definekey('id','clientid'); h.definedata('fromdate','todate','clientDiscount') ; h.definedone(); do until(z); set clientDiscount end = z; if todate = . then todate = 1e6; do id = 1 by 1 until(h.check() ne 0); end; h.add(); end; end; set salesHistory ; do id = 1 by 1 while(h.find() = 0 and not _in_range); _in_range = fromdate <= dateofsale <= todate; if _in_range then output; end; if not _in_range then do; clientDiscount = .; output; end; drop _:; run;

You have already received a solution for 9.2. I give another version which declares and instantiates the hash object once only.

*** For 9.2 ***;

data need; if _n_ = 1 then do; if 0 then set clientDiscount; declare hash h(multidata:'y'); h.definekey('clientid'); h.definedata('fromdate','todate','clientDiscount') ; h.definedone(); do until(z); set clientDiscount end = z; if todate = . then todate = 1e6; h.add(); end; end; set salesHistory ; _rc = h.find(); _in_range = fromdate <= dateofsale <= todate; h.has_next(result:_r); do while(_r ne 0 and not _in_range); h.find_next(); _in_range = fromdate <= dateofsale <= todate; h.has_next(result:_r) ; end; if not _in_range then clientDiscount = .; drop _:; run;

Best regards,

Muthia Kachirayan

On 11/29/09, rob <rob.ashmore@mbf.com.au> wrote: > About 6 months ago I posted a message to this group enquiring about > hashing with a date range. > At the time we were running on SAS9.1.3. We ended up building a > temporary solution using one of the suggestions made by one of the > SAS_L repondents. The solution worked well and involved building a > format. > > We are now running on SAS9.2. We are embarking on some work which > means that the proc format > approach we employed is no longer suitable. > > Does anyone have any examples of how we can build a hash with > duplicate key values? The correct key to use being determined by two > date columns which provide the range. > > My origional posting (below) gives an example of the task. > > regards, > > rob ashmore > > > > > ****ORIGINAL POST START******* > Dear all, > > I wish to perform a many to many merge using a date range. > I could do this several ways using proc sort / data step or > perhaps even sql. > > > However, due to its large size, I dont want to sort the base dataset. > As for the sql possibilites, I am not too sure about them. > > > I have supplied some example sas code below to illustrate what I am > trying to do. > > > I have two datasets saleshistory (50+ million rows) and > clientDiscountDtls (10,000 rows). > Basically, the task is to add the column clientDiscount to every row > of the salesHistory > dataset. The common key is clientId. However, the clientDiscount > should only be populated > if the dateOfSale falls between the fromDate and toDate of the > clientDiscountDtls dataset. > If there is no corresponding date range, the value of clientDiscount > is set to missing. > If there is no matching clientId in the clientDiscountDtls table then > the record is still > outputted to the results dataset but the the value of clientDiscount > is set to missing. > > > I am hoping that the above can be achieved using a hash because I > need > it to run pretty > quickly and repeatedly. > > > I have done a little research and have found that hashing using SAS > 9.1 will probably not be > useful for doing a many to many merge. However, we will soon be > moving > to SAS 9.2 and > I believe there are some new features in the hash data step component > object which may > allow what I am trying to do. > > > Could anyone help by showing me how I can achieve this using a hash, > either 9.1 or 9.2. > Also, in the interests of my own education, Id be keen to see any > example of how to do this using > proc sql. I suspect it may well be quite straightforward? > > > *Sample Datsets*; > > > data work.salesHistory; > infile cards ; > length clientId $4 > productCode $2 > dateOfSale 8 > fullAmtPayable 8 > ;;; > > > input clientId $ > productCode $ > dateOfSale date9. > fullAmtPayable ; > cards; > C001 DX 22jan2001 100 > C002 KR 22mar2006 230 > C001 EZ 20sep2008 400 > C001 DX 25may2004 250 > C003 EK 15nov2008 300 > C002 MG 10feb2003 50 > ; > run; > > > data work.clientDiscount; > format fromdate todate date9.; > infile cards ; > length clientId $4 > fromDate 8 > toDate 8 > clientDiscount 8 > ;;; > > > input @1 clientId $ > @6 fromDate date7. > @15 toDate date7. > clientDiscount 8.2 > ; > cards; > C001 26dec00 01may04 0.1 > C001 01aug04 15oct07 0.25 > C001 16oct07 . 0.05 > C002 24feb03 01jan09 0.075 > ; > run; > > > **Desired Result Dataset**; > clientid / productcode / dateOfSale / fullAmtPayable / clientDiscount > C001 DX 22jan2001 100 0.10 > C002 KR 22mar2006 230 0.075 > C001 EZ 20sep2008 400 0.05 > C001 DX 25may2004 250 . > C003 EK 15nov2008 300 . > C002 MG 10apr2003 50 . > > > etc etc > > > Thanks In Advance, > > > Rob Ashmore > > ****ORIGINAL POST END******* >


Back to: Top of message | Previous page | Main SAS-L page