Date: Sat, 10 Dec 2005 00:11:16 -0500
Reply-To: "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Richard A. DeVenezia" <rdevenezia@WILDBLUE.NET>
Subject: Re: Tracking fund/function/object codes...
oseithedude@gmail.com wrote:
> Hello! I have some SAS code that codes various fund/function/object
> combinations from a financial report (usually this is a text file
> containing thousands of lines with each line containing the following:
> an account group "id" number, a "fund" number, "function" number,
> "object" number, and corresponding "amount" number) to an alphanumeric
> code. Here is an example of some of the code (please excuse if there
> are errors in the code - I am just typing off of memory):
>
> if ((fund = 300) and (4000 <= function and function <= 5000) and (200
> <= object and object <= 300) ) then A26 + amount;
>
> if ((fund = 640) and (function = 2000) and (100 <= object and object
> <=300)) then B37 + amount;
>
> if ((fund =200) and function in (3000,6000) and (200 <= object and
> object <= 400)) then
> P84 + amount;
>
> The resulting dataset had the id number and all of the possible
> alphanumeric codes as variables and outputs each individual id, along
> with the cumultative value of each alphanumeric code for each id as
> follows:
>
> id A26 B37 P84
> --- ------ ------- ------
> 111 5,426 634 453
> 222 498 24 7,211
> 333 1,487 2,365 2,990
> 444 8.932 9,298 3,734
> 555 3,923 12.35 1,934
>
> That is pretty much what the output dataset looks like (except that
> there are a lot more id numbers and a lot more alphanumeric codes).
> What I wanted to do is, from this program, also create a dataset that
> contains each possible fund/function/object combination and a variable
> called "code" containing the alphanumeric code that the amount that
> particular fund/function/object code had was allocated to. Below is a
> partial sample, of what I'd need the output dataset to look like
> (there would, of course, be thousands of unique fund/function/object
> combinations but below is a listing of four of them).
>
>
> fund function object code
> ------ ----------- --------- --------
> 300 4500 210 A26
> 300 4100 280 A26
> 200 6000 400 P84
> 640 2000 110 B37
>
> Would it be possible to somehow do this? Thanks again for any
> advice!!!
>
The samples of your code are in disagreement with the control data table you
present.
In the sample code you used ranges with function and object, in the control
table you use explicit points. Are the ranges in the sample code simply a
convience for not having to code each combination stated in the control
table ? If not, the control table does not contain enough information to
succinctly encapsulate your rules for 'code' summation (I.e., If the ranges
truly represent the way rules are applied, then either the control table
should contain the range endpoints, or remain the way it is and enumerate
[e.g. have many more rows] for every possible set of values comprising the
ranges.
Presuming the control is sufficient as is, and the summation rules are for
each specific fund/function/object 'points' that might occur or need code
summation.
Furthermore, the ID variable in the final sample table suggests a
'tabulation' or 'transposition' of data that is originally categorical.
This sample code demonstrates some very typical processing of well designed
data structures.
-------------------------------
%let seed = 20051231;
data rules;
input fund function object code $;
datalines4;
300 4500 210 A26
300 4100 280 A26
200 6000 400 P84
640 2000 110 B37
;;;;
run;
* fake data;
* 80% chance a txid definitely corresponds to a rule;
data transactions;
do txid = 1 to 100000;
clientid = floor (1000*ranuni (&seed));
if ranuni(&seed) < 0.80 then do;
* 80% of transactions randomly match a rule;
p = 1 + nrules*ranuni(&seed);
set rules nobs=nrules point=p;
end;
else do;
* 20% random, unlikely to match a rule;
fund = floor (1E3*ranuni(&seed));
function = floor (1E4*ranuni(&seed));
object = floor (1E3*ranuni(&seed));
end;
amount = floor (abs (1000*rannor (&seed)));
output;
end;
STOP;
keep txid clientid fund function object amount;
run;
data transactionsWithCodesAssigned;
* prep program data vector (PDV);
if 0 then set rules(obs=0);
* Use hash for three var lookup key corresponding to a code;
* i.e. code = f(fund,function,object);
declare hash rules (dataset:'rules');
rules.defineKey ('fund', 'function', 'object');
rules.defineData ('code');
rules.defineDone ();
do until (lastTransaction);
set transactions end=lastTransaction;
if rules.find() ne 0 then code='???';
OUTPUT;
end;
STOP;
run;
ods html
file="%sysfunc(pathname(WORK))/report.html"
style=sasweb
;
* create a tabulation report and capture the data;
* the output data can be transposed if absolutely necessary, but that should
not be the case by now :) ;
proc tabulate
data=transactionsWithCodesAssigned
out=transactionsSummaryByClient
(keep=clientid code amount_:)
;
class clientid code;
var amount;
table clientid, code*amount*(SUM*format=comma12. N);
run;
ods html close;
--
Richard A. DeVenezia
http://www.devenezia.com/