```
Date:         Mon, 21 Jul 2003 10:48:51 -0700
Reply-To:     Dale McLerran
Sender:       "SAS(r) Discussion"
From:         Dale McLerran
Subject:      Re: Odd Results with Proc Summary Missing Value assignment
Comments:     To: Arthur Tabachneck
In-Reply-To:
Content-Type: text/plain; charset=us-ascii

Art,

I believe that you are muddled on the purpose of the weight variable.
You appear to be trying to follow a mathematical statement/argument,
but the truth of the matter is that the weight variable is a
statistical device. When you assign a weight value of zero, this is
understood to mean that there is no informational content to the
response which you are analyzing. If a person has weight value zero
for every response, then there is no information present for
computing a mean value. If there is no information present, then the
mean should be missing. This is precisely what SAS returns, and every
statistician would be up in arms if SAS did otherwise.

It is entirely another matter to compute the mean of a product of two
variables, which is the argument that you are pursuing below. The
product term must be computed in advance of invocation of PROC
MEANS/SUMMARY. In fact, if you compute your own product term, your
program will be much more compact and will execute much faster. This
bonus comes on top of computation of the correct statistic.

I demonstrate below how you might code your problem to produce the
desired result. Since you indicate an insurance type problem, I have
taken the liberty to rename the variables in your original
presentation to indicate severity and frequency of claims. The total
cost is (according to your presentation) frequency times severity.

data one;
  input id severity1 severity2 severity3 freq1 freq2 freq3;
cards;
1 1 2 3 2 2 2
1 1 2 3 1 1 1
1 1 2 3 3 3 3
2 0 0 3 0 4 4
2 0 0 3 0 2 2
2 0 0 3 0 1 1
3 2 4 8 1 1 1
3 0 4 8 0 1 1
3 2 4 8 5 1 1
;
run;

/* Compute the product terms (frequency times severity) up front */
data two / view=two;
  set one;
  tot1 = freq1*severity1;
  tot2 = freq2*severity2;
  tot3 = freq3*severity3;
  keep id tot1-tot3;
run;

/* Unweighted mean of each product term within id */
proc summary data=two;
  by id;
  var tot1-tot3;
  output out=three (DROP=_TYPE_ _FREQ_) mean=m1-m3;
run;

proc print data=three;
run;

--- Arthur Tabachneck wrote:
> John,
>
> While I'm sure that Tim's logic closely resembles the decision rule
> that went into Proc Summary's design, I definitely don't agree with
> the default settings.
>
> The two most commonly used measures in the field of insurance are
> frequency (i.e., how often a claim occurs) and severity (i.e., the
> average cost of a claim). Everyone's contributing share to that pot
> is the product of frequency times severity (e.g., if 100 out of a
> thousand have claims which average $1,000 per claim, then we each
> have to put $100 in the pot to cover the total cost of the
> anticipated claims).
>
> Where no claims occur, the average cost of a claim is 0 and the sum
> of the losses is 0, definitely not 'missing.'
>
> Similarly, in medicine, I would be extremely interested in a
> treatment that never has any fatalities. In fact, I can already see
> the lawsuits coming if analysts were to discount such results
> because SAS said the values were missing.
>
> Art

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
```
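
For contrast, here is a minimal sketch (not part of the original message) of the weighted analysis the thread is arguing about, run against the same data set `one`. It assumes the usual weighted-mean definition, sum(w*x)/sum(w), with `freq1` as the WEIGHT variable for `severity1`; the names `wtd` and `wmean1` are invented for illustration. For id=2 every `freq1` is zero, so the ratio is 0/0 and, as the thread reports, the result comes back missing, whereas the mean of the precomputed product `tot1` for that id is simply 0.

```sas
/* Sketch only: weighted mean of severity1, with freq1 as the weight. */
/* Assumes data set ONE from the message above, already sorted by id. */
/* WTD and WMEAN1 are hypothetical names chosen for this example.     */
proc summary data=one;
  by id;
  var severity1;
  weight freq1;
  output out=wtd (drop=_type_ _freq_) mean=wmean1;
run;

proc print data=wtd;
run;
```

The two statistics also differ when the weights are not all zero. For id=3 the weights are 1, 0, 5 and the severities 2, 0, 2, so the weighted mean is (1*2 + 0*0 + 5*2)/(1 + 0 + 5) = 2, while the mean of the product term is (2 + 0 + 10)/3 = 4.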
