Date: Wed, 8 Nov 2006 23:38:52 -0500
Reply-To: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject: Re: Merging for further counting
On Wed, 8 Nov 2006 12:02:51 -0800, Irin later <irinfigvam@YAHOO.COM> wrote:
>Art, thank you very much. I just wonder.....
> may it be any scenario when the Cumulative frequency very last value is
longer the longest of the input files?
>
> For example, can it be like below (Proc Freq data=all; tables total;
run;) while I know that the longest file Glucose has just 496 records?
>
>
> Cumulative
Cumulative
> total Frequency Percent Frequency Percent
>
> 1 14 2.73 14 2.73
> 2 31 6.04 45 8.77
> 3 73 14.23 118 23.00
> 4 124 24.17 242 47.17
> 5 170 33.14 412 80.31
> 6 101 19.69 513 100.00
>
>
> In other words my question is:
> Can dataset ALL generated my the code be more then any other dataset
which were used to create ALL by merging?
Of course. I modified one step Art's example:
data glucose;
input id bmem2;
cards;
1 1
4 1
5 1
;
Then the merge shows
NOTE: There were 3 observations read from the data set WORK.GLUCOSE.
NOTE: There were 2 observations read from the data set WORK.SUGAR.
NOTE: There were 2 observations read from the data set WORK.HMGL.
NOTE: There were 1 observations read from the data set WORK.URINTEST.
NOTE: There were 2 observations read from the data set WORK.EYE.
NOTE: There were 2 observations read from the data set WORK.FOOT.
NOTE: The data set WORK.ALL has 5 observations and 8 variables.
Notice that the longest input has 3 obs. but the output has 5.
>
> Thank you very much for your help!
>
> Irin
>
>
>Arthur Tabachneck <art297@NETSCAPE.NET> wrote: Irin,
>
>From your description I'm not at all sure I know what you are trying to
>accomplish. However, that said, hopefully the following example at least
>comes close:
>
>data glucose;
>input id bmem2;
>cards;
>1 1
>2 0
>3 0
>4 1
>5 1
>;
>run;
>data sugar;
>input id bmem3;
>cards;
>2 1
>3 0
>;
>run;
>data Hmgl;
>input id bmem4;
>cards;
>1 0
>2 1
>;
>run;
>data UrinTest;
>input id bmem5;
>cards;
>1 1
>;
>run;
>data Eye;
>input id bmem6;
>cards;
>3 1
>4 1
>;
>run;
>data Foot;
>input id bmem7;
>cards;
>2 1
>4 1
>;
>run;
>data all (drop=i);
>array bmem (6) bmem2-bmem7;
>merge glucose sugar Hmgl UrinTest Eye Foot;
>by id;
>total=0;
>do i=1 to 6;
>if bmem(i) eq . then bmem(i)=0;
>total+bmem(i);
>end;
>run;
>
>Art
>---------
>On Tue, 7 Nov 2006 16:05:04 -0800, Irin later wrote:
>
>>I have six datasets glucose; sugar; Hmgl; UrinTest; Eye; Foot
>> Each of wich have two fields :
>>
>> Glucose: ID; Bmem2 ? 457 records
>> Sugar: ID; Bmem3? 408 records
>> Hmgl ID; Bmem4? 399 records
>> UrinTest ID; Bmem5? 359 records
>> Eye ID; Bmem6? 383 records
>> Foot ID; Bmem7? 241 records
>>
>> All Bmem fields have values = 1 (number type) This value means
>positive response
>>
>> My purpose is to obtain a measure of the burden of positive answers. In
>other words, I need to count answers 1-6 in the table in order to get the
>counting based on the biggest dataset. For example:
>>
>> For 457
>>
>> 0 1.5%
>> 1 8%
>> 2 5.5%
>> 3 10%
>> 4 15%
>> 5 30%
>> 6 20
>>
>> Actually, I have two questions and I am confused with both:
>>
>> 1.How to merge these datasets with different length horizontally by
>ID in order to keep all information
>>
>> 2 How to implement the counting I mentioned below? The thing is that I
>cannot use array as long as datasets have different sizes
>>
>> Could you give me a hand? Thank you very much in advance!!!
>>
>> Irin
>>
>>
>>---------------------------------
>>Cheap Talk? Check out Yahoo! Messenger's low PC-to-Phone call rates.
>
>
>
>---------------------------------
>Everyone is raving about the all-new Yahoo! Mail.