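The (100*0) is an initial-value list: it sets each of the 100 elements of
the array to zero. Without it, the elements of a _temporary_ array start
out as missing, so the test yy(y)=0 would never be true.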
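A small sketch of the difference, in case it helps. The array names a and b and the helper variables a_val and b_val are made up for this illustration; they are not part of the original program.

```sas
data _null_;
   /* No initial-value list: every element starts out missing (.) */
   array a(1:3) _temporary_;
   /* Repetition factor 3*0: every element starts out as 0 */
   array b(1:3) _temporary_ (3*0);
   do i = 1 to 3;
      /* copy to ordinary variables so PUT can show named values */
      a_val = a(i);
      b_val = b(i);
      put i= a_val= b_val=;
   end;
run;
```

The log should show a_val as missing and b_val as 0 for each i, which is why a comparison such as yy(y)=0 only works when the zeros are supplied.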
________________________________
From: souga soga [mailto:souga1234@gmail.com]
Sent: Tuesday, December 18, 2007 9:01 AM
To: Huang, Ya
Cc: SAS-L@listserv.uga.edu
Subject: Re: selecting a unique set of data.
THANK YOU ALL VERY VERY MUCH, this is exactly what I need.
Just one question: why is the (100*0) needed in the array statement?
array yy(1:100) _temporary_ (100*0);
On 12/17/07, Ya Huang <ya.huang@amylin.com> wrote:
If NOT every value of X has every value of Y, or the number of
observations in each X group varies, one way to get a unique Y is as
follows: pick a Y for the first X group and record it as used; then, for
the next X group, check whether each Y has already been used, and if so,
go to the next Y in the group, and so on. The temporary array in the code
records the used Y values. Y is used directly as the array subscript, so
this assumes Y is an integer between 1 and 100.
data have;
   X="A001"; Y=10; OUTPUT;
   X="A001"; Y=30; OUTPUT;
   X="A001"; Y=40; OUTPUT;
   X="A003"; Y=10; OUTPUT;
   X="A003"; Y=60; OUTPUT;
   X="A003"; Y=40; OUTPUT;
   X="A003"; Y=50; OUTPUT;
   X="A002"; Y=10; OUTPUT;
   X="A002"; Y=40; OUTPUT;
   X="A002"; Y=50; OUTPUT;
   X="A004"; Y=60; OUTPUT;
   X="A004"; Y=40; OUTPUT;
run;

data need;
   /* yy(k)=1 once Y value k has been assigned to some X group */
   array yy(1:100) _temporary_ (100*0);
   set have;
   retain found used;
   by x notsorted;
   /* new X group: nothing picked yet */
   if first.x then found=0;
   /* take the first Y in this group that no earlier group has used */
   if found=0 and yy(y)=0 then do;
      yy(y)=1;
      used=y;
      found=1;
   end;
   /* one row per X group: the picked Y, or missing if none was free */
   if last.x then do;
      if found=1 then y=used; else y=.;
      output;
   end;
run;

proc print;
run;
Obs    X       Y    found    used
  1    A001   10      1       10
  2    A003   60      1       60
  3    A002   40      1       40
  4    A004    .      0       40
Note that for A004, both 60 and 40 have already been used by earlier
groups, so Y is set to missing for it.
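If Y is not guaranteed to be an integer from 1 to 100, one possible variant keeps the same first-come, first-served logic but swaps the temporary array for a hash object as the "used Y" lookup. This is only a sketch of an alternative technique, not the code posted above; the names need_hash, seen, and rc are made up here, and it assumes a SAS release in which a hash declared with keys only is allowed (the keys then double as the data).

```sas
data need_hash;
   /* Hash object takes the place of the temporary array */
   if _n_ = 1 then do;
      declare hash seen();
      seen.defineKey('y');
      seen.defineDone();
   end;
   set have;
   by x notsorted;
   retain found used;
   if first.x then found = 0;
   /* check() returns 0 when the current Y is already stored */
   if found = 0 and seen.check() ne 0 then do;
      rc = seen.add();   /* remember this Y as used */
      used = y;
      found = 1;
   end;
   if last.x then do;
      if found = 1 then y = used; else y = .;
      output;
   end;
   drop rc;
run;
```

For the sample data this should make the same picks as the array version, since only the lookup structure changes.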
On Mon, 17 Dec 2007 22:51:01 -0500, souga soga <souga1234@GMAIL.COM> wrote:
>I apologize for not being specific. Anyway, what I need is in your first
>paragraph:
>
>"In your example every value of X has every value of Y. If this is
>accurate then sort by X Y, and select first record in first group of
>X's, second record from the second group, etc. to the last group of
>X's taking the last record in the group."
>
>Unfortunately I do not know how to program this. Can someone help?
>
>Thanks again,
>Sa
>
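The "first record from the first group, second record from the second group" idea quoted above could be sketched roughly as follows. It only makes sense under the stated assumption that every X has every Y. The data set names have2, sorted, and diag and the counters grp and pos are made up for this illustration; the four rows are the ones from the original question.

```sas
data have2;
   X="A001"; Y=10; output;
   X="A001"; Y=20; output;
   X="A002"; Y=10; output;
   X="A002"; Y=20; output;
run;

proc sort data=have2 out=sorted;
   by x y;
run;

data diag;
   set sorted;
   by x;
   if first.x then do;
      grp + 1;   /* running number of the X group (sum statement retains it) */
      pos = 0;   /* restart the position counter within the group */
   end;
   pos + 1;      /* position of this record within its X group */
   /* keep the k-th record of the k-th group */
   if pos = grp then output;
   drop grp pos;
run;
```

With these four rows the step keeps A001 with Y=10 and A002 with Y=20, which matches the output asked for in the original question.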
>On Dec 17, 2007 10:03 PM, <iw1junk@comcast.net> wrote:
>
>> Summary: Specs should precede solution.
>> #iw-value=1
>>
>> Souga,
>>
>> In your example every value of X has every value of Y. If this is
>> accurate then sort by X Y, and select first record in first group of
>> X's, second record from the second group, etc. to the last group of
>> X's taking the last record in the group.
>>
>> If your example is not accurate, then it makes no sense for anyone to
>> give an answer until more is known about the requirements. For
>> example, suppose there are 5 distinct values of X and 4 distinct values
>> of Y, then it is impossible to have every value of X in the output
>> file no matter how the Y values are distributed. Consequently there is
>> no solution when all values of X must be used.
>>
>> So the real question is - what do you have, and what are the
>> requirements for what you want? Knowing why would also help us to
>> know whether the problem is worth thinking about.
>>
>> It sounds like some kind of operations research problem.
>>
>> Ian Whitlock
>> ==============
>> Date: Mon, 17 Dec 2007 19:36:40 -0500
>> Reply-To: souga soga <souga1234@GMAIL.COM>
>> Sender: "SAS(r) Discussion"
>> From: souga soga <souga1234@GMAIL.COM>
>> Subject: selecting a unique set of data.
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> Hi,
>>
>> X="A001";Y=10;OUTPUT;
>> X="A001";Y=20;OUTPUT;
>> X="A002";Y=10;OUTPUT;
>> X="A002";Y=20;OUTPUT;
>>
>> I need the output to be:
>>
>> X="A001";Y=10;OUTPUT;
>> X="A002";Y=20;OUTPUT;
>>
>> i.e., the value of y should not repeat for x
>>
>> Thanks, Sa
>>
>>