Date: Fri, 22 May 2009 12:51:35 -0500
Reply-To: Joe Matise <snoopy369@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Joe Matise <snoopy369@GMAIL.COM>
Subject: Re: Deleting rows in a data set...help required
In-Reply-To: <b7a7fa630905221040k1d64a6bajd1dd814ee3db45ac@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
In any event, if you need a solution that does not rely on the values being
limited to 11/12/21/22, this would be the simplest:
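/* Build a test dataset: 5000 rows by 2000 columns (id1-id2000) */
data have;
  array id[2000];
  do _j = 1 to 5000;
    do _n_ = 1 to 2000;
      /* rows ending in 5: all 11; rows ending in 8: all 12 */
      if mod(_j,10) = 5 then id[_n_] = 11;
      else if mod(_j,10) = 8 then id[_n_] = 12;
      /* rows ending in 6: alternating 12 and 21 */
      else if mod(_j,10) = 6 and mod(_n_,2) = 1 then id[_n_] = 12;
      else if mod(_j,10) = 6 and mod(_n_,2) = 0 then id[_n_] = 21;
      /* all other rows: values that vary across blocks of columns */
      else id[_n_] = (mod(_j,2)+1)*10 + mod(round(_n_,100)/100,2)+1;
    end;
    output;
  end;
  drop _:;
run;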
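data want;
  set have;
  array ids id:;
  firstval = ids[1];
  /* digit-swapped form of firstval (12 <-> 21; 11 and 22 map to themselves). */
  /* Done numerically, since reverse() is a character function.               */
  swapped = mod(firstval,10)*10 + int(firstval/10);
  do _n_ = 1 to dim(ids);
    /* stop at the first value that is neither firstval nor its swap */
    if (ids[_n_] ne firstval) and (ids[_n_] ne swapped) then leave;
  end;
  /* _n_ is within bounds only if the loop exited early, i.e. the row varies */
  if _n_ le dim(ids) then output;
  drop firstval swapped;  /* helper variables are not part of the result */
run;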
On my machine, it took longer to create the initial dataset than to run the
second part by an appreciable amount (10 seconds versus 7).
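
As a quick sanity check, the same first-value comparison can be run on the
ten sample rows from the original post. This is only a sketch: the dataset
names sample and check and the variables rownum, swapped, keep_row and _i are
made up for illustration, and it assumes every value is a two-digit number.

```sas
/* The ten sample rows from the original post */
data sample;
  input id1-id4;
  datalines;
11 12 22 11
22 11 12 12
11 11 11 11
12 21 12 21
11 22 12 12
12 21 22 11
11 22 22 22
22 22 22 22
12 21 22 11
11 22 11 21
;
run;

data check;
  set sample;
  array ids id:;
  rownum = _n_;  /* original row number, kept for inspection */
  /* digit-swapped form of the first value (12 <-> 21) */
  swapped = mod(ids[1],10)*10 + int(ids[1]/10);
  keep_row = 0;
  do _i = 1 to dim(ids);
    /* any value other than the first value or its swap means the row varies */
    if ids[_i] ne ids[1] and ids[_i] ne swapped then keep_row = 1;
  end;
  if keep_row;  /* subsetting IF: only rows that vary are written out */
  drop swapped keep_row _i;
run;

proc print data=check noobs;
run;
```

With this input, rows 3, 4 and 8 should be absent from the printed output,
which matches the result the original poster asked for.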
-Joe
On Fri, May 22, 2009 at 12:40 PM, Joe Matise <snoopy369@gmail.com> wrote:
> The OP indicated that 12,21,11,22 were possible values. I assume boolean
> responses (genetics?).
>
> Of course if it is more complex than the OP states then it is a more
> complex problem, but that is the problem of the presenter, not of the solver
> :)
>
> -Joe
>
>
> On Fri, May 22, 2009 at 12:37 PM, Gerhard Hellriegel <
> gerhard.hellriegel@t-online.de> wrote:
>
>> Still a problem with ID=15. Not if what's in the example is really all that
>> can occur in the dataset (strange dataset, or bad example?)
>> Gerhard
>>
>>
>> On Fri, 22 May 2009 12:29:10 -0500, Joe Matise <snoopy369@GMAIL.COM>
>> wrote:
>>
>> >You can accommodate that by adding:
>> >if max(of id:) = 21 and min(of id:) = 12 then delete;
>> >since 11 is smaller and 22 is higher than those.
>> >
>> >-Joe
>> >
>> >On Fri, May 22, 2009 at 12:20 PM, Gerhard Hellriegel <
>> >gerhard.hellriegel@t-online.de> wrote:
>> >
>> >> ...but you won't get rid of the rows containing only 12 and 21 with that.
>> >> Gerhard
>> >>
>> >>
>> >> On Fri, 22 May 2009 13:10:04 -0400, Nat Wooding
>> >> <Nathaniel.Wooding@DOM.COM> wrote:
>> >>
>> >> >Dinesh
>> >> >
>> >> >Here, we can use the max and min functions and the SAS special variable
>> >> >list _numeric_, which refers to all of the numeric variables in a data set.
>> >> >
>> >> >Data D;
>> >> >input id1 id2 id3 id4 ;
>> >> >cards;
>> >> > 11 12 22 11
>> >> > 22 11 12 12
>> >> > 11 11 11 11
>> >> > 12 21 12 21
>> >> > 11 22 12 12
>> >> > 12 21 22 11
>> >> > 11 22 22 22
>> >> > 22 22 22 22
>> >> > 12 21 22 11
>> >> > 11 22 11 21
>> >> > run;
>> >> > Data Wanted;
>> >> > set D;
>> >> > if max(of _numeric_) = min( of _numeric_) then delete;
>> >> >run;
>> >> >proc print;
>> >> >run;
>> >> >
>> >> >Nat Wooding
>> >> >Environmental Specialist III
>> >> >Dominion, Environmental Biology
>> >> >4111 Castlewood Rd
>> >> >Richmond, VA 23234
>> >> >Phone:804-271-5313, Fax: 804-271-2977
>> >> >
>> >> >
>> >> >
>> >> >From: Dinesh <mtdinesh@GMAIL.COM>
>> >> >Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
>> >> >Date: 05/22/2009 12:21 PM
>> >> >To: SAS-L@LISTSERV.UGA.EDU
>> >> >Subject: Deleting rows in a data set...help required
>> >> >Please respond to: Dinesh <mtdinesh@GMAIL.COM>
>> >> >
>> >> >Dear All,
>> >> >
>> >> >I have some problems with my analysis..
>> >> >
>> >> >I have a data set with around 2000 columns and 50000 rows....
>> >> >
>> >> >the dataset appears like this...
>> >> >
>> >> > id1 id2 id3 id4---------
>> >> >1 11 12 22 11
>> >> >2 22 11 12 12
>> >> >3 11 11 11 11
>> >> >4 12 21 12 21
>> >> >5 11 22 12 12
>> >> >6 12 21 22 11
>> >> >7 11 22 22 22
>> >> >8 22 22 22 22
>> >> >9 12 21 22 11
>> >> >10 11 22 11 21
>> >> >-
>> >> >-
>> >> >-
>> >> >-
>> >> >
>> >> >Now, what I need is this: if a particular row contains the same value
>> >> >throughout the 2000 columns, I want to delete it.
>> >> >So if a row contains all 11 or all 22 or all 12 or all 21 it should be
>> >> >deleted. Also, 12 and 21 are equivalent, so if a row contains only 12 and
>> >> >21 it can also be deleted.
>> >> >
>> >> >so the final output will appear like
>> >> >
>> >> > id1 id2 id3 id4------
>> >> >1 11 12 22 11
>> >> >2 22 11 12 12
>> >> >5 11 22 12 12
>> >> >6 12 21 22 11
>> >> >7 11 22 22 22
>> >> >9 12 21 22 11
>> >> >10 11 22 11 21
>> >> >-
>> >> >-
>> >> >-
>> >> >-
>> >> >
>> >> >rows 3 , 4 and 8 should be deleted...
>> >> >
>> >> >
>> >> >Please help me to solve this
>> >> >
>> >> >Thanks
>> >> >
>> >> >Dinu
>> >> >
>> >> >