LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (April 2008, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Wed, 9 Apr 2008 12:55:02 -0700
Reply-To:     Peter <crawfordsoftware@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter <crawfordsoftware@GMAIL.COM>
Subject:      Re: data set update
Comments: To:
Content-Type: text/plain; charset=ISO-8859-1

On Apr 9, 6:38 pm, datan...@GMAIL.COM ("data _null_,") wrote: > You are not suppose to do that you know. But here's how. > > data class; > set sashelp.class; > if _n_ eq 1 then age = .; > if _n_ eq 17 then age = .; > if _n_ eq 4 then weight = .; > if _n_ eq 6 then height = .; > run; > proc print; > run; > proc stdize reponly missing=mean; > run; > proc print; > run; > > > > On Wed, Apr 9, 2008 at 12:28 PM, Alex Tang <> wrote: > > Hi there, > > > Here I get a question of data manipulation. > > > I have two data sets, say, couple of hundred numerical variables. Both > > of them have same variables. > > > First one is the data set I am working on. It's large, say million > > observation. > > > Second on is simply the mean of each variable of the first one. > > > Because first one has missing value, I want to substitute any missing > > value with the average value of that variable corresponding in the > > second data set. > > > Is there any easy way to do this instead of typing in each variable and > > average value in assignment statement in if clause? > > > I am thinking of PROC SQL UPDATE statement but a little messed up. > > > Thanks, > > > Alex > > > ******************* E-mail non-disclosure ****************** > > > The information contained in this e-mail message may be proprietary and/or confidential, and > > protected from disclosure. If the reader of this message is not the intended recipient, > > or an employee or agent responsible for delivering this message to the intended recipient, > > you are hereby notified that any dissemination, distribution or copying of this communication > > is strictly prohibited. If you have received this communication in error, please notify > > Credit One Bank immediately by replying to this message and delete the original message. Thank you.- Hide quoted text - > > - Show quoted text -

that proc stdize solution looks really great, but needs SAS/STAT among the modules you license.

Another way to achieve the objective might be to try a data step like. data original_data ; if _n_ =1 then do; set means_data(obs=1) ; array meansd _numeric_ ; array meanst( 1000 ) _temporary_; * larger number if >1000 varables; * now transfer the means into the temp array ; do over meansd; meanst(_i_) = meansd ; end; end; modify original_data ; if nmiss( of meansd(*) ) then do; do over meansd ; meansd = coalesce( meansd, meanst(_i_) ); end; replace ; end; run; When the step suns, only rows with at least one missing value will be replaced. The NOTEs in the log at the end of the step will indicate how many rows needed to be replaced.

Kindly, data_NULL_ has provided a test-data builder. That, and building a means_data set follow data original_data ; set sashelp.class; if _n_ eq 1 then age = .; if _n_ eq 17 then age = .; if _n_ eq 4 then weight = .; if _n_ eq 6 then height = .; run;

proc means noprint; output mean= out= means_data(drop= _: ) ; run;

I like modify in this situation because you have only one copy of large data and it runs quickly. However, beware that you may have a problem if the modify step fails during execution, as part of your data is updated and part is not.

I don't think it i a major porblem here, depending on your SAS platform's stability and security (you have a back-up of teh data don't you? ;-)

Good Luck


Back to: Top of message | Previous page | Main SAS-L page