LISTSERV at the University of Georgia
Date:         Thu, 24 Aug 2006 16:20:13 -0400
Reply-To:     "Dorfman, Paul" <paul_dorfman@MERCK.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Dorfman, Paul" <paul_dorfman@MERCK.COM>
Subject:      Re: Fill in missing score
Comments: To: "data _null_;" <datanull@gmail.com>
Content-Type: text/plain

update work.score(obs=0) work.score? Very sleek.

Now, if I may... on a purely aesthetic side... mestillthinks that

   do freq = 1 by 1 until(last.id);
      update work.score(obs=0) work.score;
      by id;
   end;

is preferable to

   freq = 0;
   do until(last.id);
      update work.score(obs=0) work.score;
      by id;
      freq = freq + 1;
   end;
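
For context, here is a sketch of how the first form would sit in the complete step from the quoted message below. It assumes WORK.SCORE has already been created as shown there; the output name SCORE1 is made up for illustration. Since UNTIL is evaluated at the bottom of the loop, before the index is incremented, FREQ leaves the loop equal to the number of records in the by-group, so no separate counter is needed.

```sas
/* Sketch only: assumes WORK.SCORE exists and is sorted by ID. */
/* SCORE1 is a hypothetical output name.                       */
data score1;
   /* FREQ counts the records read for the current ID */
   do freq = 1 by 1 until(last.id);
      update work.score(obs=0) work.score;
      by id;
   end;
   /* Write the filled-in record once per original record */
   do i = 1 to freq;
      output;
   end;
   drop freq i;
run;
```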

On a yet different note, I have noticed over time that you always use two-level names for data files in the WORK library. I wish to question the wisdom of doing so here. My arguments against it are two-fold. First, using one-level names enables one to switch the library to which the files are written at any point in the program, either by using the option user= or by defining the USER libref. To wit, to switch to a library with an [already defined] libref DIFF one would merely code

option user = diff ;

or define a libref USER before the program's onset (e.g. in the config or autoexec) and have all intermediate data sets stored in a permanent library instead of WORK without any changes to the program whatsoever. Then they would be viewable after the program has finished or, especially, aborted. In the latter case, the viewability of the temp files, including the one which may have been only partially written, is a great aid in debugging. Once the files in the library are no longer needed, they can be deleted from the library, or the whole thing could be killed altogether. (In fact, in my own practice I have made it a rule never to write any program-generated files to WORK but only to a permanent library handled in the above-described manner. It leaves WORK all to SAS to use, and I do not have to share its space with anyone else on systems where it is shared. And the ability to examine intermediate files after a job has abended has saved me untold hours of pulling what's left of my hair.) Secondly, one-level names are, well, shorter to code.
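
A minimal sketch of the USER-libref arrangement described above. The directory path and the data set name SCORE are hypothetical, and the LIBNAME statement could just as well live in the autoexec rather than in the program itself.

```sas
/* Hypothetical directory for the permanent "scratch" library */
libname user '/home/me/sasdebug';

/* One-level name: with USER assigned, this is written to */
/* USER.SCORE rather than WORK.SCORE                       */
data score;
   input id score;
   cards;
1 11
1 .
;
run;

/* Once the intermediate files are no longer needed, */
/* delete everything in the library                  */
proc datasets library=user kill nolist;
quit;
```

Note that a step written with an explicit two-level name such as work.score would still go to WORK regardless of the USER libref.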

What are the pros of using two-level names, in your opinion?

Kind regards
------------
Paul Dorfman
Jax, FL
------------

+-----Original Message-----
+From: data _null_; [mailto:datanull@gmail.com]
+Sent: Thursday, August 24, 2006 1:29 PM
+To: Dorfman, Paul
+Cc: SAS-L@listserv.uga.edu
+Subject: Re: Fill in missing score
+
+
+Your comments about the score not being on the first obs got me to
+thinking. I also thought there might be other "scores". I came up
+with this.
+
+data work.score;
+   infile cards missover;
+   input ID Score score2;
+   cards;
+12225 0.365516711
+12225 . 0.365516711
+13073 . 0.365516711
+13073 0.32885697
+13073
+15494 0.466036457
+15494 . 0.466036457
+33501 0.159729592 0.466036457
+33501
+;;;;
+   run;
+proc print;
+   run;
+data work.score0;
+   freq = 0;
+   do until(last.id);
+      update work.score(obs=0) work.score;
+      by id;
+      freq = freq + 1;
+      end;
+   do i = 1 to freq;
+      output;
+      end;
+   drop freq i;
+   run;
+proc print;
+   run;
+
+
+On 8/24/06, Dorfman, Paul <paul_dorfman@merck.com> wrote:
+> Thien,
+>
+> The fine solutions by Ken and Toby have the advantage of reading the file
+> once. A more generic solution would read it twice but would also be
+> impervious to the situation where a non-missing score would happen to be
+> located not necessarily in the first record of each ID by-group:
+>
+> data a ;
+>    input id score ;
+>    cards ;
+> 1 11
+> 1 .
+> 2 .
+> 2 22
+> 2 .
+> 3 .
+> 3 33
+> ;
+> run ;
+>
+> data b ;
+>    merge a (drop = score) a (where = (score is not null)) ;
+>    by id ;
+> run ;
+>
+> Alternatively (for a Nothin'-But-SQL ),
+>
+> proc sql ;
+>    create table c as
+>    select x.id, y.score
+>    from a x, a y
+>    where x.id = y.id and y.score is not null
+>    ;
+> quit ;
+>
+> Of course, if a non-missing score is in the middle of a by-group, you can
+> still use the DoW-loop, only the file will still have to be read twice:
+>
+> data d ;
+>    do _n_ = 1 by 1 until (last.id) ;
+>       set a ;
+>       by id ;
+>       if not missing (score) then _iorc_ = score ;
+>    end ;
+>    score = _iorc_ ;
+>    do _n_ = 1 to _n_ ;
+>       set a (drop = score) ;
+>       output ;
+>    end ;
+> run ;
+>
+> Kind regards
+> ------------
+> Paul Dorfman
+> Jax, FL
+> ------------
+>
+>
+> +-----Original Message-----
+> +From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On
+> +Behalf Of Thien Thai
+> +Sent: Thursday, August 24, 2006 8:46 AM
+> +To: SAS-L@LISTSERV.UGA.EDU
+> +Subject: Fill in missing score
+> +
+> +
+> +Hello, I'm a new SAS user and stumbled across this problem
+> +where I need to
+> +fill in the missing score for id that are the same. The data
+> +set looks like this
+> +
+> +ID Score
+> +12225 0.365516711
+> +12225
+> +13073 0.32885697
+> +13073
+> +13073
+> +15494 0.466036457
+> +15494
+> +33501 0.159729592
+> +33501
+> +
+> +and basically I would like to have the same score assign to ID
+> +that are the
+> +same, any help would be much appreciated.
+> +
+> +Ta
+> +
+> +Thien
+> +
+> +
+>
+>
+> ------------------------------------------------------------------------------
+> Notice: This e-mail message, together with any attachments, contains
+> information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station,
+> New Jersey, USA 08889), and/or its affiliates (which may be known
+> outside the United States as Merck Frosst, Merck Sharp & Dohme or MSD
+> and in Japan, as Banyu - direct contact information for affiliates is
+> available at http://www.merck.com/contact/contacts.html) that may be
+> confidential, proprietary copyrighted and/or legally privileged. It is
+> intended solely for the use of the individual or entity named on this
+> message. If you are not the intended recipient, and have received this
+> message in error, please notify us immediately by reply e-mail and then
+> delete it from your system.
+>
+> ------------------------------------------------------------------------------
+>

