LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (September 2003, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 5 Sep 2003 11:13:54 -0400
Reply-To:     rpresley <rpresley@GMCF.ORG>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         rpresley <rpresley@GMCF.ORG>
Subject:      REPOST:   Proc SQL and PROC SORT followed by data step merge prod
              uce different data sets -- why?
Content-Type: text/plain; charset="iso-8859-1"

The last post got a bit scrambled.

> SAS-L, > > One can join to data sets or tables with SQL view or one can mege the two > data sets in a data step. If one writes equivalent code then the > resulting data sets should be identical. I think I have writtten > equivalent code but the resulting data sets are not the same. Can somone > provide insight into why the code below is not equivalent? > > Thanks for any help. > > Rodney > > > > 187 proc sort data=stroke_2.stroke_main nodupkey ; > 188 by systemid; > 189 %runn; > > NOTE: SAS sort was used. > NOTE: 0 observations with duplicate key values were deleted. > NOTE: There were 930 observations read from the data set > STROKE_2.STROKE_MAIN. > NOTE: The data set STROKE_2.STROKE_MAIN has 930 observations and 201 > variables. > NOTE: Compressing data set STROKE_2.STROKE_MAIN decreased size by 78.52 > percent. > Compressed is 29 pages; un-compressed would require 135 pages. > NOTE: PROCEDURE SORT used: > real time 7.49 seconds > cpu time 0.16 seconds > > > "This code was run on 09/05/03 at 11:02 by rpresley" > 190 proc sort data=stroke_2.stroke_dual nodupkey ; > 191 by systemid; > 192 %runn; > > NOTE: SAS sort was used. > NOTE: 0 observations with duplicate key values were deleted. > NOTE: There were 85 observations read from the data set > STROKE_2.STROKE_DUAL. > NOTE: The data set STROKE_2.STROKE_DUAL has 85 observations and 341 > variables. > NOTE: PROCEDURE SORT used: > real time 1.30 seconds > cpu time 0.09 seconds > > > "This code was run on 09/05/03 at 11:02 by rpresley" > 193 data both ; > 194 merge stroke_2.stroke_main(keep=systemid in=m) > 195 stroke_2.stroke_dual(keep=systemid fname lname bdate in=d); > 196 main=m; > 197 dual=d; > 198 if m and d then output both; > 199 %runn; > > INFO: The variable SYSTEMID on data set STROKE_2.STROKE_MAIN will be > overwritten by data set > STROKE_2.STROKE_DUAL. > NOTE: There were 930 observations read from the data set > STROKE_2.STROKE_MAIN. > NOTE: There were 85 observations read from the data set > STROKE_2.STROKE_DUAL. > NOTE: The data set WORK.BOTH has 85 observations and 6 variables. > NOTE: DATA statement used: > real time 5.24 seconds > cpu time 0.14 seconds > > > "This code was run on 09/05/03 at 11:02 by rpresley" > 200 proc sql ; > 201 create table str_dual1 > 202 as select * > 203 from stroke_2.stroke_dual > 204 order by systemid > 205 ; > NOTE: Table WORK.STR_DUAL1 created, with 85 rows and 341 columns. > > 206 create table str_man1 /*only select records which were > re-abstracted*/ > > 207 as select * > 208 from stroke_2.stroke_main > 209 where systemid in (select distinct systemid from > stroke_2.stroke_dual) > 210 order by systemid; > NOTE: Table WORK.STR_MAN1 created, with 70 rows and 201 columns. > > 211 quit; > NOTE: PROCEDURE SQL used: > real time 4.38 seconds > cpu time 0.27 seconds > > > 212 %runn; > "This code was run on 09/05/03 at 11:05 by rpresley" > > Rodney J. Presley, PhD > Director of Data Analysis > Georgia Medical Care Foundation > 1455 Lincoln Parkway > suite 800 > Atlanta, GA 30346 > > 678-527-3474 > 678-527-3574 fax > > rpresley@gmcf.org >


Back to: Top of message | Previous page | Main SAS-L page