LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (July 1998, week 5)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Wed, 29 Jul 1998 17:56:12 -0400
Reply-To:   HERMANS1 <HERMANS1@WESTAT.COM>
Sender:   "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:   HERMANS1 <HERMANS1@WESTAT.COM>
Subject:   Re[2]: Data reshaping/rearranging question: job for SQL?
Comments:   To: Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Content-Type:   text/plain; charset=US-ASCII

While we are waiting for a response from the SQL expert, let me suggest that you have probably reduced the solution to an optimal form. Michael has asked in essence for a Cartesian product of the two subsets _NAME_ ^ _TYPE_="OBS" and _NAME_ ^ _TYPE_="VAR". Your program initially defines those sets as sources for a Cartesian product (evidenced by the absence of a WHERE or ON restriction on the main SELECT statement). The SAS log message merely tells you that a Cartesian product cannot be optimized. That seems obvious in this case.

While you would like to avoid creating a Cartesian product first and then subsetting it, in this case you actually want the Cartesian product as a final result. You have already partitioned the table in the OBS and VAR subsets before joining them into a Cartesian product. That should prove sufficient to optimize a query that links each element of two sets. Although I may in the past have described SAS SQL in such glowing terms that it sounds like magic, it simply provides a convenient way to express ordinary set logic.

To make it easier to extend your SQL solution, I would suggest that you replace the .. starting (WHERE=(_TYPE_= ...) with ..(SELECT DISTINCT * FROM starting)... in-line views to guard against duplicate rows of values that could play havoc with a Cartesian product. With that minor exception aside, it looks like model SQL code to me. Sig ______________________________ Thread _________________________________ Subject: Re: Data reshaping/rearranging question: job for SQL? Author: Peter Crawford <Peter@CRAWFORDSOFTWARE.DEMON.CO.UK> at Internet-E-Mail Date: 7/28/98 9:45 PM

In article <6pl911$6qj$1@sunburst.ccs.yorku.ca>, Michael Friendly <friendly@hotspur.psych.yorku.ca> writes >I have a data set which looks like this: > > _TYPE_ _NAME_ DIM1 DIM2 > > OBS High 0.18093 0.019248 > OBS Med 0.18500 0.011625 > OBS Low -0.28769 0.061994 > VAR WELL 0.25954 -0.012102 > VAR MODERATE -0.01421 0.069901 > VAR IMPAIRED -0.23739 -0.018897 > >I need to rearrange so that all pairs of OBS-VARs become the >observations of the new data set, with the values of DIM1 as X1 and Y1. >That is, the M + N input observations become M * N output observations, >with X1, Y1 as the value of DIM1 for each observation > > OBS VAR X1 Y1 > > High WELL 0.18093 0.25954 > High MODERATE 0.18093 -0.01421 > High IMPAIRED 0.18093 -0.23739 > Med WELL 0.18500 0.25954 > Med MODERATE 0.18500 -0.01421 > Med IMPAIRED 0.18500 -0.23739 > Low WELL -0.28769 0.25954 > Low MODERATE -0.28769 -0.01421 > Low IMPAIRED -0.28769 -0.23739 > >Seems like there's an easier way to do this with SQL than with data >step manipulations. Can anyone show me how? > >-- >Michael Friendly Internet: friendly@hotspur.psych.yorku.ca (NeXTmail OK) >Psychology Dept >York University Voice: 416 736-5115 Fax: 416 736-5814 >4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html >Toronto, ONT M3J 1P3 CANADA

As I prefer to avoid cartesian products, this is a challenge to my rusty sql .... so I thought I'ld push my $.02

create table joint as select a._name_ as obs ,b._name_ as var ,a.dim1 as x1 ,b.dim1 as y1 ,a.dim2 as x2 ,b.dim2 as y2 from starting( where=( _type_ ='OBS' )) as a ,starting( where=( _type_ ='VAR' )) as b ; SHOCK... it worked +FSVIEW: WORK.JOINT (B)---------------------------------------------+ | OBS OBS VAR X1 Y1 X2 Y2 | | | | 1 High WELL 0.18093 0.25954 0.019248 -0.012102 | | 2 High MODERATE 0.18093 -0.01421 0.019248 0.069901 | | 3 High IMPAIRED 0.18093 -0.23739 0.019248 -0.018897 | | 4 Med WELL 0.185 0.25954 0.011625 -0.012102 | | 5 Med MODERATE 0.185 -0.01421 0.011625 0.069901 | | 6 Med IMPAIRED 0.185 -0.23739 0.011625 -0.018897 | | 7 Low WELL -0.28769 0.25954 0.061994 -0.012102 | | 8 Low MODERATE -0.28769 -0.01421 0.061994 0.069901 | | 9 Low IMPAIRED -0.28769 -0.23739 0.061994 -0.018897 | | | +--------------------------------------------------------------------+ the log........... 40 data starting; 41 input _type_ $ _name_ $ dim1 dim2; 42 /* 43 length _TYPE_ $3 _NAME_ DIM1 DIM2 44 */ 45 list;cards;

RULE:----+----1----+----2----+----3----+----4----+----5----+----6----+- 46 OBS High 0.18093 0.019248 47 OBS Med 0.18500 0.011625 48 OBS Low -0.28769 0.061994 49 VAR WELL 0.25954 -0.012102 50 VAR MODERATE -0.01421 0.069901 51 VAR IMPAIRED -0.23739 -0.018897 NOTE: The data set WORK.STARTING has 6 observations and 4 variables. NOTE: The DATA statement used 0.16 seconds.

52 ; 53 /* target output 54 OBS VAR X1 Y1 55 High WELL 0.18093 0.25954 56 High MODERATE 0.18093 -0.01421 57 High IMPAIRED 0.18093 -0.23739 58 Med WELL 0.18500 0.25954 59 Med MODERATE 0.18500 -0.01421 60 Med IMPAIRED 0.18500 -0.23739 61 Low WELL -0.28769 0.25954 62 Low MODERATE -0.28769 -0.01421 63 Low IMPAIRED -0.28769 -0.23739 64 ; 65 66 Feeling just a bit hesitant here, so this shouldn't be exposed 67 */ 68 proc sql; 69 create table joint as 70 select a._name_ as obs 71 ,b._name_ as var 72 ,a.dim1 as x1 73 ,b.dim1 as y1 74 ,a.dim2 as x2 75 ,b.dim2 as y2 76 from starting( where=( _type_ ='OBS' )) as a 77 ,starting( where=( _type_ ='VAR' )) as b 78 ; NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized. NOTE: Table WORK.JOINT created, with 9 rows and 6 columns.

Now I've exposed my sql weakness, where is the sql specialist who produces simple sql that can be optimized ?

(I'm not sure a datastep would be much more complex than that sql )

-- Peter Crawford


Back to: Top of message | Previous page | Main SAS-L page