LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous (more recent) messageNext (less recent) messagePrevious (more recent) in topicNext (less recent) in topicPrevious (more recent) by same authorNext (less recent) by same authorPrevious page (March 1999, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Sun, 14 Mar 1999 10:38:29 -0600
Reply-To:     "Dwight M. Buffum" <dmbuffum@MN.MEDIAONE.NET>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         "Dwight M. Buffum" <dmbuffum@MN.MEDIAONE.NET>
Organization: MindTools, Inc.
Subject:      Re: SQL: One more related question: SQL question Cartesian
              Product Question ?
Comments: To: "Seltzer, Jon D." <SeltzerJD@PHIBRED.COM>
Content-Type: multipart/mixed;

[multipart/alternative]


I think this fulfills your request.

data data1; input gene1 $ marke1 alle1; datalines; A 1 .1 A 2 .2 A 3 .3 B 1 .4 B 2 .5 B 3 .6 C 1 .7 C 2 .8 C 3 .9 ; run;

data data2; input gene2 $ marke2 alle2; datalines; A 1 .1 A 2 .2 A 3 .3 B 1 .4 B 2 .5 B 3 .6 C 1 .7 C 2 .8 C 3 .9 ; run;

proc sql; create table data3 as select * from data1 d1, data2 d2 where d1.marke1=d2.marke2 and d1.gene1<d2.gene2 order by d1.marke1, d1.gene1, d2.gene2 ; quit;

"Seltzer, Jon D." wrote:

> First my thanks to everyone that responded to my original question, you are > all helping me get out of a monster programming jam. > > All of the solutions were orders of magnitude more efficient than what I was > trying to do. Hopefully I am on my way to writing a program that takes > 1/2hour to run Vs 40 hours to run. > > Below is my original question. > In working on this problem a bit more, I suddenly realized > in reality I want to get only one record from a representative pair from the > output at the bottom > of this email. I thought all pairs would be okay but now I realize that is > not the case. > > Once again any help would be greatly appreciated. > > So now the final data set I would like to obtain is: > > gene1 mark1 alle1 gene2 mark2 alle2 > A 1 .1 B 1 .4 > > A 1 .1 C 1 .7 > > B 1 .4 C 1 .7 > > A 2 .2 B 2 .5 > > A 2 .2 C 2 .8 > > B 2 .5 C 2 .8 > > A 3 .3 B 3 .6 > > A 3 .3 C 3 .9 > > B 3 .6 C 3 .9 > > If you compare this to the past final data set below you will notice that > the second record has been deleted in each pair of records. I really have > no clue how to do > this without lots of data manipulation. The actual file I am working on is > 2-4 million records, > which is why I am looking for more efficient code. > > The original problem. > > Suppose data1 is: > > gene1 marke1 alle1 > A 1 .1 > A 2 .2 > A 3 .3 > B 1 .4 > B 2 .5 > B 3 .6 > C 1 .7 > C 2 .8 > C 3 .9 > > Data2 is the same as data1 but with different variable names. > > gene2 marke2 alle2 > A 1 .1 > A 2 .2 > A 3 .3 > B 1 .4 > B 2 .5 > B 3 .6 > C 1 .7 > C 2 .8 > C 3 .9 > > What I want to create is data3: For each marker, I want all pairs of gene1 > and gene2 excluding the case when gene1=gene2. I would also be happy with > just the unique pairs if that is easier to do. > > gene1 mark1 alle1 gene2 mark2 alle2 > A 1 .1 B 1 .4 > B 1 .4 A 1 .1 > > A 1 .1 C 1 .7 > C 1 .7 A 1 .1 > > B 1 .4 C 1 .7 > C 1 .7 B 1 .4 > > A 2 .2 B 2 .5 > B 2 .5 A 2 .2 > > A 2 .2 C 2 .8 > C 2 .8 A 2 .2 > > B 2 .5 C 2 .8 > C 2 .8 B 2 .5 > > A 3 .3 B 3 .6 > B 3 .6 A 3 .3 > > A 3 .3 C 3 .9 > C 3 .9 A 3 .3 > > B 3 .6 C 3 .9 > C 3 .9 B 3 .6


[text/html]

[text/x-vcard]

begin:vcard n:Buffum;Dwight x-mozilla-html:TRUE org:MindTools, Inc.;A SAS Institute Quality Partner adr:;;2285 Stewart Av.., #1327;St. Paul;Minnesota;55116-3154; version:2.1 email;internet:dmbuffum@mn.mediaone.net title:President fn:Dwight M. Buffum end:vcard


Back to: Top of message | Previous page | Main SAS-L page