Date: Thu, 14 Feb 2008 15:43:12 -0600
Reply-To: Mary <mlhoward@avalon.net>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mary <mlhoward@AVALON.NET>
Subject: Re: Need more elegant approach to substring problem
Not perfect, but pretty darn close! I removed the - 1 from the loop bound and added a branch for missing codes, and then it was exactly what I needed! Thank you so much!
-Mary
Below is the fixed code:
Data wanted;
   set have;
   length new_haplotype $ 80 ;
   /* one pass per haplotype code; no "- 1", so the last code is included */
   do i = 1 to length( haplotype );
      string = substr( haplotype , i , 1 );
      Allele = substr( target_alleles , i*2 - 1 , 2 );   /* allele pair for position i */
      a1 = substr(allele , 1 , 1);
      a2 = substr(allele , 2);
      if string = '0' then
         new_haplotype = catt( new_haplotype , a1, a1 );
      else if string = '2' then
         new_haplotype = catt( new_haplotype , a2, a2 );
      else if string = '1' then
         new_haplotype = catt( new_haplotype , allele );
      else
         /* missing or unexpected code; CATT trims trailing blanks, so nothing is appended */
         new_haplotype = catt (new_haplotype, ' ');
   end;
run;
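One thing to watch in the loop above: CATT trims trailing blanks, so the final else branch does not actually leave a gap, and any pairs that follow a missing code shift left. If the output has to stay aligned with the haplotype positions, a possible variant is sketched below. It is only a sketch, assuming a missing code should leave two blank positions; the data set name wanted_aligned is made up for illustration. It writes each pair into place with SUBSTR on the left of the equals sign instead of appending.

```sas
/* Sketch only: keeps each pair at positions 2*i-1 and 2*i.      */
/* wanted_aligned is a hypothetical name; a1, a2, allele follow  */
/* the naming used in the thread.                                */
data wanted_aligned;
   set have;
   length new_haplotype $ 80 allele $ 2 a1 a2 $ 1;
   do i = 1 to length(haplotype);
      allele = substr(target_alleles, i*2 - 1, 2);
      a1 = substr(allele, 1, 1);
      a2 = substr(allele, 2, 1);
      /* SUBSTR on the left side overwrites two characters in place */
      if substr(haplotype, i, 1) = '0' then
         substr(new_haplotype, i*2 - 1, 2) = a1 || a1;
      else if substr(haplotype, i, 1) = '1' then
         substr(new_haplotype, i*2 - 1, 2) = allele;
      else if substr(haplotype, i, 1) = '2' then
         substr(new_haplotype, i*2 - 1, 2) = a2 || a2;
      /* any other code: the two positions are left blank */
   end;
   drop i allele a1 a2;
run;
```

For rows with no missing codes this should give the same new_haplotype as the CATT version; the two differ only when a code other than 0, 1, or 2 appears.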
----- Original Message -----
From: Nat Wooding
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, February 14, 2008 3:29 PM
Subject: Re: Need more elegant approach to substring problem
Mary
Try the following. If it works, it will be the only code that I have
written this afternoon that does what I want it to do. This does require
V9.
data have;
informat haplotype $40. target_alleles $80.;
infile cards;
input haplotype target_alleles;
cards;
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02000200 GTCTCTATGTGTATCTCTAGATGTAGACAG
01010011 GTCTCTATGTGTATCTCTAGATGTAGACAG
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02002102 GTCTCTATGTGTATCTCTAGATGTAGACAG
00000020 GTCTCTATGTGTATCTCTAGATGTAGACAG
run;
Data wanted;
   set have;
   length new_haplotype $ 80 ;
   do i = 1 to length( haplotype ) - 1;
      string = substr( haplotype , i , 1 );
      Allele = substr( target_alleles , i*2 - 1 , 2 );
      a1 = substr(allele , 1 , 1);
      a2 = substr(allele , 2);
      if string = '0' then new_haplotype = catt( new_haplotype , a1 , a1 );
      else if string = '2' then new_haplotype = catt( new_haplotype , a2 , a2 );
      else new_haplotype = catt( new_haplotype , allele );
   end;
run;
Proc Print;
run;
Nat Wooding
Environmental Specialist III
Dominion, Environmental Biology
4111 Castlewood Rd
Richmond, VA 23234
Phone:804-271-5313, Fax: 804-271-2977
From: Mary <mlhoward@AVALON.NET>
Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
To: SAS-L@LISTSERV.UGA.EDU
Date: 02/14/2008 03:57 PM
Subject: Need more elegant approach to substring problem
Please respond to: Mary <mlhoward@avalon.net>
Hello all,
I need a better solution!
Would anyone have a less tedious solution than the code below? I've got a
haplotype coded as a string of 0s, 1s, and 2s. If a character is 0, I
want to substitute the first character of the corresponding target allele
pair twice (i.e., Homo for Allele 1); if it is 1, I use the first and
second characters (Hetero); and if it is 2, I use the second character of
the pair twice (Homo for Allele 2).
This works, but I'd like to generalize it to at least 40 haplotype
characters (so the target allele field would be exactly twice that
length).
-Mary
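To make the rule concrete, here is a minimal sketch that applies it to the second sample row (haplotype 02000200) and writes the result to the log. The variable names code, a1, and a2 are illustrative only, and the step uses CATT, which needs V9. Reading the first eight allele pairs as GT CT CT AT GT GT AT CT, the codes 0 2 0 0 0 2 0 0 should give GGTTCCAAGGTTAACC.

```sas
/* Sketch: apply the 0/1/2 rule to a single hard-coded row. */
data _null_;
   length haplotype $ 40 target_alleles new_haplotype $ 80
          code a1 a2 $ 1;
   haplotype      = '02000200';
   target_alleles = 'GTCTCTATGTGTATCTCTAGATGTAGACAG';
   do i = 1 to length(haplotype);
      code = substr(haplotype, i, 1);
      a1   = substr(target_alleles, i*2 - 1, 1);  /* first allele of pair i  */
      a2   = substr(target_alleles, i*2, 1);      /* second allele of pair i */
      if code = '0' then new_haplotype = catt(new_haplotype, a1, a1);
      else if code = '1' then new_haplotype = catt(new_haplotype, a1, a2);
      else if code = '2' then new_haplotype = catt(new_haplotype, a2, a2);
   end;
   put new_haplotype=;   /* writes the assembled string to the log */
run;
```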
data have;
informat haplotype $40. target_alleles $80.;
infile cards;
input haplotype target_alleles;
cards;
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02000200 GTCTCTATGTGTATCTCTAGATGTAGACAG
01010011 GTCTCTATGTGTATCTCTAGATGTAGACAG
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02002102 GTCTCTATGTGTATCTCTAGATGTAGACAG
00000020 GTCTCTATGTGTATCTCTAGATGTAGACAG
run;
data need;
   informat haplotype1 $2. haplotype2 $2. haplotype3 $2. new_haplotype $80.;
   set have;
   if length(haplotype) >= 1 then
      do;
         if substr(haplotype,1,1)='0' then
            haplotype1 = substr(target_alleles,1,1) || substr(target_alleles,1,1);
         else if substr(haplotype,1,1)='1' then
            haplotype1 = substr(target_alleles,1,1) || substr(target_alleles,2,1);
         else if substr(haplotype,1,1)='2' then
            haplotype1 = substr(target_alleles,2,1) || substr(target_alleles,2,1);
      end;
   else
      haplotype1 = ' ';
   if length(haplotype) >= 2 then
      do;
         if substr(haplotype,2,1)='0' then
            haplotype2 = substr(target_alleles,3,1) || substr(target_alleles,3,1);
         else if substr(haplotype,2,1)='1' then
            haplotype2 = substr(target_alleles,3,1) || substr(target_alleles,4,1);
         else if substr(haplotype,2,1)='2' then
            haplotype2 = substr(target_alleles,4,1) || substr(target_alleles,4,1);
      end;
   else
      haplotype2 = ' ';
   if length(haplotype) >= 3 then
      do;
         if substr(haplotype,3,1)='0' then
            haplotype3 = substr(target_alleles,5,1) || substr(target_alleles,5,1);
         else if substr(haplotype,3,1)='1' then
            haplotype3 = substr(target_alleles,5,1) || substr(target_alleles,6,1);
         else if substr(haplotype,3,1)='2' then
            haplotype3 = substr(target_alleles,6,1) || substr(target_alleles,6,1);
      end;
   else
      haplotype3 = ' ';
   new_haplotype = haplotype1 || haplotype2 || haplotype3;
run;