Date: Thu, 14 Feb 2008 14:57:52 -0600
Reply-To: Mary <mlhoward@avalon.net>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mary <mlhoward@AVALON.NET>
Subject: Need more elegant approach to substring problem
Hello all,
I need a better solution!
Would anyone have a less tedious alternative to the code below? I've got a haplotype coded as a string of 0s, 1s, and 2s, where each character corresponds to one pair of characters in the target alleles. If the character is a 0, I want to use the first character of the pair twice (i.e. homo for allele 1); if it is a 1, I use the first and second characters (hetero); and if it is a 2, I use the second character of the pair twice (homo for allele 2).
This works, but it only covers the first three positions, and I'd like to generalize it to at least 40 characters of the haplotype (so the target allele field would be exactly twice that long).
-Mary
data have;
    informat haplotype $40. target_alleles $80.;
    infile cards;
    input haplotype target_alleles;
cards;
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02000200 GTCTCTATGTGTATCTCTAGATGTAGACAG
01010011 GTCTCTATGTGTATCTCTAGATGTAGACAG
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02002102 GTCTCTATGTGTATCTCTAGATGTAGACAG
00000020 GTCTCTATGTGTATCTCTAGATGTAGACAG
run;
data need;
    informat haplotype1 $2. haplotype2 $2. haplotype3 $2. new_haplotype $80.;
    set have;

    if length(haplotype) >= 1 then
    do;
        if substr(haplotype,1,1)='0' then
            haplotype1 = substr(target_alleles,1,1) || substr(target_alleles,1,1);
        else if substr(haplotype,1,1)='1' then
            haplotype1 = substr(target_alleles,1,1) || substr(target_alleles,2,1);
        else if substr(haplotype,1,1)='2' then
            haplotype1 = substr(target_alleles,2,1) || substr(target_alleles,2,1);
    end;
    else
        haplotype1 = ' ';

    if length(haplotype) >= 2 then
    do;
        if substr(haplotype,2,1)='0' then
            haplotype2 = substr(target_alleles,3,1) || substr(target_alleles,3,1);
        else if substr(haplotype,2,1)='1' then
            haplotype2 = substr(target_alleles,3,1) || substr(target_alleles,4,1);
        else if substr(haplotype,2,1)='2' then
            haplotype2 = substr(target_alleles,4,1) || substr(target_alleles,4,1);
    end;
    else
        haplotype2 = ' ';

    if length(haplotype) >= 3 then
    do;
        if substr(haplotype,3,1)='0' then
            haplotype3 = substr(target_alleles,5,1) || substr(target_alleles,5,1);
        else if substr(haplotype,3,1)='1' then
            haplotype3 = substr(target_alleles,5,1) || substr(target_alleles,6,1);
        else if substr(haplotype,3,1)='2' then
            haplotype3 = substr(target_alleles,6,1) || substr(target_alleles,6,1);
    end;
    else
        haplotype3 = ' ';

    new_haplotype = haplotype1 || haplotype2 || haplotype3;
run;
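
/* One possible generalization, offered only as a sketch: the rule above      */
/* is the same at every position, so a DO loop over the haplotype can         */
/* replace the repeated blocks. Position i of the haplotype maps to           */
/* characters 2*i-1 and 2*i of target_alleles. The data set name need_loop    */
/* and the work variables i, code, a1, a2, and pair are made up for           */
/* illustration. Unlike the step above, this builds every position, not       */
/* just the first three, so the first row of have should come out as          */
/* GTCTCTAAGGGTATCC.                                                          */
data need_loop;
    length new_haplotype $80 pair $2 code a1 a2 $1;
    set have;
    new_haplotype = ' ';
    do i = 1 to lengthn(haplotype);                 /* 0 iterations if blank */
        code = substr(haplotype, i, 1);
        a1   = substr(target_alleles, 2*i - 1, 1);  /* first allele of pair  */
        a2   = substr(target_alleles, 2*i, 1);      /* second allele of pair */
        select (code);
            when ('0') pair = a1 || a1;             /* homo for allele 1     */
            when ('1') pair = a1 || a2;             /* hetero                */
            when ('2') pair = a2 || a2;             /* homo for allele 2     */
            otherwise  pair = ' ';                  /* unexpected code       */
        end;
        substr(new_haplotype, 2*i - 1, 2) = pair;   /* write pair in place   */
    end;
    drop i code a1 a2 pair;
run;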