LISTSERV at the University of Georgia
Date:         Thu, 14 Feb 2008 15:43:12 -0600
Reply-To:     Mary <mlhoward@avalon.net>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mary <mlhoward@AVALON.NET>
Subject:      Re: Need more elegant approach to substring problem
Comments: To: Nathaniel.Wooding@DOM.COM
Content-Type: text/plain; charset="iso-8859-1"

Not perfect, but pretty darn close! I removed the - 1 from the loop and also added code for missing values; then it was exactly what I needed! Thank you so much!

-Mary

Below is the fixed code:

Data wanted;
   set have;
   length new_haplotype $ 80 ;
   do i = 1 to length( haplotype );
      string = substr( haplotype , i , 1 );
      Allele = substr( target_alleles , i*2 - 1 , 2 );
      a1 = substr( allele , 1 , 1 );
      a2 = substr( allele , 2 );
      if string = '0' then
         new_haplotype = catt( new_haplotype , a1 , a1 );
      else if string = '2' then
         new_haplotype = catt( new_haplotype , a2 , a2 );
      else if string = '1' then
         new_haplotype = catt( new_haplotype , allele );
      else
         new_haplotype = catt( new_haplotype , ' ' );
   end;
run;
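
One thing to watch in the code above, as far as I can tell: CATT strips trailing blanks from every argument, so the final else branch appends nothing, and any pair that follows an unrecognized code would land two positions earlier than its index suggests. If fixed positions matter, a minimal sketch along the following lines might help. It assumes the same have data set as in this thread and writes each pair in place with SUBSTR on the left side of the assignment; the names wanted_fixed, code, and pair are made up for illustration and do not appear in the original code.

```sas
/* Sketch only: keeps every two-character pair at columns 2*i-1 and 2*i. */
data wanted_fixed;                      /* hypothetical output data set name */
   set have;
   length new_haplotype $ 80 code a1 a2 $ 1 pair $ 2;
   do i = 1 to length(haplotype);
      code = substr(haplotype, i, 1);             /* 0, 1, 2, or something else */
      a1   = substr(target_alleles, 2*i - 1, 1);  /* first allele of pair i     */
      a2   = substr(target_alleles, 2*i, 1);      /* second allele of pair i    */
      select (code);
         when ('0') pair = cats(a1, a1);          /* homozygous, allele 1       */
         when ('1') pair = cats(a1, a2);          /* heterozygous               */
         when ('2') pair = cats(a2, a2);          /* homozygous, allele 2       */
         otherwise  pair = '  ';                  /* unknown code: two blanks   */
      end;
      /* SUBSTR on the left of = overwrites two characters in place */
      substr(new_haplotype, 2*i - 1, 2) = pair;
   end;
   drop i code a1 a2 pair;
run;
```

For rows where every code is 0, 1, or 2, this should give the same new_haplotype as the CATT version; the two would differ only when an unexpected character appears in the middle of a haplotype.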

----- Original Message -----
From: Nat Wooding
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, February 14, 2008 3:29 PM
Subject: Re: Need more elegant approach to substring problem

Mary

Try the following. If it works, it will be the only code that I have written this afternoon that does what I want it to do. This does require V9.

data have;
   informat haplotype $40. target_alleles $80.;
   infile cards;
   input haplotype target_alleles;
cards;
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02000200 GTCTCTATGTGTATCTCTAGATGTAGACAG
01010011 GTCTCTATGTGTATCTCTAGATGTAGACAG
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02002102 GTCTCTATGTGTATCTCTAGATGTAGACAG
00000020 GTCTCTATGTGTATCTCTAGATGTAGACAG
run;

Data wanted;
   set have;
   length new_haplotype $ 80 ;
   do i = 1 to length( haplotype ) - 1;
      string = substr( haplotype , i , 1 );
      Allele = substr( target_alleles , i*2 - 1 , 2 );
      a1 = substr( allele , 1 , 1 );
      a2 = substr( allele , 2 );
      if string = '0' then
         new_haplotype = catt( new_haplotype , a1 , a1 );
      else if string = '2' then
         new_haplotype = catt( new_haplotype , a2 , a2 );
      else
         new_haplotype = catt( new_haplotype , allele );
   end;
run;

Proc Print;
run;

Nat Wooding
Environmental Specialist III
Dominion, Environmental Biology
4111 Castlewood Rd
Richmond, VA 23234
Phone: 804-271-5313, Fax: 804-271-2977

From:     Mary <mlhoward@AVALON.NET>
Sent by:  "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
To:       SAS-L@LISTSERV.UGA.EDU
cc:
Subject:  Need more elegant approach to substring problem
Date:     02/14/2008 03:57 PM

Please respond to Mary <mlhoward@avalon.net>

Hello all,

I need a better solution!

Would anyone have a less tedious solution than this code? I've got a haplotype coded as a string of 0s, 1s, and 2s. If the character is a 0, then I want to substitute the first character of the target allele pair twice (i.e., Homo for Allele 1); if it is a 1, then I use the first and second characters (Hetero); and if it is a 2, then I use the second character of the target allele pair twice (Homo for Allele 2).

This works, but I'd like to make it generalizable up to at least 40 characters of the haplotype (and thus the target allele field would have exactly twice that).

-Mary

data have;
   informat haplotype $40. target_alleles $80.;
   infile cards;
   input haplotype target_alleles;
cards;
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02000200 GTCTCTATGTGTATCTCTAGATGTAGACAG
01010011 GTCTCTATGTGTATCTCTAGATGTAGACAG
11100110 GTCTCTATGTGTATCTCTAGATGTAGACAG
02002102 GTCTCTATGTGTATCTCTAGATGTAGACAG
00000020 GTCTCTATGTGTATCTCTAGATGTAGACAG
run;

data need;
   informat haplotype1 $2. haplotype2 $2. haplotype3 $2. new_haplotype $80.;
   set have;

   if length(haplotype) >= 1 then
      do;
         if substr(haplotype,1,1)='0' then
            haplotype1 = substr(target_alleles,1,1) || substr(target_alleles,1,1);
         else if substr(haplotype,1,1)='1' then
            haplotype1 = substr(target_alleles,1,1) || substr(target_alleles,2,1);
         else if substr(haplotype,1,1)='2' then
            haplotype1 = substr(target_alleles,2,1) || substr(target_alleles,2,1);
      end;
   else
      haplotype1 = ' ';

   if length(haplotype) >= 2 then
      do;
         if substr(haplotype,2,1)='0' then
            haplotype2 = substr(target_alleles,3,1) || substr(target_alleles,3,1);
         else if substr(haplotype,2,1)='1' then
            haplotype2 = substr(target_alleles,3,1) || substr(target_alleles,4,1);
         else if substr(haplotype,2,1)='2' then
            haplotype2 = substr(target_alleles,4,1) || substr(target_alleles,4,1);
      end;
   else
      haplotype2 = ' ';

   if length(haplotype) >= 3 then
      do;
         if substr(haplotype,3,1)='0' then
            haplotype3 = substr(target_alleles,5,1) || substr(target_alleles,5,1);
         else if substr(haplotype,3,1)='1' then
            haplotype3 = substr(target_alleles,5,1) || substr(target_alleles,6,1);
         else if substr(haplotype,3,1)='2' then
            haplotype3 = substr(target_alleles,6,1) || substr(target_alleles,6,1);
      end;
   else
      haplotype3 = ' ';

   new_haplotype = haplotype1 || haplotype2 || haplotype3;
run;


