I am preparing a proposal to the NSF/USDA Microbial Genome Sequencing Program to support
improvements to genomic resources for Tetrahymena thermophila. The three main goals will be
closure, structural annotation, and functional annotation. These goals will benefit from the
results of other projects now in progress, including micronuclear genome sequencing and
comparative genome sequencing of other Tetrahymena species.
I am writing to request that you contribute a letter of support to this effort. The proposal due
date is March 3. Strong community support is vital to obtaining funding. We need to make it clear
that, although the Tetrahymena genome has been sequenced and partially annotated, there is
much room for improvement in the resources available and that your research and student training
programs would benefit from those improvements. As a model organism, Tetrahymena deserves
a little of the same treatment that other model organisms receive; the yeast and fly genomes, for
example, have been thoroughly reannotated and improved since the time of their initial
sequencing. It is clear that improvements to Tetrahymena annotation will have great value in the
effort to more accurately annotate other ciliate species as well, so letters from anyone in the ciliate
community (and beyond) will be very welcome. Please pass this letter on to colleagues you think
may be interested. Please send letters to:
[log in to unmask]
J. Craig Venter Institute
9704 Medical Center Dr.
Rockville, MD 20850
Below is some more detail on the goals.
1. Closure. Although we have a good assembly of the macronuclear (MAC) genome sequence, there are
still hundreds of gaps and low-quality regions. There are undoubtedly genes or pieces of genes in
these gaps. Also, to relate the micronuclear (MIC) sequence to that of the MAC and locate all sites of
rearrangement, we need a complete and accurate MAC assembly, free of any gaps or MIC
contamination. A completely closed MAC genome will also greatly assist genetic mapping and
provide a scaffold for comparative genomics with other species, including determining synteny
relationships. All indications are that, because of the low repetitiveness of the MAC genome, it
may be possible to close all 180 chromosomes from telomere to telomere. This would be a rare
accomplishment for any eukaryote. We plan to use "next generation" 454 sequencing to achieve
~20X coverage of the genome. This will close most gaps. Any remaining gaps will be closed
using the standard methods that were successful in the first round of closure.
2. Structural Annotation. Many improvements have been made to the initial ab initio gene model
predictions, but most genes have no direct supporting evidence for their overall structure. We will
use 454 technology to sequence normalized cDNA to much higher coverage than has previously been done.
This evidence, and the resulting improved training of gene-finding algorithms, will greatly improve
the gene models and possibly identify sites of alternative splicing.
3. Functional Annotation. As many of you know, the functional information available about most
gene models is very inadequate and has not been kept up to date. We will use a variety of
computational and hand-curation tools to identify homologs, conserved domains, Enzyme Commission (EC) numbers,
Gene Ontology (GO) terms, etc. for genes and gene families. We also plan to work with members of the community
to improve functional annotation. This will include training undergraduates in gene curation,
under the supervision of professors with an interest in ciliate genomics.
Thanks very much for your help in supporting this effort.