Dear Tetrahymena Researchers,

Many people have asked us for suggestions about what they should do
when publishing on a gene whose gene model prediction does not match
the sequence they have deduced themselves, either by cloning a new
cDNA or by comparison with ESTs and genes from related species.  The
revision of the gene models, genome wide, is an ongoing task at
TIGR, and the updated models will replace those now in use when they
are complete.  The gene models at any time are just the current
working hypothesis of the gene structure within the genome.  Any
experimental evidence, and also any results from a detailed analysis
using other information such as cross-species comparisons, should be
published and communicated to TIGR and TGD to enhance the models.
Thanks to refinements in the gene calling parameters and in the
evidence available to validate the resulting models, the next update
will fix many of the issues people have reported.

To guarantee that readers can find the exact sequence for any genes
you publish on, we suggest submitting to GenBank any cDNA sequences
that you have cloned.  You are then able to identify the cDNA sequence
as the source for any analyses performed in your paper.  The cDNA
sequence is very important for the assignment of gene models as this
provides strong evidence for a particular gene model and will be used
in the gene finding process at TIGR.  It is also a good idea to
provide the gene model name as a way to identify the location of the
gene on the chromosome.  Even if you disagree with the model, the
gene model name will allow readers to quickly find the correct region
of the genome.

If you have deduced your sequence from some combination of EST data,
cloned cDNAs, cross-species comparisons, TIGR's gene models, and the
published genome sequence, it can be difficult to track and refer to
the sequence in a paper.  We suggest that you provide details about
the data you used to assemble your sequence.  Some authors list the
ways their sequence differs from the current gene model predictions
in GenBank.  However you decide to document the assembly of your
sequence in the text, it is important to include the exact sequence
of the CDS or protein you use in any analyses, either as a figure in
the body of your paper or in the supplemental materials.  Both cDNAs
and published documentation like this will be used to update the gene
models in the future.
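
As an illustration of listing how a deduced sequence differs from a
gene model, here is a minimal Python sketch.  It is only an
assumption about how one might do this, not a TIGR or TGD tool: the
function name list_differences is hypothetical, and the standard
library's difflib is a general text comparison module rather than a
biological aligner, so it suits only short, closely matching
sequences.  Positions are 1-based and refer to the gene model.

```python
from difflib import SequenceMatcher


def list_differences(model_cds: str, deduced_cds: str) -> list[str]:
    """Describe where a deduced CDS differs from a gene model CDS.

    Hypothetical helper for illustration only. Positions are 1-based
    and refer to the gene model sequence.
    """
    model = model_cds.upper()
    deduced = deduced_cds.upper()
    # autojunk=False stops difflib from ignoring very frequent characters,
    # which matters for a four-letter nucleotide alphabet.
    matcher = SequenceMatcher(None, model, deduced, autojunk=False)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace":
            report.append(f"{i1 + 1}-{i2}: {model[i1:i2]} -> {deduced[j1:j2]}")
        elif tag == "delete":
            report.append(f"{i1 + 1}-{i2}: {model[i1:i2]} absent from deduced sequence")
        elif tag == "insert":
            report.append(f"after {i1}: {deduced[j1:j2]} inserted in deduced sequence")
    return report


if __name__ == "__main__":
    # Made-up toy sequences, not real Tetrahymena data.
    for line in list_differences("ATGAAATTTGGGTAA", "ATGAAACTTGGGTAA"):
        print(line)
```

For full-length genes, or for sequences that differ by whole exons, a
dedicated alignment program would be a better choice than this
sketch.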

TIGR and TGD have tools to help you identify and navigate the gene
models.  At TGD, direct links are provided to researchers' original
GenBank entries when they exist, both from a gene's Locus page and
from the genome browser.  A graphical alignment of the relevant
sequences in GenBank (including ESTs) with the predicted model is
shown in TGD's genome browser (GBrowse), to assist in making
comparisons.  The "Download Decorated FASTA File" option in the
genome browser can be configured to shade differences between the
gene models and ESTs, right in a FASTA file.  If you'd like to know
more about using these tools or have questions about a particular
gene model, email us at [log in to unmask] and we'll
be happy to help.

Information on the gene models can also be found at the TIGR web site.

Sincerely,

Nick Stover, Cindy Krieger, Mike Cherry and Jonathan Eisen