International Tomato Sequencing Project Overview


The tomato genome comprises approximately 950 Mb of DNA, more than 75% of which is heterochromatin and largely devoid of genes. The majority of genes are found in long contiguous stretches of gene-dense euchromatin located on the distal portions of each chromosome arm. A minimal tiling path of BAC clones will be identified through this approximately 220 Mb of euchromatin.

The starting point for sequencing the genome will be approximately 1500 "seed" BAC clones, each individually anchored to the tomato high-density genetic map, which is based on a single, common L. esculentum x L. pennellii F2 population (referred to as the F2.2000; the map can be viewed on SGN). Sequencing will proceed on a BAC-by-BAC basis. Each sequenced anchor BAC will serve as a seed from which to radiate out into the minimum tiling path. The next BACs to sequence in the euchromatin minimum tiling path will be identified using a BAC end sequence database that will be created as part of this project, together with a fingerprint contig physical map that is currently being constructed. A subset of the sequenced BACs will be localized on pachytene chromosomes via FISH (fluorescence in situ hybridization) to help guide the extension of the tiling path through the euchromatic arms of each chromosome and to determine when the heterochromatin and telomeric regions have been reached on each arm.

A bioinformatics portal will be created for this project. It will be mirrored at several locations around the world and will provide a mechanism by which researchers in different locations can develop and contribute bioinformatics tools and information to the project. A common set of standards will be followed for BAC sequencing and finishing, for gene nomenclature, and for structural and functional gene annotation (please refer to the Solanaceae Project page).
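The idea of a minimum tiling path described above can be illustrated with a small sketch. It assumes, purely for illustration, that every candidate BAC already has known start and end coordinates on a contig; in the project itself, overlaps are to be inferred from BAC end sequences and the fingerprint contig map, not from known coordinates. The names Bac and minimal_tiling_path, and the clone coordinates, are hypothetical. The selection rule is the standard greedy interval cover: at each step, among the clones that overlap the current covered edge, take the one that reaches furthest.

```python
from typing import List, NamedTuple, Optional


class Bac(NamedTuple):
    """Hypothetical BAC record with assumed contig coordinates in kb."""
    name: str
    start: int
    end: int


def minimal_tiling_path(bacs: List[Bac], region_start: int,
                        region_end: int) -> Optional[List[Bac]]:
    """Return the fewest BACs covering [region_start, region_end].

    Returns None if the candidate clones leave a gap in the region.
    """
    ordered = sorted(bacs, key=lambda b: b.start)
    path: List[Bac] = []
    covered = region_start  # right edge of what the path covers so far
    i = 0
    while covered < region_end:
        best = None
        # Scan clones that start at or before the covered edge and keep
        # the one extending furthest to the right.
        while i < len(ordered) and ordered[i].start <= covered:
            if best is None or ordered[i].end > best.end:
                best = ordered[i]
            i += 1
        if best is None or best.end <= covered:
            return None  # no clone extends the path: a gap
        path.append(best)
        covered = best.end
    return path


# Made-up example clones on a 450 kb region.
clones = [
    Bac("A", 0, 120), Bac("B", 50, 200), Bac("C", 100, 230),
    Bac("D", 180, 330), Bac("E", 220, 340), Bac("F", 300, 450),
]
path = minimal_tiling_path(clones, 0, 450)
print([b.name for b in path])
```

In this made-up example, clones B and D are skipped because C and E overlap the same edge and extend further, so four clones cover the region instead of six.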

The objectives of the tomato sequencing project are to:
  1. produce a contiguous sequence of the gene-rich, euchromatic arms of each of the 12 tomato chromosomes;
  2. process and annotate this sequence in a manner consistent and compatible with similar data from Arabidopsis, rice and other plant species;
  3. create an international bioinformatics portal for comparative Solanaceae genomics that can store, process, and make available to the public the sequence data and derived information from this project and from associated genomics activities in other solanaceous plants.
Participants and Funding 
| Chr. | Country | People | Grant Agency | Target Deadline | Est. Euchromatin Size (Mb) | Est. # BACs |
|---|---|---|---|---|---|---|
| 1 | USA | J. Giovannoni; B. Roe (University of Oklahoma); J. Van Eck (Boyce Thompson Institute); L. Mueller (Boyce Thompson Institute); S. Stack (Colorado State U.) | National Science Foundation | Jan. 2009 | 24 | 246 |
| 2 | Korea | D. Choi; B.D. Kim (Natl. U.) | BioGreen21 Project / RDA; Frontier 21 Project / CFCG; Ministry of Science and Technology (MOST) | Feb. 2004, July 2004 | 26 | 268 |
| 3 | China | C. Li (Chinese Acad. Sci.); Y. Xue (Chinese Acad. Sci.); Z. Cheng (Chinese Acad. Sci.); M. Chen (Chinese Acad. Sci.); H. Ling (Chinese Acad. Sci.) | Chinese Academy of Science; Natural Science Foundation | Mar. 2004 | 26 | 274 |
| 4 | UK | G. Bishop (Imperial College); G. Seymour (Nottingham University); G. Bryan | | | | |
| 5 | India | R.P. Sharma (U. Hyderabad); J. Khurana (U. Hyderabad); A. Tyagi (U. Hyderabad); N.K. Singh (National Research Centre on Plant Biotech., IARI) | DBT, Govt. of India | - | 11 | 111 |
| 6 | The Netherlands | W. Stiekema (Centre for Biosystems Genomics); P. Lindhout (Wageningen U.); T. Jesse; R. Klein Lankhorst (Wageningen U.) | Funded | in progress | 20 | 213 |
| 7 | France | M. Bouzayen | National Agency for Genome Sequencing | Mar. 2004 | 27 | 277 |
| 8 | Japan | D. Shibata (Kazusa Inst.); S. Tabata (Kazusa Inst.) | Chiba Prefecture | Sep. 2004 | 17 | 175 |
| 9 | Spain | A. Granell (Inst. de Biologia Molecular y Cellular de Plantas Valencia); M. Botella (U. Malaga) | Submitted to Genoma Espana | Pending | 16 | 164 |
| 10 | USA | (see above) | | | | |
| 11 | China | S. Huang (Chinese Academy of Sciences) | | | | |
| 12 | Italy | G. Giuliano; L. Fruciante (U. Naples) | Italian Ministry of Agriculture and Italian Ministry of Research | May 2004 | 11 | 113 |
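
As a quick arithmetic check on the per-chromosome estimates listed above, the following sketch sums the euchromatin sizes and BAC counts for the nine chromosomes that have figures. Chromosomes 4, 10 and 11 list no estimates, so these are partial totals and not the project-wide numbers. The variable names are illustrative only.

```python
# Per-chromosome estimates copied from the participants table:
# chromosome -> (est. euchromatin size in Mb, est. number of BACs).
# Chromosomes 4, 10 and 11 have no estimates listed and are omitted.
estimates = {
    1: (24, 246),
    2: (26, 268),
    3: (26, 274),
    5: (11, 111),
    6: (20, 213),
    7: (27, 277),
    8: (17, 175),
    9: (16, 164),
    12: (11, 113),
}

total_mb = sum(mb for mb, _ in estimates.values())
total_bacs = sum(bacs for _, bacs in estimates.values())
bacs_per_mb = total_bacs / total_mb

print(f"Chromosomes with estimates: {len(estimates)}")
print(f"Partial euchromatin total: {total_mb} Mb")
print(f"Partial BAC total: {total_bacs}")
print(f"Average BACs per Mb: {bacs_per_mb:.1f}")
```

The listed estimates cover 178 Mb of the approximately 220 Mb of euchromatin and work out to roughly 10 BACs per Mb.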