

Plant Genomics

Theobroma cacao: Deciphering the cocoa genome

A major step forward for understanding the species’ biology and its improvement

An international consortium (ICGS: International Cocoa Genome Sequencing Consortium), coordinated by a team from CIRAD in Montpellier (France), is publishing an article in Nature Genetics on the sequencing and first detailed analyses of the genome of Theobroma cacao, the cocoa tree from which chocolate is made.

Published on 14 September 2018

This research is the result of an international collaboration involving 60 scientists from six countries:

  • France: several teams from CIRAD, CEA/GENOSCOPE, several teams from INRA (Clermont-Ferrand, Toulouse, Evry, CEA/CNG), CNRS, and the Universities of Perpignan and Evry,
  • the USA: the Universities of Penn State and Arizona, as well as the Cold Spring Harbor Laboratory,
  • Trinidad and Tobago: the CRU of the University of the West Indies,
  • Brazil: CEPLAC,
  • Côte d'Ivoire: CNRA,
  • Venezuela: IDEA.

Cirad, C.Lanaud

The sequenced genome comes from a tree of the Criollo cocoa variety collected in Belize. It could be a descendant of the first cocoa trees domesticated by the Maya over 2000 years ago. The chocolate derived from this variety is of the highest quality and is classified as one of the fine flavour chocolates. The first results from these analyses will improve understanding of the genes that may be involved in the aromatic traits of chocolate or in cocoa's disease resistance mechanisms. They also make it possible to retrace the cocoa tree's evolutionary history.

This research will greatly improve our understanding of the cocoa tree and support its genetic improvement. It will benefit small-scale producers in developing countries by making it possible to create new productive, disease-resistant varieties more effectively while maintaining the excellent aromatic qualities of their chocolate.
The detailed results will be published in the online edition of the journal Nature Genetics on 26 December 2010.

Cocoa, the first long-generation tropical fruit tree to be sequenced

Cocoa is a fruit tree that has particular economic importance for the humid tropical countries where it is cultivated. After fermentation, drying and roasting, the beans are used to make chocolate. Cocoa is often cultivated under a forest canopy and in small plantations, which also means that it helps maintain biodiversity and respect the environment.

Cocoa is the first long-generation tropical fruit tree to have been sequenced. Access to its sequence now opens up a wide field of studies, allowing the genes responsible for its natural genetic variation to be characterized more quickly, not only for T. cacao adaptation to environmental conditions and disease resistance, but also for the aromatic qualities of chocolate.

Access to these genes will make it possible to find out how their expression varies within the genetic resources or under the effect of environmental factors. This will facilitate the creation of productive, disease-resistant varieties that yield high-quality cocoa, which will contribute to the development of sustainable cocoa production by reducing pesticide use.

Sequencing Criollo, a fine cocoa variety

A Criollo variety was chosen for sequencing. It was collected in old plantations in Belize and comes from successive generations of self-pollination that occurred naturally during its cultivation. A combination of several sequencing techniques, implemented by Genoscope, Penn State University and Cold Spring Harbor Laboratory, made it possible for Genoscope to produce a high-quality assembled sequence. After several teams from INRA and CIRAD had annotated the genes, 28,798 protein-coding genes were identified, of which 2,053 appeared to be unique to cocoa when compared with several other sequenced plant genomes. Ninety-eight percent of the genes expressed in this plant are found in the assembled sequence.
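To put the gene counts above in proportion, the short Python sketch below computes the share of protein-coding genes that appear to be unique to cocoa. The two counts are taken from the text; the variable and function names are illustrative only.

```python
# Gene counts as reported in the text above.
PROTEIN_CODING_GENES = 28_798
COCOA_SPECIFIC_GENES = 2_053


def percentage(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


cocoa_specific_share = percentage(COCOA_SPECIFIC_GENES, PROTEIN_CODING_GENES)
print(f"Genes apparently unique to cocoa: {cocoa_specific_share}% of the annotated genes")
```

In other words, roughly 7% of the annotated genes have no counterpart in the other plant genomes used for the comparison.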

Towards a better understanding of the qualities of chocolate

The qualities of chocolate are derived from a complex process involving several groups of biochemical compounds. Among these, polyphenols play an important role and are also described as being beneficial for human health and for protecting the cardiovascular system. Cocoa beans have a high content of polyphenols (proanthocyanidins), which can constitute up to 8% of their dry weight, making this species one of the richest sources of these phytonutrients. Ninety-six genes involved in the biosynthesis of these compounds have been identified in the cocoa genome sequence. The sequence analysis revealed that one of the gene families with a key role in the biosynthesis of some proanthocyanidin precursors is over-represented in the cocoa tree compared to other species.

The analysis of genes involved in the biosynthesis of cocoa butter (which constitutes about 50% of the dry weight of beans) and of terpenes (compounds from which numerous aromatic flavours are derived) also revealed the expansion of specific gene families that could play a key role in giving chocolate its well-known technological and aromatic properties. This is the case, for example, for the gene responsible for linalool synthesis, which is present in seven copies in the Criollo cocoa genome. Linalool is one of the major aroma components of other aromatic plants and is present in many essential oils.

Giving cocoa sustainable resistance

Fungal diseases have a major impact on cocoa production and are responsible for almost 30% of harvest losses globally. One of the primary objectives of all cocoa breeding programmes is the search for varieties that combine several sources of resistance of different origins and can therefore give cocoa sustainable resistance to diseases. A detailed analysis of two of the most important families of resistance genes known in the plant world (NBS-LRR and LRR-RLK) was conducted using this first cocoa genome sequence. It revealed 296 and 253 genes belonging to these two families, respectively. All of these genes were located on the genome and compared to the chromosome regions already identified as carrying sources of resistance. "Candidate" genes potentially involved in cocoa's resistance mechanisms against Moniliophthora perniciosa or against black pod rot caused by Phytophthora have now been identified. They will soon be studied in more detail so that their involvement in this resistance can be validated.

Diagnostic tools available to cocoa breeders

With this reference sequence, it is possible to search for genetic markers (portions of the DNA sequence that can vary between cocoa varieties) in each of the genome's regions of interest. These markers can also be used in "marker-assisted selection" to direct and control the accumulation of regions carrying favourable genes in new varieties (in order to combine several sources of resistance genes, for example). Markers located within the genes of interest can become good "diagnostic" markers for screening collections of genetic resources for genes of interest that can be used in breeding programmes.
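The screening idea behind marker-assisted selection can be sketched in a few lines of Python. This is a minimal illustration only, not the consortium's method: the marker names, alleles and plant identifiers are all invented for the example, and real breeding programmes rely on dedicated genotyping pipelines.

```python
# Hypothetical data: marker names, alleles and plant IDs are invented
# purely for illustration and do not come from the cocoa genome project.
FAVOURABLE_ALLELES = {
    "marker_R1": "A",  # assumed to tag a first source of resistance
    "marker_R2": "G",  # assumed to tag a second source of resistance
}


def favourable_markers(genotype, favourable=FAVOURABLE_ALLELES):
    """Return the markers at which a plant carries the favourable allele."""
    return {
        marker
        for marker, allele in favourable.items()
        if allele in genotype.get(marker, ())
    }


def select_candidates(plants, favourable=FAVOURABLE_ALLELES):
    """Keep the plants that carry the favourable allele at every marker."""
    return [
        plant_id
        for plant_id, genotype in plants.items()
        if favourable_markers(genotype, favourable) == set(favourable)
    ]


# Each plant is diploid, so each marker has two alleles.
plants = {
    "plant_1": {"marker_R1": ("A", "A"), "marker_R2": ("G", "T")},
    "plant_2": {"marker_R1": ("A", "C"), "marker_R2": ("T", "T")},
    "plant_3": {"marker_R1": ("C", "C"), "marker_R2": ("G", "G")},
}

selected = select_candidates(plants)
print(selected)  # ['plant_1']
```

Only the plant that combines both hypothetical resistance markers is retained, which mirrors the goal of accumulating several sources of resistance in a single variety.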

A new view of cocoa's evolution and paleohistory

A comparison of the cocoa genome with those of other sequenced plant species, such as grape, poplar, Arabidopsis, soybean and papaya, revealed that the cocoa genome, like that of grape, is very close to that of the ancestral species from which all dicotyledonous plant species are thought to have derived over the course of evolution. A scenario for the evolution of cocoa from this putative ancestor was proposed: it involves 11 major fusions of ancestral chromosomes leading to the 10 basic chromosomes found in cocoa today. Given the constitution of its genome and the ease with which it reproduces, cocoa represents a simple new model for studying evolutionary processes, gene function, genetics and biochemistry in fruit trees.

The huge amount of information generated by this project may radically change the status of this tropical plant and its potential interest for the entire scientific community. This could encourage new investment in research on Theobroma cacao, the "food of the gods", whose magical flavour has spread throughout the world since the Maya and Aztec civilizations, and whose continued study will benefit developing countries for which cocoa is of high economic importance.

Reference: Nature Genetics, 27th December 2010: