A research project more than a decade in the making has made an important contribution to the understanding of sugarcane genetics. An international group coordinated by researchers from Brazil has partially sequenced the genome of the country's most important commercial variety of the plant, cultivar SP80-3280, and has found 373,869 genes. This figure is 14 times higher than the number of genes found in July 2018 by a French group, which studied a variety planted on islands in the Indian Ocean and the Caribbean, and 10 times higher than the number determined by a Chinese team, also in 2018, for Saccharum spontaneum, a wild, undomesticated sugarcane species.
The study, to be published in the scientific journal GigaScience, also identified the potential regulatory regions that control the functioning of genes. “Our work was the most comprehensive because we sequenced the sugarcane genome in its entirety, as opposed to just portions of it, as previous studies have done,” states biochemist Glaucia Souza, from the University of São Paulo Institute of Chemistry (IQ-USP), one of the team leaders and a coordinator of the FAPESP Bioenergy Research Program (BIOEN). The article is expected to serve as a basis for studies that aim to increase biomass production for energy and food through genetic improvement of the plant.
Sugarcane cultivars are technically called polyploid hybrids. Their genetic material comes from more than one species and includes several copies of each of the 10 basic chromosomes. This peculiarity gives their genome about 10 billion base pairs (the chemical units that make up DNA), more than three times the number found in Homo sapiens. “Human beings have two copies of each chromosome, one inherited from the father and one from the mother. Commercial sugarcane usually has 6 to 12 copies of each chromosome,” explains biologist Marie-Anne Van Sluys, from the Institute of Biosciences at the University of São Paulo (IB-USP), also a leader of the group.
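The arithmetic behind these figures can be checked in a few lines. The sketch below is only an illustration: the function names are hypothetical, and the human genome size of roughly 3.1 billion base pairs per copy is an assumption added here, not a figure from the study.

```python
# Figures quoted in the article.
BASIC_CHROMOSOMES = 10        # basic chromosome set of sugarcane
COPIES_RANGE = (6, 12)        # copies of each chromosome in commercial sugarcane
SUGARCANE_GENOME_BP = 10e9    # about 10 billion base pairs

# Assumption (not from the article): approximate size of one copy of the human genome.
HUMAN_GENOME_BP = 3.1e9


def chromosome_count_range(basic, copies_range):
    """Return the (min, max) total chromosome count implied by the copy numbers."""
    low, high = copies_range
    return basic * low, basic * high


def size_ratio(genome_bp, reference_bp):
    """Return how many times larger genome_bp is than reference_bp."""
    return genome_bp / reference_bp


if __name__ == "__main__":
    low, high = chromosome_count_range(BASIC_CHROMOSOMES, COPIES_RANGE)
    print(f"Implied chromosome count: {low} to {high}")
    ratio = size_ratio(SUGARCANE_GENOME_BP, HUMAN_GENOME_BP)
    print(f"Sugarcane genome vs. human genome: {ratio:.1f}x")
```

Under these assumptions, 6 to 12 copies of 10 basic chromosomes imply a total of 60 to 120 chromosomes, and the size ratio comes out slightly above three, consistent with the comparison made in the text.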
The study also determined that only 12.5% of the genome of cultivar SP80-3280 came from the wild sugarcane S. spontaneum, known for its natural robustness, while about 85% came from Saccharum officinarum, a species that humans began to cultivate a few thousand years ago. A small percentage of its DNA results from recombination of the genetic material of the two parent species.
Since 2008, Souza and Van Sluys have worked with the research group, which includes colleagues from the United States, China, and South Korea. Microsoft Research, in Redmond, Washington, also took part in the sequencing work. From a technical point of view, one of the group’s major advances was developing methods for reading and assembling the long strands of DNA into which the gigantic sugarcane genome had to be sliced in order to be sequenced.
The Microsoft team created algorithms and used its computing infrastructure to perform this complex task. “We were able to overcome several obstacles, all related to the manipulation of a large volume of data,” explains Bob Davidson, a genome software specialist at Microsoft, who was part of the study.
Although the study mapped the entire genome of cultivar SP80-3280, only 30% of the sequences obtained, about 3 billion base pairs, were assembled according to the order in which they appear on their chromosomes. This portion of the material is the most important, as it houses the genes of the plant, which provide instructions for the production of its proteins. Assembling a third of the genome may seem like a small feat, given the overall size and complexity of the genetic material, but it is much further than other research groups have gone. In the sequencing carried out by the Center for International Cooperation in Agronomic Research for Development (CIRAD), French scientists worked with just a single copy of an undetermined chromosome. That is why they only found 25,000 genes.
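The proportions in this paragraph follow directly from numbers given in the article. The short sketch below, with hypothetical function names, recomputes them; it uses the rounded figure of 25,000 genes quoted for the French study, so the resulting ratio is only approximate.

```python
# Figures quoted in the article.
GENOME_BP = 10e9            # about 10 billion base pairs
ASSEMBLED_FRACTION = 0.30   # share assembled in chromosomal order
GENES_SP80_3280 = 373_869   # genes found in the Brazilian cultivar
GENES_CIRAD = 25_000        # rounded figure quoted for the French study


def assembled_base_pairs(genome_bp, fraction):
    """Return the number of base pairs covered by the ordered assembly."""
    return genome_bp * fraction


def gene_count_ratio(count_a, count_b):
    """Return how many times larger count_a is than count_b."""
    return count_a / count_b


if __name__ == "__main__":
    ordered = assembled_base_pairs(GENOME_BP, ASSEMBLED_FRACTION)
    print(f"Ordered assembly: {ordered / 1e9:.1f} billion base pairs")
    ratio = gene_count_ratio(GENES_SP80_3280, GENES_CIRAD)
    print(f"Gene count ratio (rounded French figure): {ratio:.2f}")
```

With the rounded French figure the ratio is close to 15, in line with the roughly 14-fold difference cited at the start of the article.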
In addition to having several copies of its chromosomes, the sugarcane genome poses an extra challenge for assembly: each chromosome contains several mobile DNA fragments that are repeated along its length, the so-called transposable elements. “These elements look very similar. That is why we still haven’t been able to get the genes aligned within the chromosomes,” Van Sluys explains. The greatest challenge now is organizing all the sequences identified along each chromosome of the Brazilian cultivar.
Even though it is not the final version of the sugarcane genome, the new sequencing should be useful for studies on improving the varieties planted in Brazil. For example, the researchers found important differences in the regulatory sequences of the sugarcane genes. These differences can cause a plant to adapt differently when exposed to environmental stressors, such as excess salinity, heat, and drought.
As commercial sugarcane cultivars allocate approximately one-third of their carbon to sucrose, it is important to study the metabolism of sugar production and the main agents that regulate it. One of the results presented in the paper focuses precisely on the synthesis of this type of carbohydrate. Cultivar SP80-3280 has particular regulatory elements involved in the production of sucrose that have never been found in its ancestor S. spontaneum. The other two-thirds of the carbon in cultivated sugarcane goes to structures such as the stem and cell walls in general. Because they are rich in lignin, a molecule responsible for rigidity, these hard parts can be burned in boilers as fuel.
In this context, the findings indicate that the genetic sequences of the Brazilian cultivar that regulate carbon partitioning are part of gene networks defined during plant growth and maturation. “It is fundamental to understand all of these processes that involve carbon, sugar, and fibers when considering genetic improvement,” Souza adds.
According to the study authors, the global yield of sugarcane, 84 tons per hectare, represents only about 20% of the plant’s potential, estimated at 381 tons per hectare. This gap is fueling an international race to develop both conventional and biotechnology-based sugarcane improvement strategies. With the more traditional approach, productivity gains have been small, from 1% to 1.5% per year.
This modest advance drives the demand for new technologies. In this context, the genomic work represents a step forward in a race that is playing out in the commercial arena. “In addition to France and China, the United States is also now looking to assemble a complete sugarcane genome, but it has not yet been published,” says Souza.
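To give a sense of scale for the yield figures cited above, the sketch below computes the share of the estimated potential that current yields represent and how long conventional gains would take to close the gap. It is a back-of-the-envelope illustration only: the assumption of constant compound growth and the function names are introduced here and do not come from the study.

```python
import math

# Figures quoted in the article.
CURRENT_YIELD = 84.0        # global average, tons per hectare
POTENTIAL_YIELD = 381.0     # estimated potential, tons per hectare
ANNUAL_GAINS = (0.01, 0.015)  # 1% to 1.5% per year with conventional breeding


def fraction_of_potential(current, potential):
    """Return the current yield as a fraction of the estimated potential."""
    return current / potential


def years_to_reach(current, target, annual_gain):
    """Years needed to grow from current to target.

    Assumption (not from the article): the gain compounds at a constant rate.
    """
    return math.log(target / current) / math.log(1.0 + annual_gain)


if __name__ == "__main__":
    share = fraction_of_potential(CURRENT_YIELD, POTENTIAL_YIELD)
    print(f"Current yield is {share:.0%} of the estimated potential")
    for gain in ANNUAL_GAINS:
        years = years_to_reach(CURRENT_YIELD, POTENTIAL_YIELD, gain)
        print(f"At {gain:.1%} per year: about {years:.0f} years")
```

Under this simplified model the current yield is a little over one-fifth of the potential, and closing the gap through gains of 1% to 1.5% per year would take on the order of a century or more, which helps explain the interest in biotechnology-based approaches.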
1. Sugarcane signaling and regulatory networks (nº 08/52146-0); Grant Mechanism Thematic Project; Bioenergy Research Program (BIOEN); FAPEMIG Agreement; Principal Investigator Glaucia Souza (USP); Investment R$4,318,073.60.
2. Sugarcane genome sequence: Plant transposable elements are active contributors to gene structure variation, regulation and function (nº 08/52074-0); Grant Mechanism Thematic Project; BIOEN Program; Principal Investigator Marie-Anne Van Sluys (USP); Investment R$4,190,155.40.
Souza, G.M. & Van Sluys, M.A. et al. Assembly of the 373K gene space of the polyploid sugarcane genome reveals reservoirs of functional diversity in the world’s leading biomass crop. GigaScience. In press.