The most common insect in the world’s laboratories, the drosophila, or fruit fly – Drosophila melanogaster, commonly seen around ripe bananas – has just been given a fresh reading. In October last year, a team from the School of Medicine at the University of São Paulo (USP), at Ribeirão Preto, completed a piece of research that discovered 91 new genes in the insect and generated sequences that help define its transcriptome – the full set of ribonucleic acids (RNAs) that an organism transcribes from its genes. As a result, the way is paved for more accurate analyses of how genes act and interrelate in the drosophila.
Studied for around 90 years, the insect, which measures at most half a centimeter, has been used in attempts to understand human diseases and malfunctions, and it lies at the foundation of studies of genetics and evolution. It was adopted as a laboratory model because it has large chromosomes, easily visible under the microscope, and because it reproduces relatively rapidly – a generation takes ten to twelve days.
The research by the USP team was inspired by the publication of the first version of the drosophila’s genome sequence – its set of genes – with the results of the reading of about 120 million base pairs of the deoxyribonucleic acid molecule, DNA, the carrier of the genetic code in each cell. Announced in March 2000, the sequencing of the drosophila’s genome was the outcome of a mega-operation involving the Berkeley Drosophila Genome Project (BDGP), at the University of California, and the US company Celera Genomics. The final genome sequence, with 180 million bases, is expected to be released this year.
Nonetheless, in complex organisms, reading the genome sequence alone does not enable correct identification of all the genes. In these organisms, parts of the DNA that make up the gene are removed during the formation of the messenger RNA – the molecule that carries the information to be translated into protein. Hence the need to obtain the sequences present in the RNA. To do this, the Californian group BDGP uses the Expressed Sequence Tags (EST) technique, which reads the ends of the messenger RNA.
The Brazilian team of 15 researchers, however – at the suggestion of the scientific board of FAPESP, which financed the research – employed a technique new to studies of this type: Open Reading Frame Expressed Sequence Tags, known by the acronym Orestes. Created in Brazil and used in the Human Cancer Genome research, also financed by FAPESP in partnership with the Ludwig Institute in São Paulo, the technique was developed by researchers at this institute, who passed it on to USP at Ribeirão Preto. Orestes differs from the classic method of studying the drosophila because it gives priority to reading not the ends but the central parts of the RNA sequences – where the genetic code information that is translated into proteins tends to be concentrated.
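The contrast between reading the ends and reading the middle of a messenger RNA can be pictured with a toy calculation. The sketch below is only an illustration under stated assumptions, not the method used by either group: the transcript length, the position of the coding region, and the read length are invented numbers, and the names `overlap` and `coding_bases` are hypothetical. It counts how many bases of each read fall inside the coding region.

```python
# Toy model of one messenger RNA. All numbers are assumptions chosen
# for illustration; real transcripts and reads vary widely.
TRANSCRIPT_LEN = 3000          # assumed transcript length, in bases
CDS = (400, 2600)              # assumed coding region, half-open interval
READ_LEN = 300                 # assumed length of a single sequence read


def overlap(start_a, end_a, start_b, end_b):
    """Bases shared by half-open intervals [start_a, end_a) and [start_b, end_b)."""
    return max(0, min(end_a, end_b) - max(start_a, start_b))


def coding_bases(read):
    """Number of bases of a read (start, end) that lie inside the coding region."""
    return overlap(read[0], read[1], CDS[0], CDS[1])


# End reads, in the spirit of conventional ESTs.
five_prime_read = (0, READ_LEN)
three_prime_read = (TRANSCRIPT_LEN - READ_LEN, TRANSCRIPT_LEN)

# A read from the middle of the transcript, in the spirit of Orestes.
middle = TRANSCRIPT_LEN // 2
central_read = (middle - READ_LEN // 2, middle + READ_LEN // 2)

for name, read in [
    ("5' end read", five_prime_read),
    ("3' end read", three_prime_read),
    ("central read", central_read),
]:
    print(f"{name}: {coding_bases(read)} of {READ_LEN} bases in the coding region")
```

With these invented numbers, both end reads fall entirely in the untranslated flanks and the central read falls entirely in the coding region. In practice end reads often do reach coding sequence; the figures were picked only to make the difference visible.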
Maria Luísa Paçó-Larson, coordinator of the USP study, points out another difference: the Brazilian method is a valuable way of detecting low-abundance messenger RNA molecules, which are therefore difficult to clone using conventional techniques. The research thus confirmed the validity of the technique: “We observed that the Orestes method can produce new information on expressed sequences and it supplements what is obtained by conventional methods”, says Maria Luísa.
In record time, from April to September, the team obtained and analyzed 10,092 readings of Orestes sequences of the drosophila. “With the help of the researchers at the Ludwig Institute and the Ribeirão Preto Hemocenter, our team was responsible for executing the entire process: extracting the RNA, generating profiles, cloning, producing, and analyzing the sequences”, says the researcher.
The team’s analyses validated 91 new genes. Based on their similarity to the proteins of other organisms, half of these genes were annotated as encoding proteins with various functions – regulatory, enzymatic, and structural, for example.
Another 133 sequences were also identified, deriving from regions that the analysis of the genome sequence had not classified as genes.
“We detected uncharacterized gene fragments as well as sequences for which there is no EST, absolutely new material in terms of the expressed sequencing of the drosophila”, says Maria Luísa. In other words: the way was paved for carrying out more accurate analyses of the role of these new genes in the organism.
There are around 90,000 Drosophila ESTs, according to the BDGP reports. “These ESTs helped identify around 40% of the genes predicted by the genome sequencing analysis released by the BDGP in the report of March 2000”, she says. To identify the remaining 60%, the BDGP began a new project to produce 200,000 ESTs at a rate of 4,000 a month.
In Maria Luísa’s opinion, it is possible to speed this up without impairing quality. “Based on the data obtained in this pilot experiment, we believe that the Orestes method can supplement the data produced by other projects”, she says. Based on the experience acquired in the Human Cancer Genome project, she calculates that about 10,000 Orestes sequences a month can be produced with a capillary sequencer – a sort of latest-model luxury car among laboratory sequencers.
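The difference in pace can be put into numbers with a little arithmetic. The sketch below uses only figures quoted in the article (200,000 ESTs at 4,000 a month, an estimated 10,000 Orestes sequences a month, and 10,092 readings between April and September). Applying the Orestes rate to a 200,000-sequence target is an assumption made here for comparison, and the function name `months_to_target` is hypothetical.

```python
import math


def months_to_target(target_reads, reads_per_month):
    """Whole months needed to reach a target number of reads at a steady rate."""
    return math.ceil(target_reads / reads_per_month)


TARGET = 200_000               # size of the BDGP's new EST project
BDGP_RATE = 4_000              # ESTs per month, as reported
ORESTES_RATE = 10_000          # estimated Orestes sequences per month

print(months_to_target(TARGET, BDGP_RATE))      # 50 months
print(months_to_target(TARGET, ORESTES_RATE))   # 20 months (assumed same target)

# Average pace of the pilot: 10,092 readings over the six months
# from April to September.
PILOT_READS = 10_092
PILOT_MONTHS = 6
print(PILOT_READS // PILOT_MONTHS)              # 1682 readings per month
```

At the reported rates, the BDGP project would take a little over four years, while the estimated capillary-sequencer rate would cover the same number of sequences in under two.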
Regardless of the approach used, the importance of sequencing the drosophila genome is clear: by analogy, it enables a better understanding of the human genome. Of the 280 human genes associated with diseases or deformations, 177 have already been detected in the insect’s chromosomes. They are the so-called orthologous genes – genes whose similarities may be functional and which accordingly enable a more accurate approach to genetic phenomena.
There are other open questions. For a century, students of the fruit fly have sought to understand how an organism with a relatively limited genome – only four pairs of chromosomes – can trigger subtle and varied responses to the environment. In the 1950s, the geneticist Crodowaldo Pavan discovered local and seasonal variations in insects in 35 places in 17 Brazilian regions – there was a marked variation in the concentration of individuals, even in places close to each other. As a general rule, the environment may change little, but the drosophilae vary a great deal – one of the reasons why they became a standard subject of study in evolutionary biology. Reading the genome provides valuable tools for these studies.
In October, the BDGP released the second version of the genome sequence, with new sequences and with 330 of the remaining gaps filled in. According to the most recent report by the Berkeley group, the final version of the genome sequence will be available by mid-year. Maria Luísa believes that, with this material, researchers will be in a position to produce a more accurate list of the predicted genes.
Thus, the spotlight is on the fruit fly’s genetic material, and expectations are rising. “When we have all the RNAs, we will be able to produce filters with all the sequences and examine them in a deformed drosophila to try to understand the process that made it defective”, says Ricardo Gelerman Pinheiro Ramos, one of the biologists on the project undertaken in Ribeirão Preto, by way of example.
The importance of completing the sequencing goes well beyond genetics. The English researcher Jonathan Hodgkin pointed out recently in the journal Nature that the world now has a consistent database of genomic data, “something to take your breath away”, with sufficient data to keep biologists busy for decades: “Nothing like it has happened before in the history of science, nor is it likely to happen again”.
At the beginning of November, when the team from Ribeirão Preto was putting the finishing touches to the article with the results of their study, researchers at Yale University analyzed the possibility of the text being published in an international scientific journal. Maria Luísa says that the Brazilian sequences will be deposited in the biggest database open to the public: the one maintained by the National Center for Biotechnology Information (NCBI) in the United States.
While waiting for replies, Maria Luísa is drafting another project to obtain and analyze around 200,000 expressed sequences of the drosophila genome, equivalent to 20 times the volume of material handled so far. “We don’t know whether one day we will fully understand the genome of an organism as complex as the drosophila. But, given its importance as an experimental model, any progress in this direction is worthwhile”.