
Genomics

Fertile deserts

Method identifies functional sequences of DNA in the midst of wildernesses devoid of genes

Marcelo de Aguiar Nóbrega, a native of Recife, the capital of the state of Pernambuco, has made California his home, but he keeps his distance from the waves and the beaches. His passion is deserts: gene deserts, that is. After three years of prospecting in the roughly one third of the human genome whose sands were believed to hold not even a shadow of a gene, and which was for that reason unfairly dismissed as junk DNA, the researcher from the Lawrence Berkeley National Laboratory, in the United States, is beginning to reap high-quality results. Not only are the genomic dunes full of oases, in the form of DNA sequences with functions that are still little known, but the Brazilian is also at the forefront of the methods for mapping them.

The findings have already earned the 32-year-old from Pernambuco articles and news coverage in high-impact publications such as Science and Nature. The first came as a short report in Science on October 17 last year, under the title “Scanning human gene deserts for long-range enhancers”. The next manuscript has already been accepted by a leading journal but does not yet have a publication date. For now, the new results have been presented only at scientific meetings, such as the 50th Brazilian Genetics Congress, held in Florianópolis from September 7 to 10. In between, the research of Edward Rubin’s laboratory, where Nóbrega works, was the subject of news items in Science itself, in June, and in Nature Reviews Genetics, in December of last year.

The group’s discoveries are as intriguing as the term “enhancers”. Enhancers are DNA sequences that modulate the expression of genes in the cell, boosting the production of the proteins those genes specify. Until now it was assumed that this kind of regulatory element always lay close to the genes themselves, that is, to their coding modules (the exons, which specify the type and order of the amino acids that will make up the corresponding protein), or at least to their non-coding modules (the introns). The comparative genomics work of Rubin and Nóbrega’s team has shown, however, that these modulators may lie, paradoxically, hundreds of thousands of base pairs away from what they modulate, in the midst of the gene deserts.

“To use a metaphor, imagine that the regulatory elements of a gene are equivalent to a light switch”, Nóbrega explained during the course he gave at the congress in Florianópolis, which packed the Flores room at the Costão do Santinho resort. “You would usually think that the light switch for your bedroom is in the bedroom, wouldn’t you? What we have now shown is that it is possible for this switch to be in the garage of a house on the next block.”

Nóbrega studied medicine at the Federal University of Pernambuco (UFPE), which he entered at 16, having learned to read and write at the age of 4. “My mother put the children into school early to get rid of us”, he says. In a family in which everyone is an engineer, even his twin brother, he was the only one to venture into medicine and research. He began as a trainee in a laboratory at UFPE, spending three years researching physiology (hypertension) with Ana Maria Cabral. Even before graduating, he went to the laboratory of Allen Cowley at the University of Wisconsin, in Milwaukee, in the United States, since he wanted “to be exposed to even more serious research”. To do so, he interrupted his medical course midway for six months, then went back to Recife to complete it. In 1995, at the age of 23, he was accepted for a doctorate at Wisconsin.

With the Human Genome Project nearing completion, he left hypertension behind and switched to molecular biology under Howard Jacob, whom Wisconsin had just lured away from Harvard at great expense. After finishing his doctorate, he looked for a laboratory working on genes and diseases, and knocked on the door of Eddy Rubin. Despite a certain incompatibility in lifestyle (the boss is crazy about surfing and talks to his students with one eye on a Californian online surf report), Nóbrega accepted on the spot, three years ago, a mission that seemed doomed to failure: starting a line of comparative genomics research to scrutinize the deserts.

When he talks of gene deserts, Nóbrega is referring to the 25% to 30% of the human genome made up of intergenic regions of over 500,000 base pairs (500 kb; the kilobase is a standard unit in genomics). These intergenic regions had been described as deserts and lumped together with several kinds of repetitive sequences as junk DNA, so called because it does not take part directly in what used to be considered the function par excellence of DNA: specifying amino acids.
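The definition above can be turned into a simple scan over gene coordinates. What follows is only a minimal sketch in Python, not the method used by Rubin's laboratory: the function name `find_gene_deserts` and the toy coordinates are invented for illustration, and only the 500,000 base pair threshold comes from the text.

```python
# Hypothetical sketch: the gene coordinates below are invented for illustration.
DESERT_MIN_BP = 500_000  # threshold quoted in the article (500 kb)


def find_gene_deserts(genes, min_gap=DESERT_MIN_BP):
    """Return (start, end) intergenic gaps longer than min_gap.

    genes: list of (start, end) positions on one chromosome, in base pairs.
    Overlapping genes are handled by tracking the furthest gene end seen so far.
    """
    deserts = []
    furthest_end = None
    for start, end in sorted(genes):
        if furthest_end is not None and start - furthest_end > min_gap:
            deserts.append((furthest_end, start))
        furthest_end = end if furthest_end is None else max(furthest_end, end)
    return deserts


# Toy chromosome with three genes; only the first gap exceeds 500 kb.
toy_genes = [(100_000, 150_000), (900_000, 1_330_000), (1_400_000, 1_450_000)]
print(find_gene_deserts(toy_genes))
```

In this toy example only the 750,000 base pair gap between the first two genes counts as a desert; the 70,000 base pair gap that follows is too short.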

The value of junk DNA became even more apparent when aligning the genome sequences of humans and mice revealed, within the gene deserts, hundreds of stretches of at least 200 base pairs that are identical, or nearly so. This large-scale search for ultraconserved non-coding sequences was published on May 28 this year in Science by David Haussler, from the University of California at Santa Cruz (UCSC), in the United States, but one of the first such sequences had been discovered by Rubin’s team.

It is now known that there are about 1,250,000 evolutionarily conserved regions at least a hundred base pairs long and with 70% identity between the mouse and human genomes, 80% of them in non-coding regions (non-exons) and 58% in intergenic space (neither exon nor intron). The fact that these DNA sequences remain so well conserved after 70 million years of evolutionary divergence between primates and rodents suggests that they do have some function. The problem was to discover which. This was the mission Nóbrega accepted from Rubin when he moved from Wisconsin to California.
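To make the conservation criterion concrete, here is a minimal sketch of a sliding-window scan over two sequences that have already been aligned. It is an illustration only, not the pipeline behind the figures above: the function name `conserved_windows`, the use of `-` as the gap character, and the reading of 70% as a minimum identity are all assumptions.

```python
def conserved_windows(aligned_a, aligned_b, window=100, min_identity=0.70):
    """Return (start, identity) for each window meeting the identity threshold.

    aligned_a, aligned_b: equal-length strings from a pairwise alignment,
    with '-' marking gaps (an assumed convention for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # 1 where both sequences carry the same base, 0 for mismatches and gaps.
    matches = [int(a == b and a != "-") for a, b in zip(aligned_a, aligned_b)]
    hits = []
    if len(matches) < window:
        return hits

    count = sum(matches[:window])
    for start in range(len(matches) - window + 1):
        if start > 0:
            # Slide the window: add the new position, drop the old one.
            count += matches[start + window - 1] - matches[start - 1]
        identity = count / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


# Toy usage with a 10-base window so the example stays readable.
human = "ACGTACGTAC"
mouse = "ACGTACGTTT"  # 8 of 10 positions match
print(conserved_windows(human, mouse, window=10))
```

A real comparison would merge overlapping windows into regions and work from whole-genome alignments; the sketch only shows how the length and identity thresholds interact.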

One of the specialties of Rubin’s laboratory at Lawrence Berkeley is breeding genetically modified mice, and Nóbrega took advantage of this. There, some 500 mouse eggs are injected with DNA each day, which yields a daily average of 15 to 20 animals carrying a successful genetic modification. The group chose nine desert sequences to test for some unknown function. All of them lie in the arid gene regions flanking a gene important in embryonic development, DACH, which spans 430 kb and has as neighbors deserts of 1,330 kb and 830 kb. The test itself consisted of creating, for each of these sequences, mice carrying one copy of it coupled to a bacterial gene that, when activated, stains the cell that activated it blue.

Examining the fetuses of the transgenic rodents, Nóbrega observed that seven of the nine chosen sequences took part in some way in embryonic development, since several organs and structures turned blue, among them parts of the nervous system, the eyes and the spinal cord, precisely the tissues in which the DACH gene is usually active. According to the researcher from Pernambuco, this work, published in Science in October 2003, taught two lessons: “One, that gene deserts do not contain genes, but they may contain sequences that are critical for the function of neighboring genes; and two, that a gene’s regulatory elements may lie at incredible distances from it, much greater than previously imagined”.

Seven desert areas with a proven function, however, are almost nothing next to the more than 1 million sequences conserved between humans and mice. Starting from the assumption that most of them perhaps had no important function, that is, that they really were some kind of junk, the group went in search of a filter capable of picking out the stretches most likely to perform some function. The next step was taken with the help of a vertebrate relative even more distant from humans: a fish, more precisely a puffer fish prized in Japanese cuisine, Fugu rubripes, from which we diverged 450 million years ago. Its genome has just been published. Instead of over 1 million sequences, we share with the puffer fish only 40,000 conserved regions, 56% of them in exons and only 36% in intergenic spaces.

For Nóbrega, this means that these sequences are probably involved in the most basic features of the body plan of all vertebrates, whether terrestrial, winged or aquatic. If a mutation alters them, the individual carrying it is probably not viable. That variant is thus removed at once from the gene pool, which has allowed these crucial stretches of DNA, even outside genes, to be conserved for almost half a billion years. In short, the group had in hand what was probably an excellent guide for discarding the conserved desert sequences of lesser interest, which the simple human/mouse comparison suggests in profusion. But first it had to be shown that sequences maintained for 70 million years were in fact less essential than those maintained for 450 million.

These were the experiments Nóbrega presented in his course at the Brazilian Genetics Congress a month ago, the results of which should be published shortly in a leading journal. A total of 15 desert sequences conserved between mice and humans, but not in the puffer fish, were tested in mice, and only one of them revealed a function, a far lower proportion than the 7 out of 9 neighbors of the DACH gene. “The other 14 stretches?” asks the Brazilian. “We have no idea what they are and what activity they have, if indeed they have any.”
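The contrast between the two experiments (7 of 9 sequences functional against 1 of 15) can be put in numbers. The sketch below is an illustration added here, not an analysis reported by the researchers: it applies a one-sided Fisher exact test, computed from the hypergeometric distribution with the standard library, and the function name `fisher_one_sided` is invented for the example.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with row and column totals held fixed, of
    seeing `a` or more in the top-left cell purely by chance.
    """
    row1 = a + b          # sequences tested in the first group
    col1 = a + c          # sequences that showed a function overall
    total = a + b + c + d
    k_max = min(row1, col1)
    favourable = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, k_max + 1)
    )
    return favourable / comb(total, row1)


# Counts taken from the article: functional vs. non-functional sequences.
dach_neighbours = (7, 2)      # 7 of 9 showed a function
mouse_only = (1, 14)          # 1 of 15 showed a function

print(f"DACH neighbours: {7 / 9:.0%} functional")
print(f"Human/mouse only: {1 / 15:.0%} functional")
print(f"one-sided p-value: {fisher_one_sided(*dach_neighbours, *mouse_only):.6f}")
```

Under these assumptions the p-value comes out below one in a thousand, so with such small samples the difference is still unlikely to be a fluke, although the test says nothing about why the sequences differ.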

The Lawrence Berkeley group decided to go further, since it could not be ruled out that the lack of activity in the chosen sequences was an artifact of the experimental design, that is, of the kind of function that the method used with the DACH gene was able to reveal. That is why Nóbrega resorted to what he calls the brute force test: simply removing some of these sequences from the mouse genome to see what would happen. Brute force indeed: they ripped out two entire deserts at once, one of 1.5 million base pairs and the other of 800,000, just to see the damage and thereby discover possible functions. Nothing happened.

“We carried out the largest deletions ever made in mice”, said Nóbrega in his course. “Surprisingly, the mice with our deletions were not only viable, they also grew, reproduced and developed no apparent abnormality. That is, it seems that what we tore out really is the equivalent of a less important piece of the genome, or at least a less critical one, than others.” Not even chromosomal instability of any kind was observed, although in one case they had deleted almost 10% of the animal’s chromosome.

It was not for nothing that the work caught the attention of Science even before being published. In the article published on June 11, geneticist Jim Hudson, of Open Biosystems, a company based in Alabama, in the United States, expresses his incredulity: “Knocking out two megabases and not getting an effect is notable”. Arend Sidow, of Stanford University, was more skeptical still: “It cannot be true”. Both told the American journal that the deleted regions probably perform functions that the tests were unable to detect.

Nóbrega does not rule out this hypothesis, but says the result is not as absurd as all that in light of recent discoveries that normal human beings carry spontaneous deletions, as well as enormous additions, of DNA on the order of 100,000 to 2 million bases. “It all begins to fit together and make sense”, he says. The studies were published by Jonathan Sebat and collaborators in Science on July 23, and by John Iafrate and colleagues in last month’s Nature Genetics. Apparently, these alterations are associated with phenotypic variation between individuals; that is, they represent one more, hitherto unsuspected, source of what geneticists call polymorphisms.

“A new race is beginning to see who manages, in the most complete, fastest and cheapest way, to scan the largest possible number of genomes in search of these large-scale variations”, Nóbrega sums up. “It was my good luck to have been thinking about this for some time, so I started running before the others”. The Brazilian researcher says his next goal is to combine comparative genomics with genetic engineering to draw up a map of natural variations in the architecture of the genome, by means of large-scale deletions, and of their impact on the biology and physiology of organisms. “The whole genome is now a target for looking for functional sequences.”

This work is already bearing surprising fruit for our understanding of the genome’s complexity. First came the discovery that there are several oases in the midst of the gene desert. Then, that not everything that looks like an oasis actually is one, which is to say that some deserts are so barren that they can produce mirages when the researcher merely compares human chromosomes with those of mice. The brute force that Nóbrega applied to the rodents has at least allowed him to extract from them a partial answer to the question his boss posed in Science: is the genome a disposable soap opera, from which a hundred pages can be torn out without any problem, or is it more like a work by Ernest Hemingway, in which the whole plot falls apart if a single page is lost? The answer, Brazilian style: “It does seem that the genome is a Globo TV soap opera, and not a Da Vinci Code”.
