Genetic Science and Digital Action : Revista Pesquisa Fapesp

João Carlos Setubal

During the 1940s the modern digital computer was invented. It was called “digital” because its operation is based on binary notation, in which information is stored and manipulated using only zeros and ones. A computer is a creation of the human mind, an incarnation of mathematics, the most abstract of sciences, and consequently something very distant from the world of biology. What a surprise it was, therefore, when it was discovered in the 1950s that genetic information is also basically digital! (The “biological notation” has four symbols instead of only two.) Future generations will perhaps find extraordinary the small temporal distance that separates the invention of the digital computer from the discovery of the DNA double helix.

For a long time, genetic information seemed inaccessible to us. We knew it was there and we understood its structure, but we did not have efficient methods to read it. This changed during the 1990s, when modern DNA sequencing machines began reading vast quantities of this type of information. During the forty years in which sequencing was slow to take off, computers and computer science made phenomenal progress of their own, as is well known.

As a result, as soon as the sequencing machines began to pour out innumerable chains of A’s, C’s, G’s and T’s, we could draw on a powerful arsenal of computers and computing techniques, mathematics and statistics in order to assemble, analyze and attempt to understand genetic information. This activity was named bioinformatics, and it is one of the most recent and promising branches of modern science. Just like the proverbial snake that swallows its own tail, the human mind projected onto silicon went on to devour the primary substance of its very own origin. The advent of bioinformatics is provoking a “mathematization/informatization” of molecular biology, which is increasingly becoming a quantitative, information-based science.

The analysis of DNA chains is only the beginning. The next step, already in progress, involves an essentially quantitative and computational understanding of the processes that occur in a cell. The cell, despite being microscopic, is an extremely complex system, and understanding what happens in it in a deep and satisfactory manner will still take many decades. Bioinformatics has a long and rich future ahead of it.

If on the one hand the “mathematization” of molecular biology allows us to better understand life, this fundamental phenomenon of our planet, on the other it reinforces our ever-growing dependence on computers and on those who operate them, the computer specialists. This must be emphasized in particular in the case of the clinical applications of molecular biology derived from the genomic revolution. The day has already arrived in which certain results from a “clinical analysis laboratory” do not come (only) from a laboratory experiment with test tubes and chemical reactions, but come (as well) from a computer program. This means that all of the “good” aspects (speed, capacity to process large quantities of data, etc.) and the “bad” aspects (errors of diagnosis caused by defective software, depreciation of human judgment, etc.) that come with the use of computers are reaching these applied areas.

It is up to society to organize itself so that the “bad” aspects are kept firmly under control, in such a manner that the marriage of the double helix and the computer will be at the service of humanity, and not the other way around.

João Carlos Setubal was the bioinformatics coordinator for the Xylella fastidiosa Genome Project, and is an associate professor at the Computer Institute of Unicamp and coordinator of the Bioinformatics Laboratory of the same institute. This article is a summary of the author’s lecture presented at the round table “Fifty Years of the Double Helix of DNA” held at the Livraria Cultura in the Villa-Lobos Shopping Center on the 10th of April.
