{"id":127306,"date":"2013-08-14T13:05:24","date_gmt":"2013-08-14T16:05:24","guid":{"rendered":"http:\/\/revistapesquisa.fapesp.br\/?p=127306"},"modified":"2013-08-14T15:37:28","modified_gmt":"2013-08-14T18:37:28","slug":"more-bits-in-the-service-of-dna-2","status":"publish","type":"post","link":"https:\/\/revistapesquisa.fapesp.br\/en\/more-bits-in-the-service-of-dna-2\/","title":{"rendered":"More bits in the service of DNA"},"content":{"rendered":"<p><em>Published in February 2013<\/em><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-127321\" alt=\"\" src=\"http:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2013\/08\/Abre_Fapesp_OK1.jpg\" width=\"290\" height=\"194\" \/><span class=\"media-credits-inline\">FABIO OTUBO<\/span>Little more than a decade ago, only a few complete genomes were available for analysis. Today, there are not enough programs or personnel to track the number of DNA sequences that are deposited in public databases and produced every day by the new generation of sequencers. These extremely fast machines identify the base pairs, or chemical letters, of the genetic material at a cost that is thousands of times lower than was possible in the early 2000s, when the epic journey of sequencing the first human genome came to an end. Eyeing that challenge, mathematician Jo\u00e3o Meidanis, a founding partner of the company Scylla Bioinform\u00e1tica and professor at the University of Campinas (Unicamp) in Brazil, invested in a line of research to create simpler, more efficient methods of comparing two or more genomes.<\/p>\n<p>Working with his former doctoral student Pedro Feij\u00e3o in 2009, he formulated the theoretical basis for a method of comparing entire genomes, known as the Single-Cut-or-Join (SCJ) operation, and last year, he tested it in practice on the genomes of organisms, including plants and bacteria. 
\u201cWith our method, we can easily compare two or more genomes without exponentially increasing the number of calculations we make, which is what happens with other techniques,\u201d Meidanis says. \u201cWe can use it to construct genealogical trees and see which genomes are closest or farthest from an evolutionary standpoint.\u201d The mathematician was one of the bioinformatics coordinators for the project that, in 2000, sequenced the genome of the bacterium Xylella fastidiosa, which causes citrus variegated chlorosis in orange trees. The work resulted in the first cover story of the scientific journal Nature devoted to a Brazilian research study.<\/p>\n<p>To compare all of the genetic material of one species to that of another, researchers must resort to simplification. The primary way to do this is to assume that the genes in the compared genomes are exactly the same but appear in a different order in each organism. Using this logic, methods for comparing genomes count the number of rearrangements that would be needed to transform one genome into another. These rearrangements occur when large segments of DNA in the original sequence move over time. The fewer rearrangements separating two genomes, the closer they are to one another on the evolutionary tree.<\/p>\n<p>Using their method, Meidanis and Feij\u00e3o formulated an alternative definition for the concept of the breakpoint, an important parameter for finding rearrangements in a sequence and calculating the proximity of two genomes. A breakpoint is the location at which there is an interruption in a long conserved segment in the genomes being compared.<\/p>\n<p>Last year, the two researchers refined another method of genome comparison that is more elaborate than SCJ. This second technique, initially proposed in 2000, originally compared only circular genomes. With the refinement, it can also be used to compare the genetic material of linear chromosomes. 
\u201cThat was one of the limitations of the original technique,\u201d says Feij\u00e3o, who is now with Scylla. The new method, based on what mathematicians call adjacency algebraic formalism, has not yet been tested on real genomes. For now, it exists only in theory.<\/p>\n<p><strong>Metagenomics<\/strong><br \/>\n<a href=\"http:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2013\/08\/005-009_CAPA_ING.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft size-medium wp-image-127326\" alt=\"\" src=\"http:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2013\/08\/005-009_CAPA_ING-300x163.jpg\" width=\"300\" height=\"163\" srcset=\"https:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2013\/08\/005-009_CAPA_ING-300x163.jpg 300w, https:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2013\/08\/005-009_CAPA_ING-810x441.jpg 810w, https:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2013\/08\/005-009_CAPA_ING-1024x557.jpg 1024w, https:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2013\/08\/005-009_CAPA_ING.jpg 1469w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><span class=\"media-credits-inline\">SOURCE\u2002NHGRI Genome Sequencing Program<\/span><\/a>Meidanis is clearly not the only researcher to feel the effects of the new reality in his field. Having returned to Brazil in mid-2011, after eight years at the Virginia Bioinformatics Institute in the US, Jo\u00e3o Carlos Setubal, who is now a full professor at the Chemistry Institute of the University of S\u00e3o Paulo (USP), notes that the demand for services and research in his field has grown in volume and sophistication in recent years. As an example of this trend, he has received 16 proposals for collaboration on other researchers\u2019 initiatives since returning to S\u00e3o Paulo. 
\u201cThe latest-generation sequencers produce an astronomical amount of data on genomes, proteomes and organism metabolism,\u201d says Setubal, who was a bioinformatics coordinator for the Xylella project. \u201cBecause of declining technology costs, today any research project with minimal resources can sequence the genome of an organism.\u201d<\/p>\n<p>One field that has opened up to biologists and bioinformaticians in the past decade is metagenomics, which studies the microbiota of an ecological niche. Setubal\u2019s principal research, an FAPESP thematic project on microorganisms at the S\u00e3o Paulo Zoo, is focused on this field. In this approach, instead of isolating and cultivating the microorganisms such that the DNA of each species can be extracted separately, he takes a sample directly from the environment to be studied. In such a sample, the DNA of several species comes \u201cmixed,\u201d and it is up to the bioinformaticians to find the techniques for separating and characterizing the genetic material of each species. \u201cWe are studying three microbiomes at the Zoo: compost made by zoo staff, water from the lakes and manure from howler monkeys,\u201d Setubal says.<\/p>\n<p>Metagenomics is also a way to search for unknown organisms in a specific habitat. The team headed by Ana Tereza Ribeiro de Vasconcelos, the coordinator of the bioinformatics center at the National Scientific Computing Laboratory (LNCC) in the city of Petropolis, participated in the discovery of magnetic bacteria found in Araruama Lake on the coast of the state of Rio de Janeiro, one of the world\u2019s most saline lagoons. One of the bacteria found in that study was Candidatus magnetoglobus multicellularis; identified by Ulysses Lins of the Federal University of Rio de Janeiro (UFRJ), the bacteria are difficult to isolate from the environment and keep in a culture medium. 
\u201cWe are currently involved in about ten metagenomics projects,\u201d says Vasconcelos, who has three sequencers in her laboratory and a team of approximately 25 people.<\/p>\n<p>The amount of time and money required for projects devoted to DNA analysis of organisms has changed radically in the past decade. In the early years of the genomics era, only large companies dared to venture into this new field. By the time the international public consortium that sequenced the human genome for the first time officially concluded its work in April 2003, that mega-initiative had required 13 years of work by hundreds of scientists from at least 18 countries, including Brazil, and had an estimated cost of $2.7 billion. On a considerably smaller but still massive scale, the sequencing of the Xylella bacterium cost FAPESP $12 million and involved the contributions of 192 researchers over a three-year period.<\/p>\n<p>Genome sequencing has become cheaper by a factor of 10,000 to 20,000 compared to what it cost a little more than a decade ago, according to data from the National Human Genome Research Institute (NHGRI) in the U.S. The mass influx of second-generation sequencers into the market in early 2008, which used a technology different from that of the early Sanger-type machines, caused the cost of sequencing to plummet at a rate that far outpaced the performance gains resulting from Moore\u2019s Law of computing power. Today, in two or three days and at a cost of just a few thousand dollars, it is possible to identify all of the three billion chemical letters of a person\u2019s DNA. 
\u201cBioinformatics is a new tool, a magnifying glass, that enables us to better understand this biological phenomenon, which has not changed but can now be seen in another way,\u201d says Gon\u00e7alo Pereira of the Institute of Biology (IB) at Unicamp.<\/p>\n<p>However, sequencing is one thing; extracting useful information from the billions of data points that computers pour out on a daily basis into the hands of scientists is another, considerably more complex matter. \u201cGenome sequencing is cheap today and has become a commodity, but data analysis is expensive,\u201d says computer scientist Jo\u00e3o Paulo Kitajima of Mendelics, a new company that provides personalized genome analysis. \u201cThe number of people searching for jobs in bioinformatics has grown exponentially, and there is a supply and demand gap for specialists in Brazil and elsewhere.\u201d<\/p>\n<p>It is difficult to accurately estimate the size of the community of bioinformaticians in Brazil. According to Guilherme Oliveira, president of the Brazilian Association for Bioinformatics and Computational Biology (AB3C), there are approximately 300 people, including professors, students and researchers, who maintain ties with the organization. \u201cBioinformaticians were once self-taught,\u201d says Oliveira, who coordinates the bioinformatics center at the Oswaldo Cruz Foundation (Fiocruz) in the state of Minas Gerais. \u201cToday, many of them have come out of post-graduate programs, and every state has a bioinformatics specialist. What\u2019s new is that now companies are also operating in this field.\u201d Large Brazilian universities, such as USP, UFRJ and the Federal University of Minas Gerais (UFMG), as well as Fiocruz, have specific post-graduate programs in bioinformatics. 
Other universities incorporate it as a line of research in their post-doctoral programs in broader fields, such as biology or computing.<\/p>\n<p>The work of sequencing and analyzing the genome of Schistosoma mansoni, the parasite that causes schistosomiasis, has been the focus of the highest-profile project at the bioinformatics center of the Fiocruz facility in Minas Gerais in recent years. However, the six sequencing machines and 15 specialists in the bioinformatics unit headed by Oliveira have participated in approximately 60 different projects, including metagenomics studies and studies of the genomes of cancers, infectious agents and bovine breeds. The center now also generates and analyzes data for the Research Network for the Molecular Identification of Brazilian Biodiversity (BR-BoL), coordinated by Cl\u00e1udio Oliveira of the Institute of Biosciences at the Universidade Estadual Paulista (Unesp) in Botucatu, which plans to catalogue 120,000 specimens from 24,000 species in nature within four years. BR-BoL is the Brazilian arm of the International Barcode of Life Project, whose goal is to identify species by characterizing their DNA.<\/p>\n<p>Bioinformatics has spread throughout Brazil, even to centers far from its major cities in the Southeast. At the Federal University of Par\u00e1 (UFPA), Artur Silva conducts bioinformatics research in collaboration with groups from S\u00e3o Paulo. Since May of 2012, Sandro de Souza, who for years headed research in this field at the Ludwig Institute for Cancer Research in S\u00e3o Paulo, has been at the Brain Institute at the Federal University of Rio Grande do Norte (UFRN). He does not have a sequencer at his own facility in Natal, the state capital, but he appears unconcerned. \u201cDon\u2019t forget that I can even do sequencing in the cloud if I want,\u201d Souza says. 
\u201cI\u2019m beginning my neuroscience research without any problem.\u201d<\/p>\n<p>However, Souza also has access to all of the machines from the Ludwig Institute, which closed its facility in the city of S\u00e3o Paulo and moved them to the Ribeir\u00e3o Preto School of Medicine at USP (FMRP-USP), where the Center for Medical Genomics opened just last year. \u201cThe techniques used in genomics and bioinformatics will create a revolution in medical practice similar to what happened with image-based medicine,\u201d says Wilson Ara\u00fajo da Silva J\u00fanior, one of the people in charge of the new center at FMRP.<\/p>\n<p>To increase access to the DNA and RNA sequencing and analytical services, Unicamp is opening the Central Laboratory for High-Performance Technologies (LacTAD) on March 1. The laboratory will focus on genomics, proteomics, cell biology and bioinformatics. Its equipment includes two new-generation sequencers made by Illumina that are capable of decoding a complete human DNA sequence in a matter of a few days and a third machine to sequence specific genomic regions. The machines at the center have been in use since last year, when they arrived at the university, but were scattered around in different locations. Next month they will begin operating in the 2,000-m2 facility built for LacTAD.<\/p>\n<p>\u201cWe believe there is an unmet demand for this type of service, and bioinformatics has become a bottleneck for many biological research studies,\u201d says Ronaldo Pilli, a chemist and Provost for Research at Unicamp, who heads the project in the new laboratory. \u201cWe are joining the worldwide trend towards offering this type of service on a centralized basis, which makes it easier to purchase, operate, and update the machines.\u201d LacTAD\u2019s equipment was purchased for about R$5.5 million through FAPESP\u2019s Multiuser Equipment Program. 
The building, budgeted at R$4 million, was financed by the university.<\/p>\n<p>LacTAD will provide services to researchers at Unicamp and to other universities and businesses. Interested researchers can obtain price quotes for using the services at the laboratory website. \u201cThe cost of the work we do will range from R$100 to R$100,000,\u201d Pilli says. Democratization has come to the world of bioinformatics.<\/p>\n<p><em><strong>China has the largest sequencing center<\/strong><\/em><br \/>\nIn less than 15 years, a Chinese bioinformatics center has gone from being a minor partner in the international consortium that mapped the first human genome to a major global power in DNA sequencing. Established in 1999, the Beijing Genomics Institute (BGI) today has 180 sequencing machines, most of which are latest-generation units that can produce six terabytes of data per day, the equivalent of the complete genomes of 2,000 individuals. The center has 4,000 employees and affiliates in the United States, Europe and Japan. This operation, conducted by the Chinese on an enormous scale, has created expectations that the cost of sequencing a human genome will soon fall to $1,000. Their work makes them a major player in state-of-the-art projects that reach far beyond decoding the genetic sequence of the giant panda, the national symbol, three years ago.<\/p>\n<p>In 2010, for example, BGI sequenced the first complete genome of a human ancestor from the DNA of an Eskimo who lived 4,000 years ago. In 2012, it provided the DNA of 100 Chinese for the international effort to study the genome of approximately 1,000 people from different regions around the globe. 
In addition, last year, the center announced plans to sequence three million genomes of humans, plants, animals and microorganisms in the next few years.<\/p>\n<p>The Chinese policy is an aggressive one both scientifically and commercially. Beyond selling its bioinformatics services, BGI is trying to ensure its own access to the most recent advances in the field. Early this year, the center received the go-ahead from the U.S. for the $177 million purchase of Complete Genomics, a California company that developed a new sequencing method. The results obtained using this method have been reported to be more accurate than those obtained with the current methods used worldwide.<\/p>\n<p><strong>Projects<br \/>\n<\/strong><b>1.<\/b> Studies on the microbial diversity of the S\u00e3o Paulo zoo \u2013 No. 11\/50870-6; <b>Grant mechanism <\/b>BIOTA Program \u2013 Thematic project; <b>Coordinators <\/b>Jo\u00e3o Setubal (USP); <b>Investments <\/b>R$1,711,698.25 (FAPESP);<br \/>\n<b>2<\/b>. EMU: Central Laboratory for High-Performance Technologies \u2013 No. 
09\/54129-9; <b>Grant mechanism<\/b> Multi-user Equipment Program; <b>Coordinators <\/b>Fernando Ferreira Costa\u00a0 (Unicamp); <b>Investments <\/b>R$6,034,431.00 (FAPESP).<\/p>\n","protected":false},"excerpt":{"rendered":"More bits in the service of DNA ","protected":false},"author":13,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[156],"tags":[],"coauthors":[101],"class_list":["post-127306","post","type-post","status-publish","format-standard","hentry","category-cover"],"acf":[],"_links":{"self":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts\/127306","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/comments?post=127306"}],"version-history":[{"count":0,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts\/127306\/revisions"}],"wp:attachment":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/media?parent=127306"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/categories?post=127306"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/tags?post=127306"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/coauthors?post=127306"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":t
rue}]}}