{"id":262862,"date":"2018-09-11T15:03:07","date_gmt":"2018-09-11T18:03:07","guid":{"rendered":"http:\/\/revistapesquisa.fapesp.br\/?p=262862"},"modified":"2018-11-14T17:45:56","modified_gmt":"2018-11-14T19:45:56","slug":"a-strategy-for-research-data","status":"publish","type":"post","link":"https:\/\/revistapesquisa.fapesp.br\/en\/a-strategy-for-research-data\/","title":{"rendered":"A strategy for research data"},"content":{"rendered":"<p>Managing and storing large volumes of research data is a challenge faced by scientists in every field. In the last decade, research funding agencies such as the National Science Foundation (NSF) in the US and the Economic and Social Research Council in the UK have increasingly required grant applicants to submit data management plans outlining how research data will be managed, preserved, and made available in public repositories. Their aim is to ensure that information is shared, research data is reusable, and experiments are reproducible, facilitating further scientific discoveries and optimizing returns on funding investment.<\/p>\n<p>Although data management planning is not currently mandatory in Brazil, last October FAPESP took a step in this direction and announced that grants for \u201cthematic projects\u201d\u2014projects lasting five years and characterized by ambitious objectives\u2014will be required to contain a data management plan as a supplement. The requirement will be gradually extended to other grant mechanisms later in the year. \u201cThis is among the first initiatives in Brazil to establish policies and guidelines for managing scientific data,\u201d says Claudia Bauzer Medeiros, a professor at the Computing Department at the University of Campinas (UNICAMP) and head of FAPESP&#8217;s eScience program.<\/p>\n<p>The Foundation&#8217;s <em>Code of Good Practice<\/em>, launched in 2011, already required researchers to submit records from their research. 
\u201cThey will now be required to specify how their data will be managed\u2014from collection to storage\u2014and how and when the data will be made available,\u201d she says. UNICAMP was the first university in Brazil to create a data-management plan template on the <a href=\"http:\/\/dmptool.org\" target=\"_blank\" rel=\"noopener\">DMPTool website<\/a>. The initiative, led by Benilton de S\u00e1 Carvalho of the Institute of Mathematics, Statistics, and Scientific Computing (IMECC), allows researchers from his university to easily create their plans online and make them available worldwide. More than 200 research institutions in different countries have officially adopted DMPTool for creating and sharing data-management plans. Currently only three Brazilian universities are on the platform: UNICAMP, the University of S\u00e3o Paulo (USP) and the Federal University of ABC (UFABC).<\/p>\n<p><a href=\"http:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2018\/09\/036-039_Gestao-de-Dados_267.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"2400\" height=\"1840\" class=\"aligncenter size-full wp-image-263113\" src=\"http:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2018\/09\/036-039_Gestao-de-Dados_267.jpg\" alt=\"\" srcset=\"https:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2018\/09\/036-039_Gestao-de-Dados_267.jpg 2400w, https:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2018\/09\/036-039_Gestao-de-Dados_267-250x192.jpg 250w, https:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2018\/09\/036-039_Gestao-de-Dados_267-700x537.jpg 700w, https:\/\/revistapesquisa.fapesp.br\/wp-content\/uploads\/2018\/09\/036-039_Gestao-de-Dados_267-120x92.jpg 120w\" sizes=\"auto, (max-width: 2400px) 100vw, 2400px\" \/><\/a><\/p>\n<p>Making experiment or field data widely available can lead to collaborations and accelerate scientific breakthroughs by increasing the visibility of research outputs. 
In 2016, an international consortium involving more than 30 organizations, including the Oswaldo Cruz Foundation, the Chinese Academy of Sciences, and the National Institutes of Health (NIH), in the US, encouraged researchers to share the data they collected during the recent Zika virus outbreak. As a result, in a matter of months they were able to publish research showing the link between Zika and microcephaly. In the field of biodiversity, storing research data in public repositories makes millions of records on plant and animal species widely accessible, facilitating further research. The speciesLink network, one of the digital biodiversity databases developed in Brazil, allows researchers to find information about the occurrence and distribution of species of microorganisms, algae, fungi, plants, and animals. The platform has compiled records from 470 collections in Brazil and other countries. These collections contain about 9 million records on 125,000 species, including 2,756 threatened species.<\/p>\n<p>But data management planning is about more than just placing data in an online database. According to the Digital Curation Centre, a UK center of expertise in digital curation, a data management plan should contain information on how and why the data has been created and stored. This means that information needs to be provided on how metadata\u2014or data describing other data\u2014will be organized. \u201cMetadata are descriptions of datasets, detailing how, when, and where they were produced, how they can be reused, and who created them,\u201d explains information scientist M\u00e1rcia Teixeira Cavalcanti, a professor at Universidade Santa \u00darsula, Rio de Janeiro, and a member of the Information, Heritage, and Society research group at the Brazilian Institute of Information in Science and Technology (IBICT). 
\u201cIt\u2019s about identifying and standardizing scientific data so they can be easily accessed in repository searches and reused in other research,\u201d she says.<\/p>\n<blockquote><p>Researchers are being required to specify how their data will be managed, from collection to preservation<\/p><\/blockquote>\n<p>In 2016, Cavalcanti was involved in data curation on the <a href=\"http:\/\/carpedien.ien.gov.br\" target=\"_blank\" rel=\"noopener\">CarpeDIEN platform<\/a> at Brazil\u2019s Nuclear Energy Institute (IEN), which performs research in fields such as radiopharmaceuticals and artificial intelligence. \u201cIt took time to develop the right metadata models for the kind of information we were dealing with,\u201d she says. According to Cavalcanti, the curation process should begin before any data is produced. \u201cIn a data management plan, it may also be important to specify the software or equipment that will be used to generate information such as images or algorithms.\u201d Claudia Medeiros agrees that this type of information can be essential. \u201cOften, having access to the data is not enough to reproduce an experiment. You also need to have the same computer programs or operating system to recreate the same conditions as in the original study,\u201d she says.<\/p>\n<p>During her time at IEN, M\u00e1rcia Cavalcanti conducted a survey on data repositories in Europe, which she published last year in a journal of the Institute for Humanities and Information Sciences at the Federal University of Rio Grande (FURG), in Rio Grande do Sul. The survey covered 33 countries and found that only nine had open-access research repositories in 2016. Her findings show that data-sharing is still incipient in many European countries. Horizon 2020, the largest research funding scheme in the European Union, established in 2007, issued a step-by-step guide on data-management plans in 2016, ahead of their becoming mandatory for all grant applications in 2017. 
One important aspect of the guide is the attention it draws to situations in which sharing raw data can create ethical issues, such as in clinical trials that use personal data and need to protect patient privacy.<\/p>\n<blockquote><p>Publicly funded researchers cannot exempt themselves from sharing information, says C\u00e2mara<\/p><\/blockquote>\n<p>\u201cBarring these exceptions, there are really no arguments to justify publicly funded researchers refusing to share their data,\u201d says Gilberto C\u00e2mara, a researcher at the Brazilian National Institute for Space Research (INPE) and a coordinator of the FAPESP Research Program on Global Climate Change. According to C\u00e2mara, many researchers will hold off archiving experiment data until their research has been published in a journal, arguing that their data could be appropriated by others and published without them receiving credit for it. \u201cThat&#8217;s a poor excuse,\u201d he says. C\u00e2mara explains that information can be safely archived before a paper is published, because all data are assigned an identification code known as a Digital Object Identifier (DOI), which makes them traceable. \u201cThe fact is that, unfortunately, many researchers don\u2019t want others to publish research before they, who collected the data, have published their work,\u201d says C\u00e2mara.<\/p>\n<p>\u201cAll the data from my research are archived in open databases as they are collected,\u201d says the researcher, who publishes data from satellite-image analysis on Pangaea, a platform for georeferenced data. Recently, information he stored in this digital repository was used by researchers from Restore+, an international consortium for land-use research based in Germany. C\u00e2mara welcomes FAPESP&#8217;s initiative to require researchers to develop data-management plans. \u201cThis can help to address bad habits in the scientific community by promoting good practices in data management,\u201d he says. 
\u201cThere are researchers who feel they own the data and will only share it with their colleagues if they get something in return, such as coauthorship of the paper. This, unfortunately, is all too frequent,\u201d he says.<\/p>\n","protected":false},"excerpt":{"rendered":"Researchers are being encouraged to better manage and share the data they produce","protected":false},"author":421,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[166],"tags":[226,228],"coauthors":[740],"class_list":["post-262862","post","type-post","status-publish","format-standard","hentry","category-policies-st-en","tag-education","tag-engineering"],"acf":[],"_links":{"self":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts\/262862","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/users\/421"}],"replies":[{"embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/comments?post=262862"}],"version-history":[{"count":2,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts\/262862\/revisions"}],"predecessor-version":[{"id":264234,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts\/262862\/revisions\/264234"}],"wp:attachment":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/media?parent=262862"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/categories?post=262862"},{"taxonomy":"post_tag","embeddable":tr
ue,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/tags?post=262862"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/coauthors?post=262862"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}