
Open science

A laptop in hand, a project in mind

Virtual hackathon explores ways to share data and use technology to improve the quality of research

Liyao Xie / Getty images and Constantine Johnny / Getty images

To what extent is research possible without any funding at all? A group of 30 students and researchers from across Brazil, working in different fields, held a two-week online hackathon in August to experiment with ways of doing science without a budget or committed funding. The “No-Budget Science Hack Week,” as the event was called, explored methods of producing significant results from low-cost science. One solution participants discussed is the use of publicly available open-access data for research: an abundance of material is waiting to be explored in economic and demographic databases and in repositories of primary data from research in different fields. Another solution is using freeware available on the internet to organize and analyze large bodies of information and extract original findings or undetected trends. “When you’re doing science without a budget, all you have is ‘a laptop in hand, and an idea in mind,’” says the hackathon’s organizer, Olavo Amaral, a professor at the Federal University of Rio de Janeiro’s (UFRJ) Institute of Medical Biochemistry.

But despite its name, the hackathon was not intended to promote research models requiring no funding, although this can be useful when money is in short supply. Participants recognize that public and private investment in science remains crucial for the advancement of knowledge and social development. “Our goal was to encourage new research practices and develop technology tools for better science,” explains Amaral, who in recent years has headed efforts to perfect processes in scientific research. Two years ago he created the Brazilian Reproducibility Initiative, a project funded by the Serrapilheira Institute that will repeat up to 100 biomedical experiments published in Brazilian scientific articles to determine whether the results are reproducible (see Pesquisa FAPESP issue no. 267). Being able to confirm the findings of a study allows subsequent research to build on a verifiable foundation of accumulated knowledge, while also building trust in the scientific process and its results.

Many of the practices discussed in the workshop fall within the definition of open science, a movement that encourages researchers to share data and collaborate in networks to solve problems that a researcher or laboratory acting alone would be unable to solve. Another concept explored in the hackathon was metascience, or the use of scientific methodology to increase the efficiency of scientific research while reducing waste.

This was the second edition of the hackathon, and the first held in a virtual format because of the pandemic. The funds that would otherwise have been spent on hospitality were returned to the event sponsors: the University of São Paulo (USP) and Serrapilheira. Around 50 people registered for the event and presented their research interests, of whom 30 were selected by a six-person organizing committee. During the first week, participants discussed proposals and selected the ones they agreed were most promising and had the greatest potential to deliver results within the workshop’s short timeframe. With help from specialists, participants then explored ways to implement the selected proposals. The second week was devoted to implementation.

Each participant agreed to dedicate at least three hours per day to project tasks. One team worked on building a database of science outreach projects in Brazil, and another created a program to scan the bibliographic references of scientific papers for pseudo-open data, meaning links to original sources of information that turn out to be broken, either because of errors or because they have not been updated. Another group compiled documentation on Brazilian graduate programs in physiology from the Brazilian Federal Agency for Support and Evaluation of Graduate Education (CAPES), and evaluated them against the Hong Kong Principles for assessing researchers, a set of recommendations designed to enhance transparency, diversity, and information-sharing. They found, for example, that 91.3% of programs offer master’s and/or doctoral students courses on good practices in research, such as the scientific method, biostatistics, and scientific writing, although in most cases these courses are not mandatory.

Preprint servers—repositories where manuscripts can be published immediately, prior to peer review in scientific journals—were the theme of a project that explored Brazilian authors’ use of this method of publishing research, while another project used word clouds and semantic networks to analyze the content of preprints on COVID-19 between January and August. In most projects, the teams produced only preliminary findings that the researchers will need to develop further—most said they plan to continue their projects to completion. Overall, the workshop fulfilled its purpose of encouraging participants to learn about open-science practices and promote them in their work environment.

Vanderson Martins do Rosario, who is pursuing doctoral research in computer science at the University of Campinas (UNICAMP), learned about the hack-week event through a Twitter post, and registered immediately. “Metascience relies to a great extent on data analytics, and requires the skills of a computer scientist,” says Rosario. He worked with two teams during the workshop. One was the team tasked with analyzing the profiles of Brazilian authors publishing to preprint servers. Although the project proved to be too complex to complete during the two-week hackathon, the group was able to produce important findings, at least in the field of life sciences, by cross-referencing data from the medRxiv and bioRxiv preprint servers with data from Brazil’s Lattes curriculum platform. “We found that the use of these platforms by Brazilian researchers has grown exponentially,” says Rosario. The career stages of Brazilian authors using preprint servers were compared against a control group of scientists not using these platforms. “Younger generations tend to publish preprints much more often than their more senior peers,” he notes.
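Combining records from two sources, as the team did with preprint servers and Lattes, usually comes down to finding a shared key. The article does not say how the team linked the two datasets, so the sketch below simply assumes matching on author names after removing accents, case, and extra spaces. The function names and the shape of the records are hypothetical.

```python
import unicodedata


def normalize_name(name):
    """Strip accents, case, and repeated whitespace so name variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.casefold().split())


def match_authors(preprint_authors, cv_records):
    """Link preprint author names to CV records (assumed: dict of name -> career info).

    Returns a dict mapping each matched preprint author to the CV info found.
    """
    index = {normalize_name(name): info for name, info in cv_records.items()}
    matched = {}
    for author in preprint_authors:
        key = normalize_name(author)
        if key in index:
            matched[author] = index[key]
    return matched
```

Name matching of this kind is fragile (homonyms, abbreviated given names), which is one reason a real analysis would need manual validation or a stable identifier.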

The second project Rosario worked on compiled a catalog of science outreach programs in Brazil. “I realized that a service of this kind could be useful when I was searching for podcasts about science and found that search results mixed science podcasts with programs on astrology,” he says. The challenge was to compile information about blogs, podcasts, and YouTube channels about science, and then create a platform to catalog them. “We had to determine what data needed to be collected so the search engine would operate effectively. This included information on the type of media being used, the field, and the name of the author.” The next step was a manual process of cataloging each project. “Serrapilheira provided records on the most recent science outreach projects for which it had provided grants, and we also used other sources of information. This included Twitter and YouTube lists and a list of blogs prepared by UNICAMP. We also created bots to compile data.” The result was a specially designed search tool for science outreach projects, which the team plans to launch in the near future. “I was happy to have provided the idea for the project, and that it proved helpful in compiling useful information.”
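The three fields Rosario mentions (type of media, field, and author) suggest a simple record structure that a search tool can filter on. The following is a minimal sketch under that assumption. The `OutreachProject` class, the `search` helper, and the exact-match filtering are invented for illustration and do not describe the team's actual platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutreachProject:
    """One catalog entry; field names are illustrative assumptions."""
    name: str
    media_type: str    # e.g. "blog", "podcast", "youtube"
    subject_area: str  # the scientific field the project covers
    author: str


def search(catalog, media_type=None, subject_area=None, author=None):
    """Return entries that match every filter supplied (case-insensitive)."""
    def matches(value, wanted):
        return wanted is None or value.casefold() == wanted.casefold()

    return [
        project for project in catalog
        if matches(project.media_type, media_type)
        and matches(project.subject_area, subject_area)
        and matches(project.author, author)
    ]
```

Recording the media type and subject area as explicit fields is what lets a query for science podcasts exclude unrelated programs, the problem that motivated the project.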

The hackathon was also an opportunity for researchers to network across generations. With a career of nearly three decades as a physical therapy specialist, Ligia de Loiola Cisneros, a professor at the School of Physical Education, Physical Therapy, and Occupational Therapy at the Federal University of Minas Gerais (UFMG), heard about the workshop from one of her students and was eager to attend an online session. “Olavo Amaral delivered a very inspiring presentation that gave me new ideas about methods I can use,” she says. Her interest in open science developed out of a recent experience. Cisneros is currently on a post-doctoral internship in a program on biomedical instrumentation and rehabilitation at São Paulo State University (UNESP), and has used artificial intelligence tools to build a neural network with data on patients with diabetic foot ulcers, a complication that can lead to amputation. Artificial neural networks are machine-learning computer models inspired by biological neural networks. The tool can predict progression and outcomes for patients seeking medical attention. “A wealth of information can be gleaned from patient records at referral hospitals and then used at other healthcare providers to benefit future patients,” she explains.

Economist and epidemiologist Alexandre Chiavegatto Filho, from the School of Public Health at USP, delivered a presentation about machine learning during the workshop series that ran concurrently with the hackathon. After attending his presentation, Cisneros decided to invite him to join her on a new research project. “I discussed a project proposal I planned to submit to FAPEMIG [Minas Gerais State Research Foundation], and he agreed to join us,” she says. During the hackathon, she was a member of the project team that conducted a semantic analysis of preprints about COVID-19. “We were able to show how the content of the articles evolved over time. They were first largely focused on Wuhan, China, then branched out to include Europe and the US, and finally became concerned with identifying the virus and developing diagnostic tests.”
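One simple way to see how the vocabulary of a body of texts shifts over time is to count the most frequent terms per period, which is also the raw input for a word cloud. The sketch below assumes that approach. It is not the team's actual pipeline, and the stopword list, the `(period, text)` record format, and the function name are illustrative choices.

```python
import re
from collections import Counter, defaultdict

# A deliberately tiny stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "of", "and", "in", "a", "to", "for", "on", "with"}


def top_terms_by_period(records, n=3):
    """Count terms per period.

    records: iterable of (period, text) pairs, e.g. ("2020-01", "some title").
    Returns {period: [(term, count), ...]} with periods in sorted order.
    """
    counts = defaultdict(Counter)
    for period, text in records:
        words = re.findall(r"[a-z0-9-]+", text.lower())
        counts[period].update(
            word for word in words if word not in STOPWORDS and len(word) > 2
        )
    return {period: counter.most_common(n) for period, counter in sorted(counts.items())}
```

Comparing the top terms of successive months would surface the kind of drift Cisneros describes, from place names early on to diagnostic vocabulary later.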

Oldimar Cardoso, a researcher in the fields of history and education, proposed a project to conduct a semantic analysis of the 40,000 textbooks purchased by the Brazilian Ministry of Education since 1998. Cardoso, who holds a doctoral degree from the School of Education at USP and is an expert in digital learning platforms, plans to do postdoctoral research on the subject in the near future. “There are relatively few metascience initiatives in the humanities and I saw the hackathon as an opportunity to attract other researchers to my field of interest,” he says. His proposal, however, failed to make the list of initiatives that would be implemented during the hack week. But Cardoso was undiscouraged—he attended sessions that analyzed the documents of post-graduate committees, and was a member of the team that reviewed preprints on COVID-19. Overall, the hackathon proved to be a worthwhile experience. “My knowledge was useful for the project teams, and I will be able to apply the concepts I learned in my semantic analysis of education textbooks.”

Pharmaceutical engineer Gabriel Lovate had never used an open-science approach in previous research. As a master’s student in biochemistry at USP’s Ribeirão Preto School of Medicine (FMRP), under a FAPESP fellowship, he studies genes that create resistance to antibiotics. Researchers at FMRP-USP were mobilizing to collaborate virtually during the pandemic in studies on the novel coronavirus, and he became interested in learning about the practices that would be presented in the workshop. After his suggested project to expand access to computational biology platforms at Brazilian universities failed to make the selection for the hackathon, he joined the project on pseudo-open data. “This creates a major constraint on open science. Internet links to scientific papers are often broken. So while the data appear to be publicly available, they in fact are not,” he explains. His group analyzed a set of biological science journals available in the SciELO open access library. They used data scraping—a technique in which a computer program extracts data from webpages—to select available links. They then used an automated method to check whether the links were working. “Some journals had multiple broken links, but we will still need to validate our methods,” he explains. The team plans to develop a browser extension that, when active, will flag any broken links to scientific articles.
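The two steps Lovate describes, scraping links from a page and then checking automatically whether they work, can be sketched with Python's standard library alone. This is an illustration under stated assumptions, not the group's program: the function names are invented, only absolute `http(s)` links in `<a>` tags are considered, and a link counts as broken when a `HEAD` request fails or returns an error status.

```python
from html.parser import HTMLParser
from http.client import HTTPException
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects absolute http(s) URLs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def extract_links(html):
    """Scraping step: return the links found in an HTML string, in page order."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def is_reachable(url, timeout=10):
    """Checking step: True if the server answers without an error status."""
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, HTTPException):
        # urlopen raises HTTPError/URLError (both OSError subclasses) on failure.
        return False


def find_broken_links(html, check=is_reachable):
    """Return the links on the page for which the check fails."""
    return [url for url in extract_links(html) if not check(url)]
```

Passing the checker in as a parameter lets the scraping logic be tested offline. As Lovate notes, results still need validation, since some servers reject `HEAD` requests or block automated clients even when the link works in a browser.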