Public Review

The preprints model, commonly used in the exact sciences, is gaining popularity in the biological and social sciences

The Wellcome Trust, a British charity that funds biomedical research, announced on January 17 that it will begin accepting preprints as references in the studies it supports. Preprints are articles that have not yet passed the scrutiny of peer review (the established form of evaluation for scientific journals), instead being made publicly available immediately on electronic repositories in order to expose their results to early critique from the scientific community. “We hope that by citing preprints, researchers will be encouraged to share results and findings more quickly,” explains Robert Kiley, director of digital services at the Wellcome Trust. On March 24, the National Institutes of Health (NIH), the main organization fostering medical research in the United States, followed suit, allowing applicants to reference preprints in projects submitted to the agency. Scientific journals can take months or even years to properly review and publish an article. With a preprint, the manuscript is made immediately available for reading and critique.

The Wellcome Trust and NIH initiatives are part of a movement that is gaining momentum with the growing use of preprints in fields such as the life sciences and social sciences, following the example of the exact sciences. The arXiv repository, currently hosted by Cornell University in the United States, has been used by physicists, mathematicians, and computer scientists for 26 years, and continues to inspire new initiatives. In February 2016, 30 scientific organizations and institutions around the world, including the Chinese Academy of Sciences, the Bill and Melinda Gates Foundation, the NIH, and the Wellcome Trust, joined an agreement and appealed for all data collected during the Zika virus outbreak to be made available quickly and openly. The decision followed the precepts of a September 2015 World Health Organization statement calling for the rapid dissemination of data during health emergencies, shortening the distance between the authorities, the public, and scientific information. The measure proved effective: in March, studies offering strong evidence of the relationship between Zika and microcephaly were published as preprints.

One such study was shared on the PeerJ Preprints repository by a team led by neuroscientist Stevens Rehen, from the Federal University of Rio de Janeiro (UFRJ) and the D’Or Institute of Research and Teaching (IDOR). The group found that Zika invades and kills neural stem cells (see Pesquisa FAPESP, issue nº 242). The preliminary text was highly referenced and received more than 13,000 views, says Rehen, who then submitted the article to Science magazine. “Preprints do not prevent the subsequent publication of an article in a journal, and they allow the author to receive critiques from his peers, which can help in refining the manuscript before it is sent to a respected journal,” explains the researcher. The version published in Science magazine in April 2016 contains modifications requested by the reviewers of the journal: “They suggested that we include data on dengue, to offer a comparison with the Zika virus.”

Science magazine, like other journals, does not publish articles whose results have already been published in other journals, in order to guarantee that its content is original. But it does accept good articles already shared on repositories such as bioRxiv (for biological science papers) and arXiv. However, some journals still do not accept articles that have previously been shared as preprints. This is the case for the Journal of Clinical Investigation, published by the American Society for Clinical Investigation (ASCI).

Although the number of preprints in biology is growing (see graph), the model is facing resistance, especially among researchers from fields such as biochemistry and microbiology, says biologist Jessica Polka, from the University of California, San Francisco. “Biologists are historically reluctant to publish preprints because there is no culture in the field for sharing manuscripts in this way,” she explains. Polka is the director of ASAPbio, an organization founded in 2015 to promote the use of preprints in the life sciences. The initiative recently surveyed researchers on what they think of the model. While 92% of the survey’s 392 participants were aware of what a preprint is, only 31% had shared a paper this way. Despite this, 78% admitted to having used preprints as a source of scientific information.

According to Polka, the main concern about preprints among biologists is that they may be seen as inferior articles that have not undergone peer review. She points out, however, that 65% of the articles on astronomy, astrophysics, nuclear physics, and particle physics published in journals between 1995 and 2011 had already been shared as preliminary versions on arXiv. “Preprints do not aim to compete with conventional publishing in journals. They act as a complement,” she says. The initiative has inspired similar projects in other countries. In Brazil, the Scientific Online Electronic Library (SciELO) announced in February that it will launch its own preprints repository. According to the statement by the directors of the library, the objective is to speed up the publication of articles in the country.

Olavo Amaral, a professor at the UFRJ Institute of Medical Biochemistry, notes that institutions that fund research often delegate the evaluation of scientific quality to journals, due to their trust in peer reviews. “But peer review alone is not enough to ensure the quality of an article,” he says. In the traditional model, he explains, most journals rely on voluntary reviewers, and each study is usually evaluated by three reviewers at most. “With the preprint model, an article can be placed under the critical eye of hundreds or even thousands of peers,” says Amaral. In 2016, he debated the subject at the Brazilian Academy of Sciences (ABC) in Belo Horizonte, alongside Stevens Rehen and Eduardo Fraga, a professor at the UFRJ Institute of Physics.

Amaral says he has published two preprints on bioRxiv: a review of the scientific literature on biomarkers in psychiatry, and another on a model used to study memory in rodents. He believes that the response to the articles on social media was fast and positive, with several tweets citing the papers just a few hours after they were published. However, there was little feedback on the repository in terms of critique and comments. “Researchers need to be encouraged to comment and propose changes to the preprints that they read. This is still not a common habit among biologists,” he says.

In the case of particle physics, the situation is more favorable: preprints have taken center stage in debates on new theories. One of the most recent examples is that of the 750 GeV diphoton, a weak data signal that was found at the Large Hadron Collider (LHC) in December 2015. The news of a possible new particle resulted in many theoretical papers trying to describe and explain the discovery, many of which were shared on arXiv. “Physicists are used to writing speculative or inconclusive articles. There is nothing wrong with that. The best-case scenario is that we discover something new from the exchange of ideas,” explains Mihailo Backovic, a researcher at the Center for Cosmology, Particle Physics, and Phenomenology at the Catholic University of Louvain in Belgium.

There are currently more than 210,000 active researchers registered on arXiv, and the number is growing by around 10% per year. The website receives more than 1.2 million hits and 500 to 1,000 new manuscripts every day. “Of course there are a lot of bad ideas among all this, some of them even plain wrong. That’s why the peer review model remains relevant,” says Backovic. Preprint repositories are often asked whether they check the quality of the articles they publish. Some of them, including arXiv, have invested in software that can identify duplicated text and possible cases of plagiarism. They also require researchers to prove their links to institutions.

How to navigate
But how does one navigate the repositories, given the enormous volume of preprints available? “A common practice among physicists is to check arXiv for new papers each morning,” says George Matsas, a professor at the Institute for Theoretical Physics at São Paulo State University (UNESP). He says that the repository has created tools to help researchers access desired content. “ArXiv is updated once a day with preprint lists organized by subareas. In my case, I always check the categories on general relativity and quantum physics, where 20 to 30 papers are published a day,” explains Matsas.

Scientists sharing information before publishing in journals is not a modern phenomenon. Before the Internet, preprints were mailed to interested researchers and libraries before being formally peer-reviewed. Between 1961 and 1966, the NIH in the United States promoted the establishment of information exchange groups in research institutions and universities, encouraging the organized distribution of preprints. In Brazil, a pioneering system dedicated to publishing preliminary studies was established in 1952 by Notas de Física magazine, published by the Brazilian Center for Research in Physics (CBPF).

Physicist Francisco Antonio Doria, who obtained his doctorate at the CBPF in 1977, says that he published several preprints over the course of his career and considers the model more democratic than the conventional system. “Peer review often becomes an insurmountable barrier if a researcher proposes radical or controversial ideas,” he says. Repositories offer space for controversy, says Doria, but feedback is not always friendly. “I once published a paper about computational complexity on arXiv. My conclusions went against current trends of thought, and I received many aggressive comments—I was even cursed at.”

Eduardo Fraga from UFRJ believes that exchanging information is fundamental to the advancement of science, and that it can occur in different ways, such as through departmental meetings, conferences, or email groups. “What electronic preprint repositories can do is amplify the scope of this practice,” he observes. “Researchers located in peripheral regions, far from large knowledge-producing centers, can consult repositories like arXiv and keep up to date with what is being studied in a particular discipline.”

Another field that has been exploring the use of preprints in more depth is the human sciences. Last year, the Center for Open Science, a non-profit organization based in the United States, launched SocArXiv, intended for researchers in sociology, law, education, and the arts. The institution also hosts other repositories such as PsyArXiv, which is dedicated to psychology research. “Our challenge is to convince social science researchers to participate,” says sociologist Elizabeth Popp Berman, a professor at the University at Albany in the United States and a member of the SocArXiv steering committee. She says that in general, most social science researchers do not feel the need to publish their papers as quickly as their peers in the natural sciences. “On top of this, researchers in some areas of the humanities prefer to publish books, although in sociology the production of articles is becoming more of a priority,” she says.

Like other repositories, SocArXiv is funded by donations, including from the Massachusetts Institute of Technology and the University of California. One advantage of preprints is that the costs of maintaining online repositories are lower than the costs of producing a journal. ArXiv, for example, hosts more than 1.2 million articles at an annual cost of roughly US$827,000, or about US$0.70 per article. Repositories do not charge authors, and the preprints are freely available to the public. By contrast, the publication fees charged by journals run by Dutch publisher Elsevier, for example, can vary from US$500 to US$5,000.

Other human science fields are more accustomed to working with preprints. Economics researchers, for example, have access to a number of established repositories, such as the National Bureau of Economic Research (NBER) repository and the Research Papers in Economics (RePEc) project, which was launched in 1997 and consolidates more than 1,800 files from electronic repositories and libraries in 89 countries. In the social sciences, the pioneering Social Science Research Network (SSRN) has been operating since 1994. Lawyer Douglas Castro, a postdoctoral intern at the Getulio Vargas Foundation in São Paulo (FGV-SP) who works in environmental law, began using the SSRN in 2014. “Publishing in highly-ranked journals does not guarantee that your article will be widely read,” he says. “I started using preprints to give my work greater visibility among researchers in other fields beyond law,” says Castro, who uses the suggestions he receives from repository users to prepare the final version of his articles. A preprint he published last month, for example, attracted the attention of a geologist, who proposed a better definition for the term “water scarcity.”

In May 2016, Elsevier acquired the SSRN. “The goal is to provide greater access to the growing library of user-generated content and to increase engagement with a wider range of researchers,” explains Gemma Hersh, policy and communications director at Elsevier. While recognizing the important role played by preprints, Hersh says that publication in a journal is still the best way to evaluate the merits of a study. “The role of reviewers and editors is crucial,” she emphasizes. “They guarantee that a preprint becomes a trustworthy article, which is perfected after undergoing peer review.”

However, this process may not always appear completely necessary. In a study published in 2016, researchers at the University of California analyzed about 12,000 preprints shared on arXiv between 2003 and 2015, and compared them with the final versions published in scientific journals. The study found that in 80% of the articles, there were no significant differences between the two versions. Stevens Rehen, from UFRJ, believes the role of journals will start to change as the use of preprints grows. He and other enthusiasts of the model have suggested that publishers could act as curators of scientific information. “Instead of determining what should or should not be published, they could search repositories for the preprints they consider most relevant to publish as articles,” he proposes.