Artificial intelligence in peer review : Revista Pesquisa Fapesp

Swiss publisher Frontiers, which publishes more than 90 open access scientific journals, has developed a computer program that uses artificial intelligence (AI) to analyze scientific papers and identify up to 20 different integrity issues, including plagiarism and manipulated images. The software has captured attention, however, for another innovative service it provides: it issues a warning when the authors of a manuscript and the editors or reviewers assessing it have worked together in the past in a way that could present a conflict of interest and compromise the objectivity of the review. Known as the Artificial Intelligence Review Assistant (AIRA), it is being used in the peer review process by Frontiers journals to provide objective information about previous interactions between the people writing papers and those evaluating them. A real person, usually the editor-in-chief of the journal, is ultimately responsible for deciding whether or not these past relationships will have an impact on the review. “We need technological support to identify conflicts of interest between authors and reviewers on a large scale,” said computer scientist Daniel Petrariu, product development director at Frontiers, at the program launch in July.

The Committee on Publication Ethics (COPE), which discusses issues related to integrity in science, defines a conflict of interest as any situation in which researchers or scientific institutions have competing interests, whether professional or economic, that could compromise the impartiality of their decisions. Conflicts of interest, notes COPE, do not necessarily constitute misconduct: a study involving a conflict can still have perfectly valid and reliable results. But when such a situation occurs, the problem needs to be dealt with transparently. It is for this reason that researchers are required to sign a statement declaring any conflicts of interest whenever presenting a project or publishing an article. The person assessing the project or reading the paper has to consider the results in light of these conflicts.

A typical example of misconduct is when a scientist hides that they have received research funding from a private company that has an economic interest in the results. This was the case with oncologist José Baselga, who resigned from his position as clinical director of the Memorial Sloan Kettering Cancer Center in New York after acknowledging that he failed to disclose connections with pharmaceutical and biotechnology companies in dozens of scientific papers (see Pesquisa FAPESP issue no. 272). There have also been cases where conflicts of interest have impacted the objectivity of the peer review process. In 2017, the Scientific World Journal, published by Hindawi, announced the retraction of two articles after learning that the author, Zheng Xu, frequently collaborated with Xiangfeng Luo, editor of the journal and a researcher at Shanghai University. The pair have coauthored dozens of articles, and Luo was Xu’s PhD advisor.

The aim of the AIRA software is to detect potential sources of conflicts of interest that could otherwise go unnoticed. But its scope is limited: it can only identify issues when there is data available to corroborate them. And some conflicts occur in a gray area. If the author and the reviewer receive funding from the same source and fail to declare it, the software will not know.
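
The kind of check described here can be illustrated with a minimal sketch. It is not AIRA's actual implementation, which the article does not describe; it simply assumes that past collaboration is inferred from shared authorship in a list of publication records. The names `Paper`, `shared_papers`, and `flag_conflicts`, and all of the sample data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """A published paper, reduced to its title and set of author names."""
    title: str
    authors: frozenset


def shared_papers(author: str, reviewer: str, papers: list) -> list:
    """Return titles of papers that list both people as coauthors."""
    return [p.title for p in papers if author in p.authors and reviewer in p.authors]


def flag_conflicts(authors: list, reviewers: list, papers: list) -> dict:
    """Map each (author, reviewer) pair with shared papers to those titles.

    Pairs with no recorded collaboration are omitted. The result is only a
    warning for a human editor, who decides whether it matters.
    """
    flags = {}
    for a in authors:
        for r in reviewers:
            titles = shared_papers(a, r, papers)
            if titles:
                flags[(a, r)] = titles
    return flags


# Hypothetical publication records, invented for illustration.
records = [
    Paper("Paper A", frozenset({"Author X", "Reviewer Y"})),
    Paper("Paper B", frozenset({"Author X", "Someone Else"})),
    Paper("Paper C", frozenset({"Reviewer Z"})),
]

warnings = flag_conflicts(["Author X"], ["Reviewer Y", "Reviewer Z"], records)
for (author, reviewer), titles in warnings.items():
    print(f"{author} and {reviewer} coauthored: {', '.join(titles)}")
```

A check of this kind only sees what is in the records it is given. A shared but undeclared funding source, for example, leaves no trace in a coauthorship list, which is the limitation noted above.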

A report on the future of peer review, published in 2017 by the publisher BioMed Central, showed that AI is being used in various ways to evaluate scientific articles. There are publishers using it to track scientific literature online in order to identify researchers with the profile needed to work as reviewers in their disciplines. It is also being used in new software designed to detect plagiarism, identifying when sentences or paragraphs have been rewritten to circumvent algorithms that only recognize identical copies.

The most contentious example is Statcheck, a program created in 2015 by researchers from Tilburg University in the Netherlands (see Pesquisa FAPESP issue no. 253) and now used on a daily basis by hundreds of journals. The software has caused controversy not because of the technology it uses, which is capable of repeating the calculations described in published scientific articles and finding statistical errors, but because of the way it has been used. Fifty thousand psychology articles were analyzed by the software, with the results published on the PubPeer platform, publicly exposing the minor and major mistakes made by thousands of authors. More than 25,000 articles had inconsistencies.
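
The idea of repeating a reported calculation can be shown with a small sketch. This is not Statcheck's code: it assumes results are written in the form "z = 2.10, p = .036", handles only z tests, and uses an arbitrary rounding tolerance. The function names and the sample sentence are invented for illustration.

```python
import math
import re

# Assumed reporting pattern, e.g. "z = 2.10, p = .036"; real papers vary far more.
PATTERN = re.compile(r"\bz\s*=\s*(-?\d*\.?\d+),\s*p\s*=\s*(\d*\.?\d+)")


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def check_reported_results(text: str, tolerance: float = 0.005) -> list:
    """Recompute each reported p-value and mark those that disagree.

    Returns (z, reported_p, recomputed_p, consistent) tuples. The tolerance
    is an illustrative allowance for rounding, not a published threshold.
    """
    results = []
    for z_str, p_str in PATTERN.findall(text):
        z, reported = float(z_str), float(p_str)
        recomputed = two_tailed_p(z)
        consistent = abs(recomputed - reported) <= tolerance
        results.append((z, reported, recomputed, consistent))
    return results


# Invented example sentence containing one consistent and one inconsistent result.
sample = (
    "The first effect was significant, z = 2.10, p = .036. "
    "A second effect, z = 1.20, p = .04, was also reported."
)

for z, reported, recomputed, ok in check_reported_results(sample):
    status = "consistent" if ok else "INCONSISTENT"
    print(f"z = {z}: reported p = {reported}, recomputed p = {recomputed:.3f} ({status})")
```

A flagged result is only a mismatch between two numbers in the text. Whether it reflects a typo, a rounding choice, or a real error still has to be judged by a person.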

According to psychologist Michèle Nuijten, a professor at Tilburg University and one of the creators of Statcheck, tools like AIRA can be valuable in the peer review process. “We need to look for different solutions that can help us in increasing the quality and robustness of scientific studies, and AI could definitely play a role in that,” she said in an article about AIRA published in The New York Times. But Nuijten warns that these tools should only be used to support the work of editors and that it should always be people, not machines, that decide whether a paper should be published or not. “I would be concerned if AI technology led to the rejection of an article without really checking what’s going on.”

Funding agencies also have rules to prevent conflicts of interest when assessing research proposals. For many years, FAPESP has requested that reviewers chosen to analyze proposals declare any related interests. Potential reviewers are ruled out in advance if they have family ties to the applicants, are affiliated with the same institutions, or have already participated in the project. In May, the foundation began using an automatic system, developed by its informatics department, to identify potential reviewers. The system compares parameters, such as project title and keywords, against a database of reviewer profiles and then identifies those best suited to assessing the proposal, in order of preference. The algorithm that classifies the keywords uses AI. As well as being more objective, the system automatically checks whether the reviewers have links to the applicants.
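
The two steps described here, excluding reviewers with links to the applicants and ranking the rest by fit, can be sketched as follows. This is not FAPESP's system: the article says its keyword classification uses AI, whereas this sketch substitutes a plain Jaccard overlap between keyword sets. All class names, fields, and sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Reviewer:
    name: str
    institution: str
    keywords: set
    related_to: set = field(default_factory=set)     # applicants with family ties
    past_projects: set = field(default_factory=set)  # IDs of projects already worked on


@dataclass
class Proposal:
    project_id: str
    applicants: set
    institutions: set
    keywords: set


def has_conflict(reviewer: Reviewer, proposal: Proposal) -> bool:
    """Apply the three exclusion rules: institution, family ties, prior participation."""
    return (
        reviewer.institution in proposal.institutions
        or bool(reviewer.related_to & proposal.applicants)
        or proposal.project_id in reviewer.past_projects
    )


def jaccard(a: set, b: set) -> float:
    """Share of keywords in common; a simple stand-in for an AI classifier."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_reviewers(proposal: Proposal, reviewers: list) -> list:
    """Return (name, score) pairs for eligible reviewers, best match first."""
    eligible = [r for r in reviewers if not has_conflict(r, proposal)]
    scored = [(r.name, jaccard(r.keywords, proposal.keywords)) for r in eligible]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented data for illustration only.
proposal = Proposal(
    project_id="P-001",
    applicants={"Applicant One"},
    institutions={"University B"},
    keywords={"peer review", "machine learning", "research integrity"},
)
pool = [
    Reviewer("Carla", "University C", {"machine learning"}),
    Reviewer("Ana", "University A", {"peer review", "research integrity", "bibliometrics"}),
    Reviewer("Bruno", "University B", {"peer review", "machine learning"}),
    Reviewer("Davi", "University D", set(proposal.keywords), past_projects={"P-001"}),
]

ranking = rank_reviewers(proposal, pool)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Filtering before scoring means a conflicted reviewer never appears in the ranking, however well their keywords match.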
