A group of researchers from France and Russia investigated the frequent use of meaningless expressions in articles published in computer science journals. Instead of using the well-known term artificial intelligence, for example, some papers have referred to the field as “counterfeit consciousness.” The term big data has at times been replaced by a phrase with a similar meaning that is never actually used in the field: “colossal information.”
In a study published on arXiv last year, computer scientists Guillaume Cabanac of the University of Toulouse, Cyril Labbé of Grenoble Alpes University, and Alexander Magazinov of Russian software company Yandex concluded that these terms, which they dubbed “tortured phrases,” can signal various types of misconduct. The most common is plagiarism. The expressions look strange because text is automatically translated from English into another language and then back into English, altering the phrasing enough to fool plagiarism detection software.
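The mechanism described above can be illustrated with a toy word-by-word substitution. Real cases pass through machine-translation engines rather than a lookup table; the mappings below are simply the examples quoted in this article.

```python
# Toy illustration of how round-trip translation "tortures" phrases.
# A lookup table is a deliberate simplification of machine translation,
# seeded only with the examples quoted in the article.
SYNONYMS = {
    "artificial": "counterfeit",
    "intelligence": "consciousness",
    "big": "colossal",
    "data": "information",
}

def naive_paraphrase(phrase: str) -> str:
    """Replace each word with a rough synonym, word by word."""
    return " ".join(SYNONYMS.get(word, word) for word in phrase.split())

print(naive_paraphrase("big data"))                 # colossal information
print(naive_paraphrase("artificial intelligence"))  # counterfeit consciousness
```

The result preserves a loose meaning while no longer matching the original string, which is exactly what defeats naive plagiarism checkers.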
As well as seeming strange, these tortured phrases make papers difficult to understand, inaccurate, or simply wrong. They can also be an omen of a more serious problem: tortured phrases have been found in entirely fabricated articles generated by AI language models. “Unlike papers where authors seem to have used paraphrasing software, which changes existing text, these AI models can produce text out of whole cloth,” explained Cabanac, Labbé, and Magazinov in an article published on the Bulletin of the Atomic Scientists website in January. They refer specifically to a neural network called GPT-2, developed by the American private research institution OpenAI, which is capable of generating coherent text that appears to have been written by a human. Last year, the group screened 140,000 abstracts using a program, created by OpenAI itself, that can detect text generated by GPT-2. “Hundreds of suspect papers appeared in dozens of reputable journals,” the trio wrote.
Computer programs that generate fake articles are nothing new, but until recently the results were so poor that they could only fool inattentive or negligent readers. In 2005, three students from the Massachusetts Institute of Technology (MIT) created a program called SCIgen that combines sequences of words extracted from genuine scientific papers to create new texts—although they do not make any sense (see Pesquisa FAPESP issue no. 219). That same year, they submitted one of these manuscripts to a world conference on cybernetics and computing held in the USA and managed to get it published—the MIT group’s goal was to show that peer review of conference proceedings is often performed poorly. The tool that started as a joke, however, was later adopted by fraudsters. In 2012, Labbé showed that the MIT software, freely available online, was being used for wrongdoing—he found articles generated by SCIgen in the proceedings of more than 30 conferences. Labbé subsequently developed a program that identifies these texts using keywords, which was adopted by scientific publishers to prevent the problem.
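The kind of generation described above, recombining canned fragments into grammatical but meaningless sentences, can be sketched with a toy template grammar. SCIgen's actual grammar is far larger and is not reproduced here; every phrase below is a made-up placeholder.

```python
import random

# Toy context-free template grammar in the spirit of SCIgen.
# All productions are invented placeholders, not SCIgen's real grammar.
GRAMMAR = {
    "SENTENCE": ["We present NOUN, a METHOD for TASK."],
    "NOUN": ["Flimflam", "Quorble"],
    "METHOD": ["novel heuristic", "scalable framework"],
    "TASK": ["the analysis of checksums", "emulating hash tables"],
}

def expand(token: str) -> str:
    """Recursively expand a grammar symbol, keeping trailing punctuation."""
    core = token.strip(".,")
    suffix = token[len(core):]
    if core in GRAMMAR:
        production = random.choice(GRAMMAR[core])
        return " ".join(expand(t) for t in production.split()) + suffix
    return token

print(expand("SENTENCE"))
```

Every output is syntactically plausible yet semantically empty, which is why such papers can slip past a reviewer who only skims.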
Advances in AI have breathed new life into this type of fraud. Cabanac and his colleagues then created a more powerful system called the Problematic Paper Screener, which identifies articles containing tortured phrases. Volunteers compiled frequently mistranslated expressions in papers from various fields of knowledge to feed the screener’s database. In this database, “irregular esteem” is identified as a supposed equivalent of “random value,” a term commonly used in statistical analysis. Another bizarre example is the term “bosom peril,” which appeared in place of “breast cancer.”
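As a rough sketch of how such a screener can work, the volunteer-compiled database can be modelled as a lookup of known tortured phrases. The phrase list below reuses the article's examples; the matching logic is a simplification of my own, not the Problematic Paper Screener's actual code.

```python
# Sketch of dictionary-based tortured-phrase detection. The mappings
# come from examples quoted in the article; the matching strategy
# (plain substring search) is a simplified assumption.
TORTURED = {
    "counterfeit consciousness": "artificial intelligence",
    "colossal information": "big data",
    "irregular esteem": "random value",
    "bosom peril": "breast cancer",
}

def flag_tortured(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, likely original) pairs found in text."""
    lowered = text.lower()
    return [(bad, good) for bad, good in TORTURED.items() if bad in lowered]

abstract = "We apply counterfeit consciousness to detect bosom peril."
print(flag_tortured(abstract))
# -> [('counterfeit consciousness', 'artificial intelligence'),
#     ('bosom peril', 'breast cancer')]
```

A single hit is only a signal, not proof of misconduct, which is why flagged papers are then examined by hand.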
Not even COVID-19 studies have escaped the problem: in some papers, Severe Acute Respiratory Syndrome (SARS) was rendered as “Extreme Intense Respiratory Syndrome.” One such paper, authored by Egyptian doctor Ahmed Elgazzar of Benha University, was removed from the Research Square preprint platform after evidence emerged of misconduct that went beyond bad translations. The study, which suggested that the dewormer ivermectin was effective against SARS-CoV-2, was considered invalid due to discrepancies between the raw research data and the clinical trial protocols. In addition to the strange translation of SARS, there was evidence that the author had plagiarized from ivermectin press releases, copying that his paraphrasing trickery could not hide.
Journals that publish manuscripts with poor translations may have other problems with their quality control processes. Cabanac also searched the Dimensions database for scientific documents containing the terms he compiled. He detected 860 articles containing at least one tortured phrase, 31 of which were published in the same journal: Elsevier’s Microprocessors & Microsystems. The computer scientist decided to download all the papers published by the journal between 2018 and 2021 and analyze them in depth. He found roughly 500 problematic cases—most of which involved irregularities in the peer-review process. Most of the questionable papers had been published in special issues and had identical submission, revision, and acceptance dates—evidence that they were not well reviewed.
A parallel investigation by Elsevier corroborated the Frenchman’s findings and the publisher subsequently retracted or removed 165 articles. “The integrity and rigor of the peer-review processes were investigated and confirmed to fall beneath the high standards expected by Microprocessors & Microsystems,” the editor of the journal said in a statement. There were also indications that many of the special issues contained “non-original and heavily paraphrased” content.
Elsevier was not the only publisher grappling with the problem. In March, UK-based IOP Publishing announced the retraction of 350 articles published in two journals—The Journal of Physics: Conference Series and IOP Conference Series: Materials Science and Engineering—which disseminate physics, materials science, and engineering conference proceedings. Many contained tortured phrases. These were discovered by Nick Wise, an engineering student at the University of Cambridge who used the screener created by Cabanac’s group to analyze the journals.