{"id":548510,"date":"2025-06-10T14:15:38","date_gmt":"2025-06-10T17:15:38","guid":{"rendered":"https:\/\/revistapesquisa.fapesp.br\/?p=548510"},"modified":"2025-06-10T14:15:38","modified_gmt":"2025-06-10T17:15:38","slug":"experts-analyze-the-risks-and-benefits-of-using-artificial-intelligence-to-measure-researcher-performance","status":"publish","type":"post","link":"https:\/\/revistapesquisa.fapesp.br\/en\/experts-analyze-the-risks-and-benefits-of-using-artificial-intelligence-to-measure-researcher-performance\/","title":{"rendered":"Experts analyze the risks and benefits of using artificial intelligence to measure researcher performance"},"content":{"rendered":"<p>Artificial intelligence (AI) could be used to perform scientific evaluations currently entrusted only to human reviewers, suggests a study published in the <em>Journal of Informetrics <\/em>in November. The objective of the article, written by researchers from the Federal University of Rio Grande do Sul (UFRGS), was to identify criteria and attributes capable of defining whether a researcher should be awarded a Research Productivity (RP) grant from the Brazilian National Council for Scientific and Technological Development (CNPq). Currently distributed to around 15,000 researchers, these grants complement the remuneration that institutions pay researchers for their work and for supervising students.<\/p>\n<p>The team analyzed the CVs of 133,000 researchers, 14,138 of whom were awarded RP grants between 2005 and 2022. The authors used a series of machine learning techniques to examine the r\u00e9sum\u00e9s of grant candidates and were able to predict, with a reasonable degree of accuracy, which researchers would be selected. The method was 80% accurate in one of the grant categories\u2014the RP-2 grant, which is aimed at younger researchers and is awarded based primarily on the number of published articles and student supervisions. 
\u201cFor other grant levels, the tool performed well but with less accuracy, since the decision is based on a more qualitative analysis,\u201d said Denis Borenstein of the UFRGS\u2019s School of Business Administration, one of the authors.<\/p>\n<p>According to Borenstein, it was only possible to create the model because there is a large volume of data on researchers\u2019 work on the Lattes r\u00e9sum\u00e9 website, which were used to train the machine learning algorithms. He believes the tool could be useful at least for screening candidates and making the work of human reviewers easier. \u201cReviewers could then analyze a smaller volume of proposals more calmly and carefully,\u201d says Borenstein, an expert in applied operations research, an interdisciplinary field that uses algorithms and mathematical and statistical methods to aid decision-making.<\/p>\n<p>Olival Freire, scientific director of the CNPq, agrees that AI will be useful in the agency&#8217;s evaluation procedures, but warns that it needs to be adopted gradually and cautiously. \u201cPoorly trained or misused AI systems can have terrible results. You need to carefully curate the algorithms to make sure the analysis is consistent,\u201d he says. Freire notes that the CNPq already uses AI in tasks such as the selection of reviewers for grant and scholarship applications. \u201cThe system runs through the list of 15,000 research productivity grant recipients and identifies a small range of experts on the topic in question. They are then contacted by CNPq technicians,\u201d he says.<\/p>\n<p>This strategy, according to Freire, prevents bias in the reviewer selection process, such as repeatedly inviting those who quickly accept the task or submit their reviews. People applying for RP grants are now also allowed to use generative AI to help write their proposals, as long as they declare that they did so. 
The CNPq director states, however, that the widespread use of AI could conflict with the more qualitative approach that the agency aims to take in its evaluation process. \u201cSome CNPq disciplinary committees are judging requests for productivity grants in two stages. In the first, more quantitative in nature, they analyze data on scientific production, citations, and number of supervisions. This can be supported by algorithms. In the second, candidates are invited to highlight not numbers but their main achievements, which could be influential articles, patents, or works of art, and these are judged by peers. This could not be done properly by artificial intelligence,\u201d he explains.<\/p>\n<p>Artificial intelligence is already being used to organize and analyze large volumes of research data and even to identify or model protein structures that could lead to new drugs. Physicist Osvaldo Novais de Oliveira J\u00fanior, current director of the S\u00e3o Carlos Physics Institute at USP, showed that AI is highly successful at predicting whether a scientific article will receive a large number of citations. Together with colleagues from USP and Indiana University, USA, he uploaded a paper on the subject\u2014yet to be peer-reviewed\u2014to the arXiv repository. In the study, the group used AI to examine the abstracts of 40,000 articles published in the American Chemical Society\u2019s journal <em>ACS Applied Materials &amp; Interfaces<\/em> between 2012 and 2022. The method was able to indicate, with 80% accuracy, which abstracts were among the 20% most cited, based only on the words used and the topics covered, without considering the authors or the institutions with which they were affiliated. Novais, a scholar of computational linguistics, claims that the mastery of human language by computers will enable them to perform all kinds of intellectual activities and that their enormous data processing capacity will soon lead them to surpass human intelligence. 
\u201cIt is likely that in the not-too-distant future, around 2027, we will reach the technological singularity, at which point AI will surpass human capacity,\u201d he says.<\/p>\n<p>Marcio de Castro Silva Filho, scientific director at FAPESP, believes the trend of using AI in the review process is irreversible. \u201cScientific publishers already use the technology to screen and analyze submitted scientific manuscripts. Funding agencies are also moving in this direction, developing tools to support their reviewers,\u201d he says. \u201cAt FAPESP, we are discussing how algorithms could allow us to extract information from proposals and help the people reviewing them.\u201d It is essential, according to Silva, to be transparent about the use of AI when it is incorporated into tools like these, in addition to being clear about what criteria they analyze.<\/p>\n<p>However, it may be a while before algorithms can feasibly be used for more complex tasks in the review process, according to physician Rita Barradas Barata, who was director of reviews at the Brazilian Federal Agency for Support and Evaluation of Graduate Education (CAPES) between 2016 and 2018. 
\u201cIt\u2019s one thing to use algorithms to process large volumes of data, but interpreting the data in a way that captures all the nuances needed to review a proposal is something else entirely.\u201d The researcher, who was responsible for completing the 2017 version of the postgraduate program assessment released every four years by CAPES (<a href=\"https:\/\/revistapesquisa.fapesp.br\/en\/advances-in-graduate-studies\/\" target=\"_blank\" rel=\"noopener\"><em>see <\/em>Pesquisa FAPESP <em>issue n\u00ba 260<\/em><\/a>), says that the evaluation has to consider multiple dimensions of the performance of master&#8217;s and PhD courses, such as the context in which they operate and regional vocations, all of which will need to be analyzed by any algorithm developed to help.<\/p>\n<p>Jacques Marcovitch, who was dean of USP between 1997 and 2001, sees risks in the predictive use of AI. One is that it inhibits the adoption of the principles of \u201cresponsible evaluation,\u201d which aim to introduce qualitative parameters, based on peer review, into the analysis of scientific results. \u201cAlgorithms are capable of reading a large amount of content, but they always look to the past, to data accumulated over time. In an era of disruption, this would limit the identification and recognition of the science that will shape our future,\u201d he says. Marcovitch is head of the M\u00e9tricas Project, an effort that brings together researchers from multiple institutions with the aim of developing comprehensive ways to measure the impact of universities on society.<\/p>\n<p>He emphasizes that this does not mean that AI cannot be useful. The most important thing, he says, is that the results are used by trained people who understand the limitations and know how to interpret them. Justin Axel-Berg, who is also involved in the M\u00e9tricas Project, warns about the lack of transparency regarding the parameters adopted by generative AI algorithms. 
\u201cIt would be a great risk to use these programs to determine who receives public funding for research and scholarships. What would you say to a candidate who was unhappy with the assessment result? That it was the algorithm that said no?\u201d he asks.<\/p>\n<div class=\"box\"><strong>Multidimensional analysis<\/strong><br \/>\n<em>Qualitative criteria adopted for chemical engineering<\/em><\/p>\n<p>In the field of chemical engineering, the CNPq is awarding Research Productivity (RP) grants based on different criteria than those adopted for other disciplines. Instead of being limited to traditional quantitative indicators, such as the number of articles, citations, and students supervised, the focus is on measuring the contribution of the applicant in a broader way, evaluating everything from the academic impact of their scientific work to their role in training others, efforts to establish collaborations, and leadership in scientific and innovative projects. The rules for the next three years were developed by the CNPq\u2019s advisory committee for chemical engineering over the last four years and began to be implemented in October, after being discussed with the community. \u201cThe idea is to consider the qualitative nature of the research, with the aim of discouraging predatory publishing practices that artificially inflate scientific production,\u201d says Claudio Dariva, a researcher at Tiradentes University in Aracaju, Sergipe, and chair of the committee.<\/p>\n<p>The search for new criteria is partly motivated by the fierce competition for RP grants in chemical engineering. Based on the average of the last three evaluations for RP grants (2021, 2022, and 2023), only 35 of every 100 researchers who apply in the area are successful, the lowest level among all engineering programs at the CNPq. 
To give an idea of the asymmetry: the average success rate for applicants across all engineering departments in the period was around 45%, with some disciplines reaching 60%. \u201cIf we use merely quantitative criteria to award grants, many researchers who have made important contributions to society would not be able to compete. Our guiding question has always been: what does it mean to be a productive researcher?\u201d says Maria Alice Zarur Coelho, a researcher from the Federal University of Rio de Janeiro (UFRJ) who led the advisory committee during part of the period in which the new criteria were formulated. \u201cThe objective is to perform a multidimensional analysis of the applicant, which serves as a guide and prioritizes the impact of their research more comprehensively,\u201d says Marisa Beppu, a researcher from the University of Campinas (UNICAMP) who also led the CNPq\u2019s chemical engineering advisory committee.<\/div>\n<p class=\"bibliografia separador-bibliografia\">The story above was published with the title &#8220;<strong>Measured by algorithms<\/strong>&#8221; in issue 346 of December\/2024.<\/p>\n","protected":false},"excerpt":{"rendered":"Study used machine learning to analyze r\u00e9sum\u00e9s and predict who would receive a CNPq 
grant","protected":false},"author":11,"featured_media":537399,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[166],"tags":[219,2413],"coauthors":[98],"class_list":["post-548510","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-policies-st-en","tag-computation","tag-technology"],"acf":[],"_links":{"self":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts\/548510","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/comments?post=548510"}],"version-history":[{"count":1,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts\/548510\/revisions"}],"predecessor-version":[{"id":548511,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/posts\/548510\/revisions\/548511"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/media\/537399"}],"wp:attachment":[{"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/media?parent=548510"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/categories?post=548510"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/tags?post=548510"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/revistapesquisa.fapesp.br\/en\/wp-json\/wp\/v2\/coautho
rs?post=548510"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}