In search of more refined metrics: Revista Pesquisa FAPESP

An article published in February in the journal Scientometrics describes a computer tool for assessing the scientific production of researchers. It is an algorithm that collects information about a set of papers by a specific author and analyzes them, for example, with regard to how closely this production engages with hot topics in the discipline, or its impact, measured through citations in other works, on the production of colleagues with similar interests. The authors of the manuscript, materials engineer Edgar Dutra Zanotto, from the Federal University of São Carlos (UFSCar), and Vinícius Carvalho, an IT student at the same university, developed the algorithm to compile and process data from two indicators provided by SciVal, the analytical platform linked to the Scopus database of the publishing house Elsevier.

One of the indicators is the Field-Weighted Citation Impact (FWCI), which assesses the citations of an article by comparing them with those of other papers with similar keywords and the same age. The index expresses how much more or less the work was cited than the average of similar works. A result of 1 means the article was cited exactly as often as the average; a higher value indicates a higher impact. The advantage of the methodology is that it facilitates the comparison of studies in any field of knowledge, considering their position in relation to others on the same topic. Other indicators of scientific impact do not allow for this type of comparison because each discipline has its own publication practices and communities of different sizes, which influence how intensely its research is cited without this reflecting a difference in impact or visibility.
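The ratio described above can be illustrated with a minimal Python sketch. The function name, its parameters, and the numbers are hypothetical and chosen only for illustration. In practice SciVal supplies the expected citation count for comparable papers, and this is not the code used by the authors.

```python
def fwci(citations: int, expected_citations: float) -> float:
    """Ratio of a paper's citations to the average for comparable papers.

    `expected_citations` stands for the mean citation count of papers of
    the same field, type, and age (an input assumed to be given here).
    """
    if expected_citations <= 0:
        raise ValueError("expected_citations must be positive")
    return citations / expected_citations


# Hypothetical paper: 12 citations where comparable papers average 8.
print(fwci(12, 8.0))  # 1.5, i.e. cited 50% more than the average
```

A value of 0.5 would, by the same reading, mean the paper received half the citations of comparable works.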

The second indicator is the Topic Prominence Percentile (TPP), which shows how closely the topics of an article are attuned to the subjects currently most discussed in their field of knowledge. The prominence percentile is calculated by weighting the number of citations a paper received in the last two years, the impact factor of the journal in which it was published, and the number of views it has had on the internet. Its content is compared with a type of ranking of research topics, also based on citation patterns, suggesting how much it converges with the subjects that most interest journal publishers or attract agency funding. “The problem is that, to calculate the global performance of a researcher, you must take each publication individually, which takes a lot of time. The software we developed collects, within the Scopus database, the FWCI and TPP of a set of papers of an author and seeks to provide a broad snapshot of their production,” explains Zanotto.

A preliminary test of the algorithm, run on the production of a researcher who has published 226 articles since 2000, was completed in only 35 minutes. Some results were surprising. The FWCI of the most recent works was very similar to that of the oldest ones, a sign that the weighting applied by the index corrected distortions and recorded a consistent performance for that scientist. It was also noted that production carried out with international collaboration was associated with a higher FWCI and that its prominence rate was high. The software then assessed the production of 15 senior researchers at level 1A of the Brazilian National Council for Scientific and Technological Development (CNPq), whose careers span 30 to 50 years, and 12 prolific young researchers, selected from different areas, such as chemistry, physics, astronomy, mathematics, biology, and materials. Almost all researchers in the sample had an average FWCI greater than 1. A second analysis was then undertaken, standardizing the articles according to the number of authors of each one. As a result, two senior researchers from the area of mathematics, who signed their papers with only two or three colleagues, continued to have an FWCI greater than 1 and surpassed researchers from other areas, who generally share authorship with many coauthors. “The proposed algorithm and the resulting metrics provide a new tool for scientometrics,” claims the researcher, referring to the discipline that studies the quantitative aspects of science.
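The two analyses described above, comparing older and recent works and then standardizing by the number of authors, can be sketched in Python as follows. The records, field names, cutoff year, and helper functions are all hypothetical. In particular, dividing each paper's FWCI by its author count is only one plausible way to standardize, since the article does not specify the method the authors used.

```python
from statistics import mean

# Hypothetical records for one author: publication year, FWCI, author count.
papers = [
    {"year": 2003, "fwci": 1.0, "n_authors": 2},
    {"year": 2005, "fwci": 2.0, "n_authors": 4},
    {"year": 2018, "fwci": 1.5, "n_authors": 3},
    {"year": 2020, "fwci": 3.0, "n_authors": 2},
]


def mean_fwci(records):
    """Average FWCI over a set of papers."""
    return mean(r["fwci"] for r in records)


def author_normalized_fwci(records):
    """Average of FWCI divided by author count (an assumed normalization)."""
    return mean(r["fwci"] / r["n_authors"] for r in records)


# Split the production at an arbitrary cutoff year to compare periods.
older = [r for r in papers if r["year"] < 2010]
recent = [r for r in papers if r["year"] >= 2010]

print(mean_fwci(older), mean_fwci(recent))
print(author_normalized_fwci(papers))
```

Under this assumed normalization, a researcher who publishes with few coauthors keeps a larger share of each paper's FWCI, which is consistent with the result reported for the two mathematicians.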

The leader of one of the world’s leading research groups in the nucleation and crystallization of glasses, Edgar Zanotto has coordinated since 2013 the Center for Research, Education, and Innovation in Vitreous Materials (CeRTEV), one of the Research, Innovation, and Dissemination Centers (RIDC) funded by FAPESP. His involvement with scientometric studies is sporadic: among the more than 350 articles he has written, only five are on this topic. His interest in the subject dates back to the early 2000s, when he was a member of the support team for the exact sciences and engineering at FAPESP. “At that time, I reviewed resumes of scientists and scholarship students all the time, and a series of metrics to assess the production of researchers began to arise, such as the h-index,” he says, referring to the indicator proposed in 2005 by Argentinian physicist Jorge Hirsch, which combines the number of articles with citation consistency and has become widely used for its ease of calculation. In search of broader parameters, Zanotto published a study in 2006, also in Scientometrics, suggesting a method of classifying researchers based on 11 criteria, such as international awards won, number of articles published in high-impact journals, and level of research funding obtained from funding agencies and companies (see Pesquisa FAPESP issue no. 124). The idea helped to foster an academic debate but has not been adopted on a large scale. Zanotto nevertheless continues to use this classification system today when assessing the resumes of researchers in projects and awards.

From 2017 to 2019, Zanotto chaired the Scientific Board at the Serrapilheira Institute, a private research funding agency based in Rio de Janeiro, during which time his interest in evaluation metrics was renewed. He discovered almost 100 available indicators. For example, the g-index reveals when a researcher's portfolio includes some articles with many citations, a characteristic that the h-index cannot capture. The growth in projects with hundreds of authors also led to the creation of indexes that consider the weight of each author's contribution, avoiding distortions in comparisons with papers that have few signatories. At Serrapilheira, Zanotto studied various metrics to complement the analysis of research projects at the institute. “The analysis of productivity, quality, and visibility of scientists is relevant, but very complex, and still without a clear solution that is adopted universally. Indicators, such as the h-index, have been used by many, but not by all. When they are not well understood, they can cause, and have caused, countless distortions. The FWCI reasonably corrects such distortions, albeit not perfectly,” explains Zanotto.
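The difference between the h-index and the g-index can be made concrete with a short Python sketch using their standard definitions: h is the largest number such that h papers have at least h citations each, and g is the largest number such that the g most cited papers together have at least g squared citations. The function names and citation lists are hypothetical, and g is capped here at the number of papers, which is one common convention.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Two hypothetical authors with five papers each.
steady = [4, 4, 3, 2, 1]
one_hit = [50, 4, 3, 2, 1]  # same profile, plus one highly cited paper

print(h_index(steady), g_index(steady))    # 3 3
print(h_index(one_hit), g_index(one_hit))  # 3 5
```

Both authors have the same h-index, but the g-index rises for the author with one highly cited article, which is the property the text attributes to it.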

The creation of new bibliometric indicators was driven by the development of data science, which allows trends to be extracted from large volumes of information. Companies such as Elsevier and Clarivate, which maintain the databases Scopus and Web of Science, respectively, have begun to sell scientific production analysis services to managers, universities, and funding agencies with increasingly refined metrics, although most are still based on the number of articles and citations. Sergio Salles-Filho, coordinator of the Laboratory for Studies on the Organization of Research & Innovation at the University of Campinas (GEOPI-UNICAMP), who works with the evaluation of academic and technological production, says that the analyses have improved in recent years thanks to these new metrics. “The evaluation results have been more calibrated, establishing appropriate comparisons and avoiding distortions,” he says. In a recent report about the evaluation of three FAPESP programs (regarding support to small businesses, international collaborations, and researcher training), Salles-Filho added to the results data about the prominence rate of the completed research projects, with the caveat that a low result does not mean lack of quality, but may simply indicate that a research topic of great interest in Brazil is not at the top of the agenda for the international community (see Pesquisa FAPESP issue no. 297).

Beyond indicators that standardize the contribution of researchers according to the characteristics of their discipline, so-called alternative metrics, or altmetrics, which measure the exposure of scientific articles in the press and on social media, have also earned a place in evaluation processes (see Pesquisa FAPESP issue no. 250). The database Dimensions, created in 2018, provides an altmetric summary index composed of more than a dozen parameters, including citations on blogs, references in tweets, likes on Facebook, and shares on the academic social network Mendeley. “Dimensions is only able to detect the impact of a project in the press or on social media if it includes its DOI (Digital Object Identifier) registration. For this reason, researchers have begun to include the DOI when sharing their studies on Twitter or Facebook,” explains Salles-Filho.

Despite the usefulness of the new indicators, debate continues about how well these metrics can recognize the quality of a scientific study and whether they have the potential to replace peer review, in which the content, degree of innovation, and originality of an author's production are analyzed by researchers in the same area to measure the value of the contribution. For biochemist Jorge Guimarães, who was head of the Brazilian Federal Agency for Support and Evaluation of Graduate Education (CAPES) between 2004 and 2015, bibliometric indicators offer important information and make it possible to evaluate a high volume of data. “Despite criticism, these metrics are increasingly used by the academic sector because there are few alternatives. Imagine a researcher who has written 200 papers. Will an evaluator read all of them? This is not viable. First, I need to know which articles impact the scientific community and receive more citations. And how many were not cited. If the proportion is high, this is a relevant piece of data,” says Guimarães, who today is head of the Brazilian Agency for Industrial Research and Innovation (EMBRAPII).

In the 2000s, CAPES created QUALIS, a classification system for scientific journals that is used to evaluate the production of postgraduate programs in Brazil. The system is being reviewed and is often criticized for measuring the impact of an article not by the number of citations that it actually received, but by an indirect parameter: the citation index of the journal in which it was published. “It is an unjust criticism because the CAPES evaluation focuses on what was published in the previous four years and, in this brief period of time, the number of citations that each article receives is small and would not serve as a means of evaluation,” says Guimarães.

Mathematician and engineer Renato Pedrosa, coordinator of the indicators special program at FAPESP, recognizes the value of bibliometric indices in evaluation processes, but considers it prudent to use them sparingly. “At the core, all of them are based on the same parameters: articles and their citations. It would be reasonable to use them, for example, when evaluating the production of a university or department, but the analysis of an individual production of a researcher requires greater care and detail,” says Pedrosa, who is a researcher in the Department of Scientific and Technological Policy at UNICAMP. He mentions, as an example, the evaluation process for universities and laboratories in the United Kingdom, carried out every five years, which, instead of focusing on all academic production during the period, asks researchers to select their two or three most significant works so that they can be analyzed in greater depth by reviewers.

The concern is in line with the Leiden Manifesto on research metrics, conceived in the Netherlands in 2015, which warns against the indiscriminate use of indicators in the decision making of universities and funding agencies. “We run the risk of damaging the science system with the actual tools designed to improve it, since the evaluation is increasingly conducted by institutions without the required knowledge of best practice and adequate interpretation of indicators,” states the manifesto. Pedrosa also calls attention to a significant limitation of bibliometric indicators: their inability to evaluate researchers at the beginning of their careers. “Even if they have talent and potential, young researchers have not had time to produce articles and receive citations,” he says. One of the advantages of the FWCI is that it standardizes the number of citations received according to the age of the paper.

Scientific article
ZANOTTO, E. D. et al. Article age- and field-normalized tools to evaluate scientific impact and momentum. Scientometrics. Feb. 25, 2021.
