Automatic conversion in Excel changed to prevent errors in genetic studies

Microsoft has announced changes to an Excel feature that has become infamous for introducing errors into research data and disrupting the work of geneticists: users will now be able to disable automatic formatting functions easily. The software’s automatic conversion feature was designed to speed up the entry of frequently repeated standardized data, which includes converting text to dates. The problem is that Excel interpreted the abbreviation SEPT1, which is often used for the septin-1 gene, as the first day of September and automatically converted it to 09/01. Another gene symbol, MARCH1, was converted to 03/01.
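To make the failure mode concrete, the short Python sketch below flags gene symbols that a date-aware spreadsheet might mistake for a date. It is only an illustrative heuristic: the month-prefix pattern and the function name `looks_like_date` are assumptions made for this example, and the check does not reproduce Excel’s actual parsing rules.

```python
import re

# Assumed pattern: an English month name or abbreviation followed by a
# one- or two-digit number. This is a heuristic, not Excel's real logic.
DATE_LIKE = re.compile(
    r"^(?:JAN|FEB|MAR|MARCH|APR|APRIL|MAY|JUNE?|JULY?|AUG|SEPT?|OCT|NOV|DEC)"
    r"[-/ ]?(\d{1,2})$",
    re.IGNORECASE,
)


def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol resembles a month followed by a day number."""
    match = DATE_LIKE.match(symbol.strip())
    if match is None:
        return False
    # Only numbers that could be a day of the month are treated as risky.
    return 1 <= int(match.group(1)) <= 31


if __name__ == "__main__":
    for symbol in ["SEPT1", "MARCH1", "SEPTIN1", "MARCHF1", "TP53"]:
        status = "at risk" if looks_like_date(symbol) else "ok"
        print(f"{symbol}: {status}")
```

A check of this kind could be run on a gene list before it is opened in a spreadsheet, so that risky symbols are known in advance.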

The problem became so widespread that in 2020, the Gene Nomenclature Committee of the Human Genome Organization (HUGO) changed the way some genes are written. Septin-1 was changed from SEPT1 to SEPTIN1, while MARCH1 became MARCHF1, in an effort to circumvent automatic conversion in Excel spreadsheets.
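For researchers working with older gene lists, the renaming can be applied programmatically. The sketch below is a minimal illustration: the table contains only the two renamings mentioned above, and the names `RENAMED_SYMBOLS` and `update_symbol` are hypothetical. A real workflow would draw on the committee’s complete list of changes.

```python
# Only the two renamings cited in this article; the full list is longer.
RENAMED_SYMBOLS = {
    "SEPT1": "SEPTIN1",
    "MARCH1": "MARCHF1",
}


def update_symbol(symbol: str) -> str:
    """Return the current symbol for a gene, or the input if it is not listed."""
    key = symbol.strip().upper()
    return RENAMED_SYMBOLS.get(key, symbol)


if __name__ == "__main__":
    for old in ["SEPT1", "MARCH1", "TP53"]:
        print(old, "->", update_symbol(old))
```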

In a paper published in the journal Genome Biology in 2016, Australian researchers analyzed the spreadsheets used as the basis for 3,587 genetics articles published in 18 journals. They identified errors attributable to gene-name conversion in 19.6% of the files. The journals with the highest proportion of articles that used affected spreadsheets were Nucleic Acids Research, Genome Biology, Nature Genetics, Genome Research, Genes and Development, and Nature. The problem could manifest in different ways depending on the language settings of the spreadsheet. In Spanish, the AGO2 gene was converted to August 2. In Dutch, the MEI1 gene could become May 1.

“There are no documented cases in which gene-name errors have affected the conclusions of a study. But if a researcher unknowingly saved a spreadsheet containing such errors, someone importing those data for further analysis would face a reproducibility problem,” explained Mandhri Abeysooriya, a researcher at Deakin University in Geelong, Australia, who participated in the study published in Genome Biology, as quoted by the Retraction Watch website.

In a statement published on Microsoft’s blog, Excel product manager Chirag Fifadra said that the program will now show a warning message when automatic data conversions are enabled and that the options for disabling them will be more visible.