Algorithms are everywhere. When share prices rise and fall, they are usually involved. According to data released in 2016 by the Institute for Applied Economic Research (IPEA), investment robots programmed to instantly react to certain situations account for more than 40% of stock market transactions in Brazil. In the United States, the figure is 70%. The success of a simple Google search depends on these computer programming procedures, which are able to filter billions of web pages in mere seconds—the importance of a website, defined by an algorithm, is based on the quantity and quality of other pages that link to it. At the frontier of automotive engineering research, sets of algorithms used by autonomous cars process information captured by cameras and sensors, instantly making decisions at the wheel without human intervention.
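The idea that a page's importance depends on the quantity and quality of the pages linking to it can be sketched in a few lines. What follows is a simplified PageRank-style iteration, not the search engine's production system: the four-page web, the page names, and the number of iterations are invented for illustration, and the damping value of 0.85 is the conventional textbook choice.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by the quantity and quality of the pages linking to them."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its score on in equal shares to the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over every page.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web: each key links to the pages in its list.
toy_web = {
    "home": ["news", "blog"],
    "news": ["home"],
    "blog": ["home", "news"],
    "orphan": ["home"],
}
scores = pagerank(toy_web)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy web, "home" comes out on top because every other page links to it, while "orphan", which nobody links to, ends up last.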
Although they play a role in even the most mundane tasks, such as helping to avoid traffic through mobile applications, algorithms are often seen as intangible by the general population, who feel their effects but do not know or understand what they are or how they work. An algorithm is nothing more than a sequence of steps used to automatically solve a problem or accomplish a task, whether it requires a dozen lines of programming code or a million. “It is the nucleus of any computational process,” says computer scientist Roberto Marcondes Cesar Junior, a researcher at the Institute of Mathematics and Statistics of the University of São Paulo (IME-USP).
Take the sequence of steps performed by the Facebook algorithm, for example. The choice of what to display in a user’s news feed is based primarily on the set of posts produced by or circulating among their friends. The algorithm analyzes this information and discards posts flagged as violent or inappropriate, those that look like spam, or those whose wording is identified as “clickbait”—a form of exaggeration used to encourage users to click a link. Finally, the algorithm assigns a score to each post based on the user’s activity history to estimate how likely they are to enjoy or share the information. The algorithm has recently been modified to reduce the reach of posts made by news outlets.
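The three steps described above (gather posts from friends, discard unwanted ones, score the rest) can be written as a short pipeline. This is only a sketch under stated assumptions: the `Post` fields, the `build_feed` function, and the per-author `affinity` score standing in for the user's activity history are all hypothetical, and the real system is far more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str
    flagged: bool = False    # marked as violent or inappropriate
    spam: bool = False       # looks like spam
    clickbait: bool = False  # wording identified as clickbait


def build_feed(posts, friends, affinity):
    """Return the posts to display, most relevant first."""
    # Step 1: start from posts produced by the user's friends.
    candidates = [p for p in posts if p.author in friends]
    # Step 2: discard anything flagged, spammy, or clickbait.
    allowed = [p for p in candidates if not (p.flagged or p.spam or p.clickbait)]
    # Step 3: score each post from the user's activity history
    # (here, a hypothetical per-author affinity between 0 and 1).
    return sorted(allowed, key=lambda p: affinity.get(p.author, 0.0), reverse=True)


posts = [
    Post("ana", "Photos from the weekend"),
    Post("bia", "New job announcement"),
    Post("caio", "You won't believe what happened next", clickbait=True),
    Post("stranger", "Buy followers now", spam=True),
]
feed = build_feed(posts, friends={"ana", "bia", "caio"}, affinity={"ana": 0.4, "bia": 0.9})
print([p.author for p in feed])
```

The post by "stranger" never enters the feed because the author is not a friend, the clickbait post is discarded in step 2, and the two remaining posts are ordered by how often the user has interacted with each author.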
The development of an algorithm involves three steps (see infographic). The first is to accurately identify the problem and find a solution to it. In this phase, computer programmers work with professionals who understand the task that needs to be performed. They could be doctors, in the case of an algorithm that analyzes imaging exams; sociologists, if the objective is to identify patterns of violence in certain regions of a city; or psychologists and demographers in the development of a dating application. “The challenge is to show that a practical solution to the problem exists, that it is not a problem of exponential complexity, for which the time needed to produce a response can increase exponentially, making it impractical,” explains computer scientist Jayme Szwarcfiter, a researcher at the Federal University of Rio de Janeiro (UFRJ).
The second phase is also free of any mathematical operation: it consists of describing the sequence of steps in plain language that anyone can understand. This description is then translated into a programming language in phase three. Only then can the computer understand the commands, which can be simple mathematical operations or complex algorithms nested within other algorithms, all arranged in a logical and precise sequence. It is at this stage that programmers take on the work of writing the code. On complex projects, large teams of programmers work together and share tasks.
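To illustrate the passage from phase two to phase three, consider a classic example chosen here purely for illustration (it is not taken from the projects described in this article): Euclid's method for finding the greatest common divisor of two whole numbers. The plain-language description appears as comments, and the code underneath is one possible translation.

```python
def gcd(a, b):
    """Greatest common divisor of two non-negative whole numbers."""
    # Plain-language description (phase two):
    #   1. Take two numbers.
    #   2. If the second number is zero, the answer is the first number.
    #   3. Otherwise, replace the first number with the second, and the
    #      second with the remainder of dividing the first by the second.
    #   4. Go back to step 2.
    # Translation into a programming language (phase three):
    while b != 0:
        a, b = b, a % b
    return a


print(gcd(48, 18))
```

For 48 and 18, the loop runs three times (48 and 18, then 18 and 12, then 12 and 6) before the remainder reaches zero, leaving 6 as the answer.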
At their origin, algorithms are logical systems as old as mathematics. “The expression comes from a Latinization of the name of Persian mathematician and astronomer Mohamed al-Khwarizmi, who produced famous works on algebra in the ninth century,” explains computer scientist Cristina Gomes Fernandes, a professor at IME-USP. They gained new impetus in the second half of the last century alongside the development of the computer, with which it was possible to create work routines for the machines. There are two reasons why algorithms are now so widely used in the real world and why they have become the basis of most complex software development. Firstly, the increased processing power of computers has accelerated the speed at which complex tasks can be executed. Secondly, the advent of big data has made it cheaper to collect and store huge amounts of information, allowing algorithms to identify patterns that are imperceptible to the human eye in a wide range of situations. Advanced manufacturing, known as Industry 4.0, promises to increase productivity by using artificial intelligence algorithms to monitor industrial plants in real time and make decisions on stock control, logistics, and maintenance.
One effect of the growing use of algorithms in computing was a boost to artificial intelligence, a field established in the 1950s that aims to develop mechanisms capable of simulating human reasoning. Through increasingly fast computations and the collection of data for statistical comparisons, computers are now able to modify their operations based on accumulated experience, improving their performance in a process that mimics learning.
The fact that computers have proven capable of beating humans at many board games shows how the field has evolved. In 1997, IBM’s Deep Blue supercomputer became the first machine to beat the world chess champion of the time, Garry Kasparov, from Russia. Capable of evaluating approximately 200 million chess positions per second, the machine anticipated its opponent’s decisions several moves ahead. This strategy did not work for the Chinese board game Go, however, because there are too many possible moves at any given time to anticipate: the number of possible board configurations is greater than the number of atoms in the observable universe. In March 2016, a machine finally prevailed at Go as well: the AlphaGo program, created by DeepMind, a subsidiary of Google, managed to beat world champion Lee Sedol, from South Korea.
Instead of considering millions of possibilities, the program’s algorithm used a more restricted strategy. By performing a statistical analysis of data from previous matches between the game’s best players, the program identified the most common and efficient moves, which narrowed the set of options it had to consider, and it was soon able to beat human players. But there was more to come. Last year, DeepMind developed a new program, AlphaGo Zero, which outperformed the original AlphaGo. This time the machine did not learn from human games, but by playing against previous versions of itself.
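The look-ahead strategy described above for chess, in which the machine explores its opponent's possible replies several moves in advance, can be demonstrated on a game small enough to search completely. The sketch below uses a toy game chosen for illustration: players alternately remove one, two, or three stones from a pile, and whoever takes the last stone wins. The function names are hypothetical, and real chess programs add many refinements that are not shown here.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def can_win(stones):
    """True if the player about to move can force a win with `stones` left."""
    # A move is good if it leaves the opponent in a position they cannot win.
    # With zero stones there are no moves, so the player to move has lost.
    return any(not can_win(stones - take) for take in (1, 2, 3) if take <= stones)


def best_move(stones):
    """Number of stones to take, found by exploring every line of play."""
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return take
    return 1  # every move loses against perfect play, so take a single stone


for pile in (4, 5, 6, 7):
    print(pile, can_win(pile), best_move(pile))
```

Exhaustive search works here because each position offers at most three moves. In Go, where a position can offer hundreds of legal moves, the same tree grows far too quickly to explore, which is why a different approach was needed.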
There are a growing number of practical applications for this type of technology. Artificial intelligence algorithms developed by computer scientist Anderson de Rezende Rocha, a professor at the Institute of Computing of the University of Campinas (UNICAMP), have been used to assist police investigations. Rocha specializes in computer forensics and creates artificial intelligence tools to detect subtle details in digital documents that are often imperceptible to the naked eye. “The technology can help the experts confirm that a particular photograph or video related to a crime is genuine, for example,” says Rocha.
One situation in which the algorithms are being used is to automate investigations into images of child abuse. Police regularly seize large volumes of photographs and videos from the computers of suspects. If there are files related to child abuse, the algorithm helps find them. “We exposed the robot to hours of pornographic videos from the internet to teach it what pornography is,” says Rocha. Then, in order to identify the presence of children, the algorithm needed to “watch” the videos of child abuse seized by the police. “This stage was carried out by police officers. Nobody at UNICAMP had access to this material,” he adds. Rocha says that these types of files were previously analyzed manually in most cases. “Automating the process makes it more efficient, giving the police more time and allowing them to examine more data.”
Many computer scientists use mathematical properties, theorems, and logic when working on algorithms, regardless of the immediate purpose of the application. For many problems, the only known algorithms are very inefficient and become impractical for large inputs. Examples include factoring a number into its constituent primes (something that is very important in cryptography) and routing a welding robot through several weld points. There is little hope that efficient algorithms will be found for such problems, which are tied to the as yet unsolved question of “P versus NP,” considered one of the greatest challenges in both computer science and mathematics.
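Factoring gives a concrete sense of what "inefficient" means. The sketch below uses trial division, the simplest possible method and not the best one known: it tries candidate divisors one by one up to the square root of the number. The function name is chosen for illustration.

```python
def prime_factors(n):
    """Factor a whole number greater than 1 by trial division."""
    factors = []
    divisor = 2
    # Only divisors up to the square root of n need to be tried.
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


print(prime_factors(84))
print(prime_factors(2021))
```

For small numbers the answer is immediate, but the number of candidates grows with the square root of the input. For the numbers with hundreds of digits used in cryptography, that loop would need astronomically many iterations, and even the best publicly known methods for ordinary computers remain impractical at that scale.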
Although there is more programming involved than basic science in the development of many of the algorithms used in everyday life, advances in knowledge are essential if new applications are to be explored in the future. Marcondes Cesar, from USP, is working on computer vision, a type of artificial intelligence that extracts information from images to simulate human vision. The technique is being explored in various industries, particularly in medical diagnoses. “Computer vision can detect anomalies more accurately and evaluate subtle details in magnetic resonance imaging, for example.”
The aim of the project, in partnership with the USP School of Medicine and the Children’s Institute of the university’s teaching hospital, is to create a mathematical model that can provide a more accurate analysis of the liver and brain in newborns. The models used to interpret magnetic resonance images are generally based on white adult males and created in other countries, which can lead to inaccurate diagnoses in newborn babies in Brazil. But the project’s success depends on certain theoretical problems being solved first. “We do not yet know if we will be able to write an efficient algorithm. We are still studying properties based on graph theory,” he says, referring to the branch of mathematics that studies the relations between objects of a given set, associating them to one another by means of structures called graphs.
The impact of algorithms has also been analyzed in other fields of knowledge. “Algorithms are already playing a moderating role. Google, Facebook, and Amazon have an extraordinary amount of power over what we are exposed to in culture today,” said Ted Striphas, professor of the history of culture and technology at the University of Colorado, USA, and author of the book Algorithmic Culture (2015), which examines the influence of these online giants. American anthropologist Nick Seaver, a researcher at Tufts University, USA, is currently conducting ethnographic research and interviews with the creators of music recommendation algorithms for streaming services. His interest is in how these systems are designed to attract users and draw their attention, studying the interface between areas such as machine learning and online advertising. “The mechanisms that control attention and its technical mediations have become a subject of great interest. The formation of interest and opinion bubbles, as well as fake news, and political distractions, can be attributed to technologies designed to manipulate user attention,” he explains.
Recommendation systems based on algorithms have become key players in the online entertainment industry. In an article published in the journal ACM Transactions on Management Information Systems in 2015, Mexican electronic engineer Carlos Gomez-Uribe described how the algorithms used by streaming service Netflix rank television series and movies according to the individual profile of each user. The aim is to encourage customers to choose a TV show to watch within 90 seconds of logging on—any longer than that and they tend to get frustrated and lose interest. The success of this ranking system gave Gomez-Uribe’s career a real boost, and in 2017 he became head of algorithms and internet technology products at Facebook.
The influence and power held by major internet companies do not depend solely on the creativity of their programmers. They are also linked to the huge volumes of data accumulated and processed by their algorithms, which generate highly valuable information. “What prevents another company from developing an application like Uber? This has already been done, in fact. But the traffic and customer behavior data that Uber has accumulated over time belongs only to them, and it is valuable,” says Marcondes Cesar, from USP.
The recent Facebook user-data leak, which caused the value of the company to fall by US$49 billion last month, revealed a vulnerability that was thought to be uncommon—algorithms used by Cambridge Analytica were able to access the behavioral data of 50 million Facebook users, which was then used to influence political campaigns on social networks, including the Brexit vote and Donald Trump’s ultimately successful bid to become president of the United States. The Facebook case is an example of the ethical challenges created by the widespread use of algorithms, although data misuse and abuse is only part of the problem. Data use has become as important to algorithms as the challenge of actually programming them. “Analyzing the characteristics of the data is fundamental to the construction of an algorithm; a mistake at this stage could lead to biases in the results,” says Marcondes Cesar.
It is also common for algorithms to reproduce biases when they are based on human behavior. The Cloud Natural Language API, a tool created by Google that reveals the structure and meaning of texts through machine learning, has developed its own biases. A test by American website Motherboard showed that when analyzing text to determine if it has a “positive” or “negative” sentiment, the algorithm classified statements such as “I’m a homosexual” and “I’m a gay black woman” as negative. “Programmers who create smart algorithms need to be aware that their work has social and political implications,” says Nick Seaver, from Tufts University. Some undergraduate and graduate computer science courses already offer classes that address computer ethics, including USP in Brazil, and Harvard University and the Massachusetts Institute of Technology (MIT) in the US.
The transparency of advanced algorithms is another hot topic. The details of how these tools work are often kept secret by developers. In some cases, the code is so complex that it is not possible to understand how the algorithm arrives at a decision and what its implications are. Systems like these, which are opaque to external scrutiny, are known as “black box algorithms.” The debate gained momentum after research into an experimental tool called COMPAS (Correctional Offender Management Profiling for Alternative Sanctions), which is used in the US legal system to make sentencing recommendations and even to predict the risk that a defendant will reoffend. The study, conducted by the ProPublica organization in 2016, revealed that the COMPAS system is 77% more likely to classify black defendants as possible reoffenders than white defendants. Northpointe, the private company that created the algorithm, declined to share the code. “Algorithms used by public bodies should not be created or developed without the participation of public managers and administrators, as they are not neutral,” says Sérgio Amadeu da Silveira, a researcher at the Center for Engineering, Modeling, and Applied Social Sciences at the Federal University of ABC (UFABC).
In 2017, Kate Crawford, head of research at Microsoft Research, and Meredith Whittaker, leader of Google’s Open Research Group, founded the AI Now Institute, an organization dedicated to understanding the social implications of artificial intelligence. Based at New York University, USA, the institute’s approach involves computer scientists, lawyers, sociologists, and economists. In October, it released a report offering guidelines on the use of artificial intelligence algorithms. One recommendation was that public agencies such as those responsible for criminal justice, healthcare, welfare, and education should not use systems whose algorithms are not well known. The document suggests that black box algorithms should be subject to public auditing and validation tests in order to implement corrective mechanisms when necessary.
Another purpose of artificial intelligence algorithms is to free human beings from repetitive tasks, and there is frequent debate over the implications of AI software for the labor market. “The Future of Employment,” a report published in 2013 by economists Carl Frey and Michael Osborne from the Oxford Martin School, UK, estimated that sophisticated algorithms could soon replace 140 million professional jobs worldwide. The paper cites examples such as the increasing automation of decisions made in the financial market and even the impact on the work of software engineers, since machine learning and algorithms can improve and speed up various programming tasks. “Procedural intellectual activities that involve repetitive tasks, such as translating documents, have a great chance of one day being executed by computer algorithms,” says Sérgio Amadeu, from UFABC. It is important that we discuss the side effects of artificial intelligence, says Marcondes Cesar, from USP, but for now they are far from outweighing the remarkable contributions made by these algorithms to solving all kinds of problems.
Hoobox Robotics, a company founded by researchers from UNICAMP in 2016, has developed a system for motorized wheelchairs that allows quadriplegics to control the chair using only facial expressions. The algorithm used by the software, which is called Wheelie, translates up to 11 facial expressions, such as a smile or a raised eyebrow, into commands to move forward, backward, left, and right. The program is being tested by 39 patients in the USA, where the company has a research unit at the Johnson & Johnson laboratory in Houston. The system uses a 3D camera to capture dozens of facial points.
“The user can configure a command for each expression. A smile, for example, can move the chair forward, a kiss, back,” explains computer scientist Paulo Gurgel Pinheiro, director of Hoobox. To learn to recognize key expressions, the Wheelie algorithm studied a set of facial data from 103 truck drivers. “We partnered with a transportation company to install cameras in trucks and record the facial expressions of volunteers over three months,” Gurgel explains.
An IME-USP research project is working in collaboration with UNICAMP’s Laboratory of Image Data Science (LIDS) to improve the diagnosis of parasite infections using computer vision. Marcelo Finger, a computer scientist from IME, is testing an algorithm that can identify parasites by analyzing images of stool samples. “We have been able to identify 15 parasites in humans and some in animals, such as cattle, dogs, and cats,” he says. Diagnoses are currently obtained by examining stool samples under a microscope. “A lab worker can usually analyze about six slides at a time. The aim is to automate this process,” says Finger. It seems simple, but because algorithms work by identifying patterns, any background noise creates an obstacle for the researchers. “It is one thing for the algorithm to be able to identify the parasite in a photo from a book; doing the same with an image in which the parasite is surrounded by dirt is quite another,” says the researcher.
Projeta Sistemas, a startup based in Vitória, Espírito Santo State, Brazil, has created an algorithm to help cattle farmers. The system, called Olho do Dono, uses 3D images to estimate the weight of the cows. “The process of weighing cattle is very costly, time-consuming, and involves moving the animals around, which can cause stress and even weight loss,” explains computer scientist Pedro Henrique Coutinho, director of the company. The software was developed based on computer vision techniques, associating the weights of the cattle with images taken by cameras. The system relies on a robust database. “We monitor the weighing of livestock on ranches throughout Brazil. Our algorithm is based on thousands of recorded images,” says Coutinho. Development began in 2015 and the software goes to market in September.
CrowdPet is a smartphone application created by SciPet, a company based at UNICAMP, to help find lost animals. The system uses an algorithm to compare pictures of lost pets provided by their owners with photos of animals on the streets taken by volunteers. “The application can match two images through visual recognition methods, and uses geolocation to locate where the photo of the lost animal was taken,” says Fabio Rogério Piva, director of SciPet. The Animal Control Center in the municipality of Vinhedo, São Paulo, began using the application last year to register animals during welfare campaigns. SciPet has developed a prototype capable of identifying dogs and cats with 99% accuracy.