Programmed to see: Revista Pesquisa Fapesp

What can one learn from animals as distinct as a salamander, a cat and a spider? Scientists from the University of São Paulo in São Carlos are learning to see. The study of biological models has been indispensable for uncovering the secrets of one of the most complex functions of the brain, namely vision, which they then intend to teach to a computer. There have been advances in both theory and practice. We may still be a long way from HAL, the witty computer of Arthur C. Clarke’s 2001: A Space Odyssey, or from the almost human androids of Blade Runner, but the Cybernetic Vision research group has already exhibited virtual neurons that take on a life of their own within the virtual environment and that can be controlled in accordance with rules inferred from research with real cells. The next step is to go in the reverse direction: to understand, from a mathematical model, how natural neurons integrate with one another, a task almost impossible by ordinary means, as neurons in the laboratory can only be studied in isolation.

The knowledge accumulated by the team has served for the development of applied projects. It has already permitted the construction of a mechanical eye, finished last year, based on the visual system of a spider. In medicine, it is assisting in the development of a system for the diagnosis of leukemia, which should be completed in four years. The coordinator of the group, Dr. Luciano da Fontoura Costa, an electrical engineer with a specialization in physics, has acted as a consultant for both national and foreign corporations. Two examples: for Hewlett Packard of Brazil, he created a quality control system for video monitors, and with Intelligent Network, from the United States, a program for pattern recognition and artificial intelligence in computer networks and on the Internet.

Oranges and apples
The practical applications of computational vision, in the most diverse sectors, are immense and attract the interest of companies and universities throughout the world. Abroad, it is estimated that this area has a turnover of close to US$ 5 billion, even though the mechanisms of computerized vision still suffer from limitations. For example, machines for visual inspection can already tell broken biscuits from whole ones, but still have difficulty in distinguishing objects of similar shape, such as oranges from apples or male from female faces.

Conscious of the strategic importance of this area of research, the group based in São Carlos has consolidated itself as one of the most useful in this area in Brazil, as it is multidisciplinary. The team includes specialists in computing, mathematics, physics, and electrical and mechanical engineering. There are also neuroscientists, medical doctors, psychologists and even philosophers. The majority of them have already completed their doctorates, so it might seem strange that a spider or a salamander could teach a group of qualified Homo sapiens something about a skill as inborn and apparently as simple as vision. After all, it seems to take less effort to see than to do calculations or make decisions, for example.

Not quite. “There is a tendency to consider vision as a simple process and the image that we see as a direct impression of the world around us,” says Dr. Costa, who coordinated the recently completed project Research into Cybernetic Vision, financed by FAPESP. “Actually, vision is a refined process, which requires almost half of the capacity of the cerebral cortex of a primate and consists of the passage from a physical image to its interpretation. We are speaking of a collection of processes capable of identifying and locating the objects that exist in the world from visual information captured by sensors, an ability that not even the most sophisticated machines have managed to master up to now.”

Form and function
Dr. Costa is certain that he will find in nature the solutions to the challenge of teaching a machine to see. “If we compared the nervous system to a machine,” he said, “the vision program would be totally codified in the neurons.” So the first step of the research consists of understanding and classifying the nerve cells. “Before creating a good mathematical-computational model, we need to understand the anatomy and physiology of these cells and to unveil the relationship between their forms and functions,” he comments. The human brain has hundreds of types of neurons, and everything indicates that these differences are determined by interactions both internal and external to the individual, since our genetic material is not sufficient to account for so vast a number of forms.

The theory is not original. At the end of the 19th century, the Spanish doctor Santiago Ramón y Cajal (1852-1934), a pioneer in the study of neural tissue, attributed human intelligence itself to the form of the neurons. Ramón y Cajal shared the 1906 Nobel Prize in Physiology or Medicine with the Italian Camillo Golgi (1843-1926) and passed into history as the creator of modern neuroscience, but his interest in neuronal form did not receive much attention during the 20th century.

In Brazil, so-called neuromorphometry remained a practically unheard-of field of research until the Cybernetic Vision Group began its studies of how form could influence the behavior of neurons. “The definition of the form of a neuron may be directly linked to the establishment of synaptic connections. More complex cells, with deeper roots, connect themselves to a greater number of cells. One of our goals is to obtain parameters that could classify them,” said Dr. Costa.

The study concentrated on two types of neurons: ganglion cells, from the retina of the cat and the salamander, and pyramidal cells, from the rat. Since the laboratory of the group at the Institute of Physics of São Carlos does not work with animals, the images of the real neurons were sent by research partners, such as the Department of Physiology of the University of Minnesota in the United States, and the Federal University of Rio de Janeiro (UFRJ).

The researchers built the virtual neuron systems from the real images, analyzed by statistical methods and by computer graphics. With the images, they can better understand the function of the neurons, which are fundamental to vision, and simulate real situations. This is so-called cybernetic vision, the expression that the researchers adopted to represent the interface between biological vision and computerized vision.

Simulations
However, what they really want is to generate realistic neurons, statistically similar to the natural ones. To get there, the first and possibly most difficult challenge is to establish patterns of classification. What distinguishes a ganglion cell of the retina of a cat from any other neuron? It was necessary to choose a set of measurements that would represent each group of neurons, such as the size, width, orientation and angles of the segments of the dendrites, the ramifications of this type of cell. According to Dr. Costa, the choice of these parameters is still an open question, and it should take into account what one wants to investigate.
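To make the idea of measuring a dendrite more concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that a dendrite has been traced as a list of 2D points; the function names and the particular measurements are hypothetical choices and are not the group's actual method.

```python
import math


def segment_measurements(points):
    """Return (length, orientation in degrees) for each segment of a traced path."""
    measurements = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        orientation = math.degrees(math.atan2(dy, dx))
        measurements.append((length, orientation))
    return measurements


def turning_angles(points):
    """Return the signed angle (degrees, in [-180, 180)) between consecutive segments."""
    segments = segment_measurements(points)
    angles = []
    for (_, a0), (_, a1) in zip(segments, segments[1:]):
        angles.append((a1 - a0 + 180.0) % 360.0 - 180.0)
    return angles


def feature_vector(points):
    """Summarize a traced path as a small dictionary of measurements."""
    lengths = [length for length, _ in segment_measurements(points)]
    angles = turning_angles(points)
    return {
        "total_length": sum(lengths),
        "mean_segment_length": sum(lengths) / len(lengths),
        "mean_abs_turn": sum(abs(a) for a in angles) / len(angles) if angles else 0.0,
    }


# Hypothetical traced dendrite with two segments.
dendrite = [(0, 0), (3, 4), (3, 9)]
features = feature_vector(dendrite)
```

Vectors like this one, computed for many cells, are the kind of input a statistical classification would work from. Which measurements to include remains, as the text says, an open choice.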

The question of the extraction of measurements, the key point in the creation of the virtual nervous system, is also fundamental for the development of a cybernetic visual mechanism. From any scene, the computer will have to capture the image that interests it, as does the biological eye, and from it extract the attributes (or measurements) necessary for its recognition. The secret is to know which measurements can be turned into efficient algorithms (mathematical models or expressions) of identification. For example, in the middle of a group of people, what makes men and women statistically different from one another? Height, hair length, facial angles? What seems to be instinctive to the animal, the computer will have to learn step by step.
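As a rough illustration of how measurements become an identification rule, the sketch below uses a nearest-centroid classifier in Python. This technique was chosen here for simplicity and is not taken from the article; the two features (height in cm, hair length in cm), the class names and all the values are invented for the example.

```python
import math


def centroid(vectors):
    """Return the component-wise mean of a list of equal-length feature tuples."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def train(samples):
    """Map each class label to the centroid of its example vectors."""
    return {label: centroid(vectors) for label, vectors in samples.items()}


def nearest_class(centroids, vector):
    """Return the label whose centroid is closest (Euclidean distance) to vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Invented training data: (height in cm, hair length in cm) per person.
samples = {
    "group_a": [(180, 5), (175, 8)],
    "group_b": [(160, 30), (165, 25)],
}
centroids = train(samples)
label = nearest_class(centroids, (178, 6))
```

The quality of such a rule depends almost entirely on whether the chosen measurements really separate the classes, which is the point the paragraph above makes.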

The spider eye
Developing a more efficient system of recognition was the objective of one of the first practical pieces of work carried out by the Cybernetic Vision Research Group, through the construction of a mechanical eye. The work began in 1993, when Dr. Costa came back from his doctorate at King’s College in London, and was finished last year. The model used was the visual system of various species of jumping spiders of the family Salticidae, chosen for having a highly evolved system of vision. According to Dr. Costa, this family possesses the most developed visual system among the terrestrial invertebrates and, if aquatic creatures are included, is second only to the octopus. Besides this, it manages to detect segments of a straight line, an ideal ability for the development of a mechanical eye, as it makes the storage of information easier.

Selective look
The work began with the observation of the behavior of the spider. Held in front of a computer screen, it watched images of hypothetical prey and predators while the researchers observed its reactions. Dr. Costa now recognizes that this type of approach was flawed, as human observation is subjective; that is to say, different observers could arrive at different conclusions. That is why, shortly afterwards, the work became more sophisticated with the inclusion of an electronic stethoscope. Placed on the abdomen of the spider, where the heart of the animal is located, the device measured abdominal micro-movements and provided a more objective measurement of the response to stimuli. The researchers observed that the jumping spider was able to quickly recognize whether the image projected on the screen was of another spider or not.

The retina of the jumping spider, which has an elongated shape like that of a boomerang and is able to move while the cornea remains fixed (just the contrary of the human eye), served as a model for the creation of a prototype of a mechanical eye that would detect straight lines efficiently in order to recognize objects right away. This approach also led to the interpretation of the recognition of straight lines as a problem of mathematical optimization.

Another practical application of the cybernetic vision research involved experimental use in railway stations in London, in a study in collaboration with King’s College. This was a system for estimating population density, in order to monitor areas where crowds gather. Since only parts of the bodies appear in a crowd, the computer does not get to recognize individual human beings. The technique is based on differences in patterns of texture. Images with a low density of people tend to present a coarse texture, while denser images present finer textures. So the images are classified into different categories of texture. Afterwards, statistics of these classes are used to estimate the number of people.
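The texture idea can be sketched in a few lines of Python. The version below is only an illustration under stated assumptions: the image is a grid of gray levels, texture "fineness" is approximated by the mean difference between neighboring pixels, and the threshold and the people-per-block figures are invented placeholders rather than values from the London study.

```python
def fineness(block):
    """Mean absolute difference between horizontally adjacent pixels in a block."""
    diffs = [abs(row[i + 1] - row[i]) for row in block for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


def split_blocks(image, size):
    """Cut a 2D list of gray levels into non-overlapping size x size blocks."""
    blocks = []
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            blocks.append([row[c:c + size] for row in image[r:r + size]])
    return blocks


def texture_class(block, threshold=50.0):
    """Label a block 'fine' or 'coarse'; the threshold is an assumed value."""
    return "fine" if fineness(block) >= threshold else "coarse"


def estimate_people(image, size=4, people_per_block=None):
    """Count blocks per texture class and turn the counts into a crowd estimate."""
    if people_per_block is None:
        # Hypothetical calibration: average number of people per block of each class.
        people_per_block = {"coarse": 0.5, "fine": 3.0}
    counts = {"coarse": 0, "fine": 0}
    for block in split_blocks(image, size):
        counts[texture_class(block)] += 1
    estimate = sum(people_per_block[name] * n for name, n in counts.items())
    return counts, estimate


# Synthetic 4 x 8 image: a flat left half and a rapidly alternating right half.
image = [[100, 100, 100, 100, 0, 255, 0, 255] for _ in range(4)]
counts, estimate = estimate_people(image)
```

In a real system the calibration figures would have to come from images in which the number of people is known.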

Computers also choose
Even more surprising is the program that evaluates visual perception, an attribute still little understood by scientists and very much influenced by the environment, the social and cultural context, and emotions. “The TV image is sent to receivers with a loss of quality,” Dr. Costa gave as an example. “If our perception were linear, it would be enough to measure the original image, take the difference and establish standards of quality.” However, an image can have distortions and still be pleasing. An adequate model for evaluating the quality of images would have to be capable of considering the subtleties of the human visual system.

It was exactly this that the computer did. Last year the researchers invited a group of 20 people from different social segments to grade about 50 images. Starting from the grades, associated with measurements such as dimensions, contrast and color, the specialists designed an algorithm that permitted the computer to evaluate other images. Following this, they compared the two judgments. Humans and machine valued points such as artistic quality and originality, attributes directly related to the cultural context. The major surprise was that the grades given by the humans and afterwards by the computerized model were very similar.
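The article does not say which algorithm linked the measurements to the grades. As one simple possibility, the Python sketch below fits a straight line by ordinary least squares from a single measurement (a hypothetical contrast value) to the average human grade, then uses it to grade a new image. All numbers are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Grade predicted by the fitted line for measurement x."""
    slope, intercept = model
    return slope * x + intercept


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predicted and human grades."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Invented data: contrast measured on four images and the mean grade people gave.
contrast = [0.2, 0.4, 0.6, 0.8]
human_grade = [3.0, 5.0, 7.0, 9.0]

model = fit_line(contrast, human_grade)
machine_grade = predict(model, 0.5)
agreement_gap = mean_abs_error(model, contrast, human_grade)
```

A model closer to the one described would combine several measurements at once, but the comparison step is the same: grade images with the model and check how far the result falls from the human judgment.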

Quick response
As the images remained on the screen for only a few seconds, a possible explanation for the remarkable agreement is that the participants had been evaluating more primitive measurements. In the human being, visual information goes first through the base of the occipital lobe, a region of the cerebral cortex located near the nape of the neck and responsible for the recognition of segments of straight lines, which makes up the primary processing of visual information. If the image were to stay on the screen for longer, the information would travel to cerebral regions more influenced by emotions and cultural contexts, which could cause divergences from the evaluation of the computer. According to Dr. Costa, this detail does not invalidate a possible application of the experiment, above all for the evaluation of visual content that must provoke an impact or an immediate liking, such as billboards, Internet pages or even television ads. “One of the benefits of this research is the identification of the specific importance of various visual attributes in the process of perception,” he said.

Presently, the São Carlos group is working on a program for the semi-automatic diagnosis of leukemia, in collaboration with the hematologists Dr. Marco Zago, of the School of Medicine of USP in Ribeirão Preto, and Dr. Sérgio Martins, of the Hemocenter, also in Ribeirão Preto. The objective is to create software to support the doctor in the process of recognizing abnormal cells, which are distinguished by alterations in their shape. “First of all, the computer needs to sort out the different kinds of blood cells, recognize the leucocytes and only then carry on with the measurement,” says Dr. Costa. Today, the diagnosis is made visually, and rather subjectively.

Facilities
Dr. Costa believes that the diagnosis will be not only quicker but also more precise and objective. “Certain types of leukemia are associated with very typical morphological alterations, such as abnormalities in the shape of the nucleus and cytoplasm of the cell,” he comments. For this reason, it seems possible to swiftly relate the shape of the cell to a type of illness and, by checking the result against the clinical data, to make even more precise decisions regarding the form of treatment.

Recently the São Carlos group joined the Human Cancer Genome Project. One of its tasks, already defined, is to apply signal and image processing techniques, pattern recognition, artificial intelligence and data mining to the analysis of the information obtained in this project, financed jointly by FAPESP and the Ludwig Institute. Part of this eight-year trajectory is gathered in the recently launched Shape Analysis and Classification: Theory and Practice, written by Dr. Costa in partnership with Dr. Roberto Marcondes César Jr., of the Institute of Mathematics and Statistics of USP in São Paulo, at the invitation of the North American publishing house CRC Press. Looking to the future, the two researchers now intend to advance the research into cybernetic vision with applications in the analysis of microscope images, visual inspection and the recognition of faces, through thematic projects recently signed and to be developed over the next three years.

The Projects
1. Research into Cybernetic Vision; Modality Support Program to the Young Researcher; Investment R$ 81,300.00 and a further US$ 5,000.00
2. The Development and Evaluation of Original and Precise Methods in the Analysis of Forms and Images and Visual Computation (nº 99/12765-2); Modality Thematic project; Investment R$ 325,000.00 and a further US$ 130,000.00; Coordinator Dr. Luciano da Fontoura Costa, The Institute of Physics of São Carlos of the University of São Paulo
