Can artificial intelligence really help us talk to the animals?

A California-based organization hopes to use machine learning to decipher animal communication on a global scale. Many, however, have doubts about the project.

A dolphin handler gives the "together" and "create" signals with her hands. The two trained dolphins exchange sounds underwater before emerging, flipping around, and lifting their tails. They have devised a brand-new trick of their own and performed it in tandem, just as requested. "It doesn't establish that there is language," says Aza Raskin. But it stands to reason, he adds, that the task would be much easier if they had access to a rich, symbolic way of communicating.

Raskin is the co-founder and president of the Earth Species Project (ESP), a non-profit organization with an audacious goal: to decode non-human communication using machine learning, make all of that knowledge publicly available, and in doing so deepen our bond with other living species and help protect them. An album of whale songs released in 1970 galvanized the movement that led to the ban on commercial whaling. What could a Google Translate for the animal kingdom produce?

The organization, founded in 2017 with the help of major donors such as LinkedIn co-founder Reid Hoffman, published its first scientific paper in December. Its aim is to unlock communication within our lifetimes. The end goal, according to Raskin, is to decode animal communication and discover non-human language. Just as important, he says, is that along the way the group is building technology that supports biologists and conservation efforts now.

Humans have long been interested in animal vocalizations, and have studied them for a very long time. Various primates give alarm calls that differ according to the predator, dolphins address one another with signature whistles, and some songbirds can rearrange the elements of their calls to convey different messages. Most experts, however, stop short of calling any of this a language, because no animal communication meets all the criteria.

Until recently, decoding has mostly relied on painstaking observation. But interest has surged in applying machine learning to the huge volumes of data that modern animal-borne sensors can now collect. "People are starting to utilize it," says Elodie Briefer, an associate professor at the University of Copenhagen who studies vocal communication in mammals and birds. But, she adds, it is not yet clear how much can be done with it.

Briefer co-developed an algorithm that analyzes pig grunts to tell whether the animal is experiencing a positive or negative emotion. Another program, DeepSqueak, analyzes the ultrasonic calls of rats to judge whether they are under stress. A further initiative, Project CETI (the Cetacean Translation Initiative), plans to use machine learning to interpret the communication of sperm whales.

ESP says its approach is different because it aims to decode the communication of every species rather than just one. Raskin agrees that social species such as primates, whales, and dolphins are more likely to have rich, symbolic communication, but the ultimate objective is to build tools that can be applied across the entire animal kingdom. Raskin declares, "We don't care about species. The methods we create are applicable to all of life, from worms to whales."

The "motivating intuition" for ESP, according to Raskin, is research showing that machine learning can translate between different, sometimes distant human languages without any prior knowledge.

The first step in this process is an algorithm that represents words in a geometric space. In this multidimensional representation, the distance and direction between points (words) describe how they relate to one another in meaning (their semantic relationship). For instance, the distance and direction from "man" to "king" are the same as those from "woman" to "queen." (The mapping is not built by knowing what the words mean, but by looking at, for example, how often they occur near one another.)
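The "king" and "queen" relationship can be illustrated with a minimal sketch. The two-dimensional vectors below are hand-made and purely hypothetical (one axis loosely standing for royalty, the other for gender); real embeddings are learned from co-occurrence statistics and have hundreds of dimensions.

```python
import numpy as np

# Hypothetical hand-made 2-D "embeddings": axis 0 ~ royalty, axis 1 ~ gender.
# These numbers are illustrative assumptions, not learned from any corpus.
vectors = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
    "apple": np.array([-1.0, 0.0]),
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest(target, exclude=()):
    """Return the word whose vector points most nearly in the direction of target."""
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(candidates[w], target))


# The step from "man" to "king" matches the step from "woman" to "queen".
offset_male = vectors["king"] - vectors["man"]
offset_female = vectors["queen"] - vectors["woman"]

# So starting at "woman" and taking the man-to-king step lands on "queen".
analogy = nearest(vectors["woman"] + offset_male, exclude=("king", "man", "woman"))
print(analogy)
```

Excluding the three input words from the search is a common convention in such analogy tests, since the query point otherwise tends to sit closest to one of its own ingredients.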

Later, it was discovered that these "shapes" are similar across languages. Then, in 2017, two research groups working independently found a technique that made translation possible by aligning the shapes. To get from English to Urdu, align the two shapes and find the point in Urdu that lies closest to the word's point in English. According to Raskin, most words can be translated reasonably well this way.
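As a rough sketch of what aligning two shapes can mean, the example below uses orthogonal Procrustes alignment on synthetic data. This is a simplification and not the 2017 methods themselves: it assumes a handful of word pairs are already known, whereas the techniques described above reportedly need no such dictionary. The "English" and "Urdu" arrays are random stand-ins, and the second space is assumed to be an exact rotation of the first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 50 "English" word vectors in 5 dimensions.
english = rng.normal(size=(50, 5))

# Assumption: the "Urdu" space has the same shape, only rotated or reflected.
true_rotation, _ = np.linalg.qr(rng.normal(size=(5, 5)))
urdu = english @ true_rotation

# Orthogonal Procrustes: find the orthogonal matrix w that minimizes
# ||english[seed] @ w - urdu[seed]|| over a small set of known pairs.
seed = np.arange(10)  # indices of the word pairs assumed to be known
u, _, vt = np.linalg.svd(english[seed].T @ urdu[seed])
w = u @ vt


def translate(i):
    """Index of the Urdu point closest to English word i after alignment."""
    mapped = english[i] @ w
    return int(np.argmin(np.linalg.norm(urdu - mapped, axis=1)))


# Fraction of all 50 words (including the 40 never used for fitting)
# whose nearest Urdu point is the correct one.
accuracy = np.mean([translate(i) == i for i in range(len(english))])
print(accuracy)
```

Real embedding spaces are only approximately the same shape, so accuracy on real languages would be lower than in this idealized setup.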

ESP wants to build these kinds of representations of animal communication, working both on individual species and on many species at once, and then explore questions such as whether there is overlap with the universal human shape. According to Raskin, we do not know how animals experience the world, but some appear to share emotions with us and may even communicate about them with others of their species. He says he is not sure which will be more fantastic: the parts where the shapes overlap and we can directly communicate or translate, or the parts where we cannot.

Animals do not only communicate vocally, he adds. Bees, for instance, use their "waggle dance" to tell other bees where a flower is. It will be necessary to translate across different modes of communication as well.

Raskin agrees that the objective is "like travelling to the moon," but the idea is not to get there all at once. Instead, ESP's roadmap involves solving a series of smaller problems that stand in the way of the larger goal. This should produce general tools that can help researchers who are trying to use AI to unlock the secrets of the species they study.

For instance, ESP recently published a paper (and shared its code with the public) on the so-called "cocktail party problem" in animal communication, in which it is hard to tell which individual in a group of the same animals is vocalizing in a noisy social setting.

According to Raskin, no one had previously done this end-to-end untangling of animal sound. The AI-based model created by ESP, which was tested on bat vocalizations, macaque coo calls, and dolphin signature whistles, performed best when the calls came from individuals the model had been trained on. With larger datasets, however, it was able to separate mixtures of calls from animals that were not in the training cohort.

Another project uses AI to generate novel animal calls, with humpback whales as the test species. The calls are made by splitting vocalizations into micro-phonemes, which are discrete units of sound lasting one tenth of a second. The novel calls can then be played back to the animals to see how they respond. Raskin says that if the AI can identify which changes are random and which are semantically meaningful, it will bring us closer to meaningful communication. It amounts to having the AI speak the language, he says, even though we do not yet know what it means.

Another project aims to develop an algorithm that determines how many call types a species has at its command. It applies self-supervised machine learning, which finds patterns without requiring human experts to label the data. In an early test case, the system will mine audio recordings made by a team led by Christian Rutz, a professor of biology at the University of St Andrews, to produce an inventory of the vocal repertoire of the Hawaiian crow. According to Rutz, this species can make and use tools for foraging and is thought to have a significantly more complex set of vocalizations than other crow species.
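One way to picture counting call types without labels is the sketch below. It is not ESP's self-supervised method: it swaps in plain k-means clustering scored with the silhouette coefficient, a much simpler unsupervised stand-in. The "acoustic features" are synthetic two-dimensional points drawn around three hypothetical call types.

```python
import numpy as np


def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iterations):
        gaps = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = np.argmin(gaps, axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def silhouette(points, labels):
    """Mean silhouette score: near 1 means tight, well-separated clusters."""
    pairwise = np.linalg.norm(points[:, None] - points[None], axis=2)
    present = set(labels.tolist())
    scores = []
    for i, own in enumerate(labels):
        same = labels == own
        if same.sum() == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = pairwise[i, same].sum() / (same.sum() - 1)  # mean distance within cluster
        b = min(pairwise[i, labels == other].mean() for other in present if other != own)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))


# Synthetic stand-in for acoustic features: 40 calls around each of 3 call types.
rng = np.random.default_rng(1)
true_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
features = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in true_centers])

# Try several candidate repertoire sizes and keep the best-scoring one.
scores = {k: silhouette(features, kmeans(features, k)) for k in range(2, 7)}
estimated_call_types = max(scores, key=scores.get)
print(estimated_call_types)
```

Real recordings are far messier than these clean clusters, which is part of why a learned representation, rather than raw clustering, is attractive for this task.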

Rutz is particularly enthusiastic about the project's conservation potential. The Hawaiian crow is critically endangered and survives only in captivity, where it is being bred for reintroduction to the wild. It is hoped that comparing recordings made at different times will show whether the species' call repertoire is eroding in captivity. Certain alarm calls may have been lost, for example, which could affect its reintroduction, and that loss might be addressed with intervention. Rutz says that identifying and classifying the calls by hand would be labor-intensive and error-prone, and that the technology "may provide a step change in our capacity to help these birds come back from the edge."

Another effort aims to work out the functional meaning of vocalizations automatically. It is being pursued with the lab of Ari Friedlaender, a professor of ocean sciences at the University of California, Santa Cruz. The lab runs one of the largest tagging programs in the world and studies how wild marine animals, which are all but impossible to observe directly, behave underwater. The animals are fitted with tiny electronic "biologging" devices that record their location, their type of movement, and even what they see (the devices can incorporate video cameras). The lab also has data from underwater sound recorders that were deliberately placed in the ocean.

ESP plans first to apply self-supervised machine learning to the tag data to determine automatically what an animal is doing (such as eating, sleeping, moving, or socializing), and then to add the audio data to see whether calls associated with that behavior can be given functional meaning. (Playback experiments, together with calls that have already been decoded, could then be used to verify the results.) The method will be applied first to humpback whale data, since the lab has tagged several members of the same group, making it possible to see how signals are given and received. Friedlaender says he had "reached the ceiling" in terms of what the available methods could extract from the data. The researcher said, "Our aim is that the work ESP can undertake will bring fresh insights."

However, not everyone is as optimistic about the potential of AI to accomplish such lofty goals. Robert Seyfarth is an emeritus psychology professor at the University of Pennsylvania who has spent more than 40 years studying social behavior and vocal communication in monkeys in their natural environment. While he thinks machine learning can be useful for some problems, such as identifying an animal's vocal repertoire, he is skeptical that it will add much to our understanding of the meaning and function of vocalizations.

The problem, he argues, is that while many animals live in sophisticated, complex societies, their repertoire of sounds is far smaller than that of humans. As a result, the same sound can mean different things in different settings, and the only way to establish meaning is to study the context: who the calling individual is, how it is related to others, where it sits in the hierarchy, and whom it has interacted with. Seyfarth considers these AI techniques simply insufficient. In his view, you have to go out and watch the animals.

The idea that animal communication will resemble human communication in any meaningful way is also contested. Applying computer-based analyses to human language, with which we are so familiar, is one thing, says Seyfarth. But doing the same with other species can be "very different." According to Kevin Coffey, a neuroscientist at the University of Washington and co-creator of the DeepSqueak algorithm, "It is a fascinating notion, but it is a significant reach."

Raskin recognizes that AI alone might not be enough to enable communication with other species. However, he points to research showing that many species communicate in ways that are "more intricate than humans have ever dreamed." The major roadblocks, he says, have been our inability to gather enough data and analyze it at scale, along with our own limited perception. He explains, "These are the instruments that enable us to remove the human spectacles and comprehend whole communication networks."

