Did an AI Really Invent Its Own 'Secret Language'?

From a written prompt, a new generation of artificial intelligence (AI) models can generate "creative" images on demand. Imagen, Midjourney, and DALL-E 2 are just a few examples of how new technologies are changing the way creative content is made, with ramifications for copyright and intellectual property.

While the output of these models is often impressive, it's hard to know exactly how they produce it. Last week, researchers in the United States claimed that the DALL-E 2 model may have invented its own hidden language to talk about objects.

By asking DALL-E 2 to generate images containing text captions, then feeding the resulting (gibberish) captions back into the system, the researchers found that DALL-E 2 appears to think Vicootes means "vegetables" and Wa ch zod rea means "sea creatures that a whale might eat".

These claims are fascinating, and if true, they could have important implications for the security and interpretability of this kind of large AI model. So, what exactly is going on?

Does DALL-E 2 have a secret language?

DALL-E 2 probably doesn't have a "secret language." It might be more accurate to say it has its own vocabulary, but even then we can't know for sure.

To begin with, at this stage it's very hard to verify any claims about DALL-E 2 and other large AI models, because only a handful of researchers and creative practitioners have access to them.

Any images shared publicly (on Twitter, for example) should be taken with a grain of salt, because they have been "cherry-picked" by a human from many output images generated by the AI.

Even those with access can only use these models in limited ways. DALL-E 2 users, for example, can generate or modify images, but can't (yet) interact with the AI system more deeply, for instance by modifying the behind-the-scenes code.

This means "explainable AI" methods for understanding how these systems work can't be applied, and systematically investigating their behavior is challenging.

So, what's going on?

One possibility is that the "gibberish" phrases are related to words from non-English languages. For instance, Apoploe, which seems to conjure images of birds, is similar to Apodidae, the Latin scientific name of a family of bird species.

This seems like a plausible explanation. For instance, DALL-E 2 was trained on a very wide variety of data scraped from the internet, which included many non-English words.

Similar things have happened before: large natural-language AI models have learned to write computer code without being deliberately trained to do so.

It's all about the tokens, right?

One point that supports this theory is the fact that AI language models don't read text the way you and I do. Instead, they break input text up into "tokens" before processing it.

Different tokenization approaches produce different results. Treating each word as a token seems like an intuitive approach, but it causes trouble when identical tokens have different meanings (for example, "match" means different things when you're playing tennis and when you're lighting a fire).

Treating each character as a token, on the other hand, produces a smaller number of possible tokens, but each one conveys much less meaningful information.
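To make the trade-off concrete, here is a minimal sketch of the two naive schemes described above (the function names are illustrative, not from any real tokenizer library):

```python
def word_tokens(text):
    # One token per whitespace-separated word: short sequences, but any
    # unseen word (like "Apoploe") has no entry in the model's vocabulary,
    # and "match" gets a single token no matter which meaning is intended.
    return text.split()

def char_tokens(text):
    # One token per character: a tiny, complete vocabulary, so nothing is
    # ever out-of-vocabulary, but each token carries very little meaning.
    return list(text)

print(word_tokens("Apoploe vesrreaitais"))  # ['Apoploe', 'vesrreaitais']
print(char_tokens("match"))                 # ['m', 'a', 't', 'c', 'h']
```

Note that the word-level scheme can only shrug at a gibberish word, while the character-level scheme happily accepts it; this is exactly the tension the next approach tries to balance.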

DALL-E 2 (and other models) use an in-between approach called byte-pair encoding (BPE). Inspecting the BPE representations for some of the gibberish words suggests this could be an important factor in understanding the "secret language".
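BPE sits between the two extremes: it starts from characters and repeatedly merges the most frequent adjacent pair into a new subword token. The sketch below is a toy illustration of that idea trained on a made-up three-word corpus; it is not DALL-E 2's actual tokenizer, which uses a much larger vocabulary learned from web-scale data.

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Learn merge rules from a toy corpus: repeatedly fuse the most
    frequent adjacent pair of symbols into a new subword symbol."""
    words = [list(w) for w in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in words:
            for a, b in zip(w, w[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        words = [merge_pair(w, best) for w in words]
    return merges

def merge_pair(symbols, pair):
    # Replace every occurrence of the adjacent pair with its fusion.
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def apply_bpe(word, merges):
    # Tokenize a new word by replaying the learned merges in order.
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols

merges = learn_bpe(["low", "lower", "lowest"], num_merges=3)
print(apply_bpe("lowest", merges))  # ['lowe', 's', 't']
print(apply_bpe("slower", merges))  # ['s', 'lowe', 'r']
```

The key point for the "secret language" claims: a gibberish string is never rejected. It always maps to some sequence of subword tokens, and those tokens may overlap with fragments of real words (such as the shared prefix of Apoploe and Apodidae) that the model saw during training.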

Not the whole picture

The "secret language" could also just be an example of the "garbage in, garbage out" principle. Because DALL-E 2 can't say "I don't know what you're talking about," it will always generate some kind of image from the given input text.

Either way, none of these options is a complete explanation of what's happening. For instance, removing individual characters from gibberish words appears to corrupt the generated images in very specific ways. And it seems individual gibberish words don't necessarily combine to produce coherent compound images (as they would if there really were a secret "language" under the covers).

Why this is important

You might be wondering whether any of this really matters, beyond intellectual curiosity.

The answer is yes. DALL-E's "secret language" is an example of an "adversarial attack" against a machine learning system: a way to break the system's intended behavior by deliberately choosing inputs the AI doesn't handle well.

One reason adversarial attacks are concerning is that they challenge our confidence in the model. If the AI interprets gibberish words in unintended ways, it might also interpret meaningful words in unintended ways.

Adversarial attacks also raise security concerns. DALL-E 2 filters input text to prevent users from generating harmful or abusive content, but a "secret language" of gibberish words might allow users to circumvent these filters.

Some language AI models have been found to have adversarial "trigger phrases": short nonsense phrases such as "zoning tapping fiennes" that can reliably prompt the models to spew out racist, harmful, or biased content. This work is part of a broader effort to understand and control how deep learning systems learn from data.

Finally, phenomena like DALL-E 2's "secret language" raise interpretability concerns. We want these models to behave as a human would expect, but seeing structured output in response to gibberish confounds those expectations.

Shining a light on existing concerns

You may recall the uproar in 2017 over some Facebook chatbots that were said to have "invented their own language". The present situation is similar in that the results are concerning, but not in the "Skynet is coming to take over the world" sense.

Instead, DALL-E 2's "secret language" underscores existing problems regarding deep learning systems' resilience, security, and interpretability.

We won't know what's going on until these technologies are more generally available – and, in particular, until people from a wider range of non-English cultural backgrounds can use them.

In the meantime, if you'd like to try generating some of your own AI images, you can check out DALL-E mini, a freely available smaller model. Just be careful which words you use to prompt the model (English or gibberish, your choice).