Why GenAI might pose a risk to itself

In 2017, Facebook's AI bots Alice and Bob developed their own language, leading to the experiment's halt and sparking global AI concerns. A 2023 study highlights that large language models (LLMs) trained on synthetic data can suffer from Model Aut...

By ET Online | Aug 23, 2024, 01.18 PM IST

In 2017, Facebook introduced two experimental AI bots, Alice and Bob, to negotiate trades involving valuable items. As the bots learned and evolved, they created their own language to communicate, which caused concern among engineers. The team overseeing the experiment chose to shut down the bots when their conversations became incomprehensible. This unsettling development captured the attention of the global AI community.

MAD-ness syndrome
A peer-reviewed paper published in 2023 by a group of researchers at Rice and Stanford Universities offers an indication of what might have happened to Alice and Bob. The paper describes GenAI’s susceptibility to going MAD (Model Autophagy Disorder) without sufficient backstops.

A large language model (LLM) evolves by continuously processing both real and synthetic data, though it often relies heavily on the latter, as seen with many popular models. When LLMs are trained on synthetic data (potentially biased content generated by AI), the model may deteriorate over time or, as researchers describe it, could develop MAD (Model Autophagy Disorder).

Shankar J, a brain-computer interface researcher, dives deep into domain specifics like neural networks, data analysis and electronics design as part of his coursework at the National Institute of Technology in Calicut. Shankar says models trained using GenAI content tend to amplify their own biases and errors. Degradation of models, reduced novelty, bias amplification and hallucination, he says, are some common symptoms of MAD-ness.

A study by Gartner finds that by the end of 2024, 60% of the data used for developing AI and analytics will be synthetic. Shankar says some popular LLMs, including GPT-3, GPT-4, BERT, Claude and LLaMA, and image generation models such as DALL-E and Stable Diffusion are prone to MAD.

The Rice-Stanford paper draws analogies to mathematical concepts (contraction mapping, unstable feedback loops) and to the biological phenomenon of mad cow disease, pointing to how training models on synthetic data could risk GenAI going berserk.

How soon a model degrades, the researchers say, depends on the number of iterations of training solely on synthetic data, on how complex the task or application is (more complex ones will degrade faster), and on the model architecture.
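The feedback loop the researchers describe can be illustrated with a toy simulation. The sketch below is not the method used in the paper: it assumes the "model" is just a Gaussian curve fitted to a handful of numbers, and every name in it (`self_consuming_loop`, `fresh_fraction`, the sample size, the number of generations) is a hypothetical choice made for illustration. Each generation, the model is refitted to samples drawn from the previous generation's model, optionally mixed with some fresh real data.

```python
import random
import statistics


def self_consuming_loop(generations=200, sample_size=10, fresh_fraction=0.0, seed=0):
    """Toy self-consuming loop; returns the fitted spread after each generation.

    Assumption for illustration only: the "real" data follows N(0, 1) and the
    "model" is a Gaussian refitted to a small sample every generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # the current model starts as a perfect fit
    n_fresh = int(sample_size * fresh_fraction)
    history = []
    for _ in range(generations):
        # Synthetic samples come from the current model's own output.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size - n_fresh)]
        # Fresh samples come from the real distribution.
        data += [rng.gauss(0.0, 1.0) for _ in range(n_fresh)]
        # "Retrain": refit the mean and spread to this generation's data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        history.append(sigma)
    return history


synthetic_only = self_consuming_loop(fresh_fraction=0.0)
half_fresh = self_consuming_loop(fresh_fraction=0.5)
print(f"synthetic only, final spread: {synthetic_only[-1]:.3g}")
print(f"half fresh data, final spread: {half_fresh[-1]:.3g}")
```

In this toy setting, the spread of the synthetic-only model shrinks towards zero over the generations, a loss of diversity, while the run that keeps mixing in real data holds on to much more of its spread. Real generative models are far more complex, so this should be read only as an intuition for why repeated training on a model's own output can compound.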

Decreasing quality
Another research paper, published last February and co-written by five researchers at King Abdullah University of Science and Technology, the University of Macau and the China-based Ant Group, shares the same concerns. The paper, titled ‘Autophagy makes large models achieving local optima’, says that when LLMs rely on AI content for ‘refined’ learning, they tend to change the content. For text, they might alter the style or add details. In the case of images, some features might be changed. This, according to the research, is enough to reduce the diversity of the data used to train future AI models. It might also narrow the variety of information people are exposed to. This trend limits the performance of next-generation models.

“Given the decreasing quality of Gen AI-generated content (blogs, articles, LinkedIn posts), Gen AI has very likely entered this mad phase now,” says Robin Alex Panicker, software entrepreneur and coder.

There are studies showing that AI content generated for websites and social media posts can lack quality unless people edit it. Affected aspects include SEO and content quality (the same information repeated with minor changes in wording, regurgitated information). Google now has policies in place to promote expert-backed content. AI content published without editing or human intervention could see a drop in search rankings. Experts say blogs and websites that leaned heavily on GenAI for content have seen big drops in traffic and ranking since Google’s policy revisions in March.

(With TOI inputs)
