Would Large Language Models Be Better If They Weren’t So Large?

When it comes to artificial intelligence chatbots, bigger is typically better.

Large language models like ChatGPT and Bard, which generate conversational, original text, improve as they are fed more data. Every day, bloggers take to the internet to explain how the latest advances — an app that summarizes articles, A.I.-generated podcasts, a fine-tuned model that can answer any question related to professional basketball — will “change everything.”

But making bigger and more capable A.I. requires processing power that few companies possess, and there is growing concern that a small group, including Google, Meta, OpenAI and Microsoft, will exercise near-total control over the technology.

Also, bigger language models are harder to understand. They are often described as “black boxes,” even by the people who design them, and leading figures in the field have expressed unease that A.I.’s goals may ultimately not align with our own. If bigger is better, it is also more opaque and more exclusive.

In January, a group of young academics working in natural language processing — the branch of A.I. focused on linguistic understanding — issued a challenge to try to turn this paradigm on its head. The group called for teams to create functional language models using data sets that are less than one-ten-thousandth the size of those used by the most advanced large language models. A successful mini-model would be nearly as capable as the high-end models but much smaller, more accessible and more compatible with humans. The project is called the BabyLM Challenge.
