In 2016, Google engineer Illia Polosukhin had lunch with a colleague, Jacob Uszkoreit. Polosukhin had been frustrated by a lack of progress on his project, which used AI to provide useful answers to questions posed by users, and Uszkoreit suggested he try a technique he had been brainstorming that he called self-attention. Thus began an eight-person collaboration that ultimately resulted in a 2017 paper called “Attention Is All You Need,” which introduced the concept of transformers as a way to supercharge artificial intelligence. It changed the world.
Eight years later, though, Polosukhin is not completely happy with the way things are shaking out. A big believer in open source, he’s concerned about the secretive nature of transformer-based large language models, even from companies founded on the basis of transparency. (Gee, who can that be?) We don’t know what they’re trained on or what the weights are, and outsiders certainly can’t tinker with them. One giant tech company, Meta, does tout its systems as open source, but Polosukhin doesn’t consider Meta’s models truly open: “The parameters are open, but we don’t know what data went into the model, and data defines what bias might be there and what kinds of decisions are made,” he says.
As LLM technology improves, he worries it will get more dangerous, and that the need for profit will shape its evolution. “Companies say they need more money so they can train better models. Those models will actually be better at manipulating people, and you can tune them better for generating revenue,” he says.
Polosukhin has zero confidence that regulation will help. For one thing, dictating limits on the models is so hard that the regulators will have to rely on the companies themselves to get the job done. “I don't think there's that many people who are able to effectively answer questions like, ‘Here's the model parameters, right? Is this a good margin of safety?’ Even for an engineer, it's hard to answer questions about model parameters and what’s a good margin of safety,” he says. “I’m pretty sure that nobody in Washington, DC, will be able to do it.”
This makes the industry a prime candidate for regulatory capture. “Bigger companies know how to play the game,” he says. “They'll put their own people on the committee to make sure the watchers are the watchees.”
The alternative, argues Polosukhin, is an open source model where accountability is cooked into the technology itself. Even before the transformers paper was published in 2017, Polosukhin had left Google to start a blockchain/Web3 nonprofit called the Near Foundation. Now Near is semi-pivoting to apply some of those principles of openness and accountability to what he calls “user-owned AI.” Modeled on blockchain-based crypto protocols, this approach would give AI a decentralized structure built on a neutral platform.
“Everybody would own the system,” he says. “At some point you would say, ‘We don’t have to grow anymore.’ It’s like with bitcoin—the price can go up or down, but there’s no one deciding, ‘Hey, we need to post $2 billion more revenue this year.’ You can use that mechanism to align incentives and build a neutral platform.”
According to Polosukhin, developers are already using Near’s platform to develop applications that could work on this open source model. Near has established an incubation program to help startups in the effort. One promising application is a means to distribute micropayments to creators whose content is feeding AI models.
I remark to Polosukhin that given its reputation, and Web3’s failure so far to take over the internet, crypto might not be the most reassuring analogy for a movement designed to rein in the most volatile technology of our era. “We do need help with marketing,” he admits.
One persistent argument against open source AI is that granting universal access to powerful models might empower bad actors to abuse AI for things like generating misinformation or creating new weapons. Polosukhin says that open systems aren’t really worse than what we have now. “Safety is just a masquerade that limits the functionality of these models,” he says. “All of them are jail-breakable—it’s really not that hard.”
Polosukhin has been proselytizing his idea, talking to scientists around the industry, including some of his “Attention” coauthors. Among the latter, he has found the most resonance with Uszkoreit, with whom he shared that fateful meal in 2016. Uszkoreit, whom I spoke with this week, agrees with much of Polosukhin’s argument, though he isn’t enamored of the moniker. Instead of “user-owned AI,” he’d prefer “community-owned AI.”
He is particularly excited that an open source approach, perhaps with the micropayment system Polosukhin envisions, might provide a way to resolve the tough intellectual property crisis that AI has triggered. The giant companies are now engaged in an epic legal battle with creators whose work is at the heart of their AI models. Because those companies are relentlessly profit-driven, that conflict is built in: even if they make efforts to compensate creators, those efforts would inevitably be shaped by the need to capture the most value for themselves. That would not be the case with a model built from scratch with the idea of recognizing and rewarding such contributions.
“All these questions about IP would go away if we had a way of actually recognizing the contributions of content creators,” says Uszkoreit. “For the very first time in history, we have an opportunity to quantify what the value of a piece of information is, at least over longer periods of time, to humanity.”
One question I have about the user-owned approach is where the money might come from to develop a sophisticated foundation model from scratch. Right now, Polosukhin’s platform makes use of a version of Meta’s Llama model, despite his reservations. Who would put up a billion bucks to create a totally open model? Maybe some government? A giant GoFundMe? Uszkoreit speculates that a tech company just below the top tier might fund it simply to thwart the competition. It isn’t clear.
But both Polosukhin and Uszkoreit are certain that if user-owned AI doesn’t appear before the world gets artificial general intelligence—the point where superintelligent AI begins improving AI itself—a disaster will be upon us. If that happens, Uszkoreit says, “we’re screwed.” Both he and Polosukhin think it’s inevitable that AI scientists will create intelligence that is smart enough to improve itself. If the current situation persists, the first to get there will probably be a big tech company. “Then you get this runaway effect where suddenly a few corporations, or maybe the one that gets it first, ends up with a money-printing machine that ultimately creates a zero-sum game that sucks the air out of the economy, and that we can't let happen.”
Does Polosukhin, when contemplating worst-case scenarios, ever have second thoughts on his role in leveling up AI? Not really. “Whichever breakthrough happened, would have happened with or without us, maybe at a different time,” he says. “For us to continue evolving, we do need to have a different structure, and that's what I'm working on. If user-owned AI exists, it levels the playing field, and OpenAI and Google and the others are not able to become monopolists—they would be playing the same game, on the same field. It evens out the risks and the opportunities.” You might even call it … transformative.
Time Travel
Earlier this year, I wrote about how the authors of the “Attention Is All You Need” paper invented transformers. The breakthrough helped ignite the generative AI movement. A seemingly random conversation between Polosukhin and Uszkoreit was a key moment in what became a historic achievement of the fateful eight Googlers—all of whom have now left the company.
One day in 2016, Uszkoreit was having lunch in a Google café with a scientist named Illia Polosukhin. Born in Ukraine, Polosukhin had been at Google for nearly three years. He was assigned to the team providing answers to direct questions posed in the search field. It wasn’t going all that well. “To answer something on Google.com, you need something that’s very cheap and high-performing,” Polosukhin says. “Because you have milliseconds” to respond. When Polosukhin aired his complaints, Uszkoreit had no problem coming up with a remedy. “He suggested, why not use self-attention?” says Polosukhin.
Polosukhin sometimes collaborated with a colleague named Ashish Vaswani. Born in India and raised mostly in the Middle East, he had gone to the University of Southern California to earn his doctorate in the school’s elite machine translation group. Afterward, he moved to Mountain View to join Google—specifically a newish organization called Google Brain. He describes Brain as “a radical group” that believed “neural networks were going to advance human understanding.” But he was still looking for a big project to work on. His team worked in Building 1965, next door to Polosukhin’s language team in Building 1945, and he heard about the self-attention idea. Could that be the project? He agreed to work on it.
Together, the three researchers drew up a design document called “Transformers: Iterative Self-Attention and Processing for Various Tasks.” They picked the name “transformers” from “day zero,” Uszkoreit says. The idea was that this mechanism would transform the information it took in, allowing the system to extract as much understanding as a human might—or at least give the illusion of that. Plus Uszkoreit had fond childhood memories of playing with the Hasbro action figures. “I had two little Transformer toys as a very young kid,” he says. The document ended with a cartoony image of six Transformers in mountainous terrain, zapping lasers at one another. There was also some swagger in the sentence that began the paper: “We are awesome.”
You can read the full feature here.
Ask Me One Thing
Dileep asks, “How do you think students can best use the available AI tools, and how can professors best teach them how to use them? I am encouraging tools such as ChatGPT to help them get a leg up. But they are using it where it is perfectly evident to anyone reading that they have used ChatGPT.”
Thanks for the question, Dileep, and also for providing some context. You told me that you teach at a school where many students are from immigrant families and are working to fund their education. It seems you are struggling with the question of whether using generative AI as a shortcut to adequate work is helping them “get a leg up,” as you say, or providing a deceptive impression that they have mastered a genuine skill.
Let’s put aside the uses of ChatGPT to do information-gathering tasks like presenting a big picture on a subject or summarizing long texts—these don’t seem too controversial. (I would say, however, that actually interacting directly with source material provides a deeper understanding that pays off in the long run.) The question for our times is how much students should rely on generative AI to produce their work. In one sense, it’s a repeat of the old argument over whether using calculators to do math work is cheating. The counterargument is that in the real world people no longer have to work things out on paper, and it is ridiculous to pretend that isn’t the case. That latter approach has won the day.
But writing essays based on research and logic is a different situation. Organizing one’s research and presenting it as an argument is an effective way to train the mind to think empirically. Persuasive writing requires you to consider a skeptical reader’s point of view, and forces you to present arguments that are backed by evidence, in a succinct and powerful way. Mastering that skill hones your own mind. One can make the argument that ChatGPT and other products might ultimately negate the need to write essays, as if those LLMs were calculators for prose. But the idea isn’t to provide a bunch of words for some teacher to read—it’s to struggle through whatever process trains a mind to think logically and empathetically.
So the problem isn't that ChatGPT essays are mediocre. Even if the output were fantastic, the fact would remain that there is minimal educational value in this shortcut. If your students want “a leg up” in the world, they must do the difficult work of marshaling their evidence, mastering logic, and expressing those thoughts coherently. This will stand them in good stead for everything they do in the workplace and beyond. For some of them this will be really difficult—I used to teach freshman composition at a state school and have seen what a challenge it can be. Also, it will be really hard for you to convince them to ditch this magical new prosthesis called ChatGPT. But the payoff will be worth it.
You can submit questions to mail@wired.com. Write ASK LEVY in the subject line.
End Times Chronicle
Add another victim to the June heat wave: a wax monument to Abraham Lincoln, whose melting head had to be removed. The sculptor molded it to withstand temperatures of 140 degrees. That doesn’t work in 2024.
Last but Not Least
An author explains why she agreed to be an AI bot that chatted with readers of Romeo and Juliet.
Did you think that SimCity was about land use? No, it was a secret libertarian blueprint!
Meta’s Ray-Ban smart glasses promise to crush the language barrier. A trip to Montreal proved otherwise.
Are you celebrating the holiday? Here are the best fire pits to gather around. Don’t use them in a national park, please.
Don't miss future subscriber-only editions of this column. Subscribe to WIRED (50% off for Plaintext readers) today.