
AI

OpenAI debuts GPT-4o ‘omni’ model now powering ChatGPT


OpenAI CTO Mira Murati unveiling ChatGPT's advanced voice mode
Image Credits: OpenAI

OpenAI announced a new flagship generative AI model on Monday that it calls GPT-4o. The “o” stands for “omni,” referring to the model’s ability to handle text, speech, and video. GPT-4o is set to roll out “iteratively” across the company’s developer and consumer-facing products over the next few weeks.

OpenAI CTO Mira Murati said that GPT-4o provides “GPT-4-level” intelligence but improves on GPT-4’s capabilities across multiple modalities and media.

“GPT-4o reasons across voice, text and vision,” Murati said during a streamed presentation at OpenAI’s offices in San Francisco on Monday. “And this is incredibly important, because we’re looking at the future of interaction between ourselves and machines.”

GPT-4 Turbo, OpenAI’s previous “most advanced” model, was trained on a combination of images and text and could analyze images and text to accomplish tasks like extracting text from images or even describing the content of those images. But GPT-4o adds speech to the mix.

What does this enable? A variety of things. 


GPT-4o greatly improves the experience in OpenAI’s AI-powered chatbot, ChatGPT. The platform has long offered a voice mode that reads the chatbot’s responses aloud using a text-to-speech model, but GPT-4o supercharges this, allowing users to interact with ChatGPT more like an assistant.

For example, users can ask the GPT-4o-powered ChatGPT a question and interrupt ChatGPT while it’s answering. The model delivers “real-time” responsiveness, OpenAI says, and can even pick up on nuances in a user’s voice, in response generating voices in “a range of different emotive styles” (including singing). 

GPT-4o also upgrades ChatGPT’s vision capabilities. Given a photo or a desktop screen, ChatGPT can now quickly answer related questions on topics ranging from “What’s going on in this software code?” to “What brand of shirt is this person wearing?”

ChatGPT’s desktop app in use in a coding task.
Image Credits: OpenAI

These features will evolve further in the future, Murati says. While today GPT-4o can look at a picture of a menu in a different language and translate it, in the future, the model could allow ChatGPT to, for instance, “watch” a live sports game and explain the rules to you.

“We know that these models are getting more and more complex, but we want the experience of interaction to actually become more natural, easy, and for you not to focus on the UI at all, but just focus on the collaboration with ChatGPT,” Murati said. “For the past couple of years, we’ve been very focused on improving the intelligence of these models … But this is the first time that we are really making a huge step forward when it comes to the ease of use.”

GPT-4o is more multilingual as well, OpenAI claims, with enhanced performance in around 50 languages. And in OpenAI’s API and Microsoft’s Azure OpenAI Service, GPT-4o is twice as fast as GPT-4 Turbo, costs half as much and has higher rate limits, the company says.

At present, voice isn’t a part of the GPT-4o API for all customers. OpenAI, citing the risk of misuse, says that it plans to first launch support for GPT-4o’s new audio capabilities to “a small group of trusted partners” in the coming weeks.

GPT-4o is available in the free tier of ChatGPT starting today and to subscribers to OpenAI’s premium ChatGPT Plus and Team plans with “5x higher” message limits. (OpenAI notes that ChatGPT will automatically switch to GPT-3.5, an older and less capable model, when users hit the rate limit.) The improved ChatGPT voice experience underpinned by GPT-4o will arrive in alpha for Plus users in the next month or so, alongside enterprise-focused options.

In related news, OpenAI announced that it’s releasing a refreshed ChatGPT UI on the web with a new, “more conversational” home screen and message layout, and a desktop version of ChatGPT for macOS that lets users ask questions via a keyboard shortcut or take and discuss screenshots. ChatGPT Plus users will get access to the app first, starting today, and a Windows version will arrive later in the year.

Elsewhere, the GPT Store, OpenAI’s library of and creation tools for third-party chatbots built on its AI models, is now available to users of ChatGPT’s free tier. And free users can take advantage of ChatGPT features that were formerly paywalled, such as a memory capability that allows ChatGPT to “remember” preferences for future interactions, file and photo uploads, and web search for answers to timely questions.
