Saturday, November 9, 2024

The Next Phone Era Is Almost Here. This Is How It’s Shaping Up

In sci-fi shows like Westworld and The Expanse, characters use evolved mobile devices with sleek transparent designs that can project holograms or magically morph their interface as needed.

Fanciful? Maybe. But smartphones are about to take a leap forward.

Today, tech giants are putting the pieces in place for advancements in artificial intelligence that will push the smartphone into its next era — and soon. Within the next few years, experts say, our phones will be much more intuitive, helpful and even friendly. That shift would not only make phones feel fresh and interesting in ways they haven’t been for years but could also liberate us from being immersed in our screens. 

Just last week, Apple announced the iPhone 16 and iPhone 16 Pro, which it’s trumpeting as the first iPhones built for the company’s new AI system, Apple Intelligence. The new phones include a dedicated button for triggering the camera and “visual intelligence,” an AI-fueled camera mode that lets you learn about your environment just by pointing your phone and pushing a button.

In the future that companies like Apple, OpenAI, Samsung and Google are sketching out, your phone’s camera won’t just be for snapping selfies or documenting your vacation — it’ll be a wider window into the world around you. Apple’s upcoming visual intelligence feature is just one example.

Most smartphones will continue to look like the slabs of glass and metal we carry around today for the foreseeable future (sorry, no holograms or see-through screens yet). But they’ll get better at surfacing information the moment we need it, almost like the adaptable software shown on the devices in The Expanse. Instead of constantly jumping between apps and digging through menus, you may find yourself simply speaking to your phone to get things done. Or better yet, your phone’s screen may show exactly what you need without you even having to ask.

“My very top-down view is that long term, I would like a phone where you would never have to go into the settings menu,” Patrick Chomet, Samsung’s executive vice president and head of customer experience, tells me. “You shouldn’t have to know the name of any feature. And we wouldn’t have to name them, because the device would have full enough intelligence and context to support the actions that you want to do.”

Speaking with tech executives, analysts and futurists, and drawing on my own experience trying new devices and cutting-edge virtual assistants, has painted a picture of where the smartphone is headed.

AI arrives to breathe new life into the smartphone

At a time when the majority of people in the world own a smartphone, the idea of having a supercomputer in your pocket isn’t as novel as it once was. It’s harder than ever to get consumers excited about new phones — as innovative as folding phones are, they haven’t exactly ignited much fervor among consumers. And there’s data to show it. Forty-four percent of smartphone owners upgrade their phone only when it breaks or needs replacing, according to a CNET survey published in September based on data collected by YouGov.

The same survey also indicated that consumers hold onto their phones for three years or longer, aligning with comments Verizon CEO Hans Vestberg made to CNBC in July that subscribers keep their phones for way over 36 months. Although the smartphone market began to rebound this year, sales reached nearly a decade low in 2023, according to Counterpoint.

“The smartphone has become commoditized, so they’re all fairly similar,” said Amy Webb, a quantitative futurist and founder and CEO of the Future Today Institute. “Even though you have two predominant operating systems that are different, they’re also not totally different from each other.”

But following the explosive launch of ChatGPT in late 2022, smartphone makers saw an opportunity for AI to chart a new path for our smartphones. The first wave of generative AI features has largely focused on very specific use cases, such as editing a photo, summarizing a transcript or translating a conversation. Niche as these tools may be, they bring something to the smartphone experience that’s been sorely lacking in recent years: features that feel genuinely new, like the ability to create an image from scratch with just a few taps.

Yet this is just the beginning of what tech leaders see as being a more significant breakthrough in the smartphone’s growth. The next stage of AI-fueled updates could lay the groundwork for a future in which we don’t need to open as many apps, menus or services to get things done on our phones.

It’s a transformation that Sameer Samat, Google’s president of the Android ecosystem, and Sissie Hsiao, vice president and general manager of Gemini experiences, have been thinking about a lot — and that’s important considering Android powers nearly 80% of the world’s smartphones, according to Counterpoint Research.

Samat tells me how Google is “rebuilding” Android, with AI being at the center of it all, and of course Gemini — the company’s AI assistant — being the star.

“This is not your traditional assistant anymore,” Hsiao said. “This is really capable of doing new things.”

Phones have eyes (and voices)

Advancements in generative AI have made virtual assistants much smarter and more conversational than the Siris and Alexas of years past. Instead of just answering questions, voice-enabled helpers are getting better at sounding more natural and convincing. 

Nick Turley, OpenAI’s product lead for ChatGPT, thinks speaking to our devices will go from niche to mainstream the same way chatbots have over the past two years.

“A year from now, I would suspect that voice could be the primary way that people use [ChatGPT],” he said.

It’s not just words anymore, either; tech companies and phone makers want our devices to get better at “seeing” our surroundings too. That message was clear during Apple’s annual device event on Sept. 9, where it introduced a new feature coming to the iPhone 16 called visual intelligence.

With the push of the iPhone 16’s new Camera Control button, you’ll be able to point the phone at a restaurant and find its hours or scan a flier for an event to add it to your calendar. There are also on-screen buttons for launching a Google search based on your photo or for asking ChatGPT about an image. Based on Apple’s pre-recorded demo, it looks like a new type of visual interface for the iPhone that leans on the camera as the primary means of input, instead of opening an app and typing or swiping. 

Apple is just the latest tech company to explore an idea like this. In May, OpenAI showed how the chatbot could recognize math equations and provide tips in real time, like a virtual math tutor, just by pointing your phone at the problem. Google’s Gemini helper can analyze the contents of a YouTube video to answer questions about it.

For Google, Gemini’s current capabilities are laying the groundwork for a more ambitious rethinking of the virtual assistant: Project Astra. The prototype digital assistant can “see” and “understand” your surroundings using your phone’s camera and combine that data with speech input to process requests.

In a video shown at Google’s developer conference in May, the user pointed a phone at a speaker sitting on someone’s desk, drew an arrow on the phone’s screen pointing at a particular part of the speaker and asked, “What is that part of the speaker called?” Astra responded that it was a tweeter, and explained what that component does. CNET’s Lexy Savvides briefly tried Google’s Project Astra demo at Google I/O and witnessed how it could generate a story based on pictures of animals held in front of the camera. 

“[Gemini Live] is really the start of a journey towards fully expressive, multi-modality,” Hsiao said, adding that the first place the tech behind Astra will be widely available is in Gemini on smartphones, although she didn’t specify launch timing. “Already in Gemini Live, you can speak to it; it speaks back to you. We envision taking that with Astra into turning on the camera so it can also see and engage with you in audio full bore.”

It’s all starting to feel more like science fiction, so much so in fact that OpenAI found itself in hot water over accusations that it copied the voice of Scarlett Johansson, who played the astoundingly human-like virtual assistant in the 2013 Spike Jonze movie Her, for one of ChatGPT’s voices.

Saar Gur, general partner at the venture capital firm CRV, who specializes in spotting companies that lean into shifts in consumer behavior, thinks voice interaction as a computing interface is “underestimated” today. He believes speaking with AI agents will increasingly become the norm.

“It will be much more common instead of, ‘let me Google that,’ that this voice [assistant] will enter a conversation you and I are having,” Gur said.

Going a step beyond that, Gur sees an opportunity for AI personas to provide entertainment and companionship in ways they aren’t today. He refers to his teenage son, who frequently chats with other gamers on Discord, as a hypothetical example of how AI avatars can be used for more than just retrieving information.

“Many of his friends are people he’s never met,” he said. “The idea that now there can be chatbots that are actually much safer for him to interact with, because they’re not real people that could then take his password and share it somewhere else.”

And he’s not alone; entrepreneur Avi Schiffmann made headlines in July for developing a smart pendant with an embedded AI assistant called Friend that’s designed specifically for, well, friendship.

Smarter software

For devices with “smart” in their name, today’s phones don’t always feel very intelligent. It’s up to the user to do a lot of the work when it comes to basic tasks, like toggling settings and catching up on notifications.

“For the most part, the interaction, the process is still very manual,” Webb said. “It requires you to look at a screen and type some stuff in.”

But tech companies think AI might be the key to solving that problem, and it could prevent us from keeping our noses buried in our phones as often. It’s one of the key tenets of Apple Intelligence, which will start rolling out next month, as evidenced by visual intelligence and other previously announced features. 

Apple’s upgraded Siri is equipped with knowledge about Apple product settings, which should enable it to be a personal IT department and copilot for navigating your device, among other things. Apple’s 13-year-old virtual assistant will also be able to take actions for you within apps and is getting better at understanding the context behind the information stored on your phone, enabling it to answer new types of questions.

One example shown at the company’s Worldwide Developers Conference in June involved asking Siri a question like, “When is my mom’s flight landing?” and having it cross reference email and real-time flight tracking. You would then be able to follow up with a query like, “What’s our lunch plan?” and have Siri extract details from a text message.

ChatGPT, best known for providing conversational, human-like responses today, will also likely evolve to do more things on your behalf, Turley says.

“While ChatGPT has begun to do things like create an image or perform a task, in many cases, [it’s] still giving you text back,” he said. “And I imagine that ChatGPT in five years is out there performing actions for you, on your behalf, rather than just responding.”

Chomet, the Samsung executive, has bold ideas for how AI could make our phones easier to use. His long-term goal is to make it so that users never have to open the settings menu on a Samsung phone ever again. The company’s approach is all about making various “touch points” on Samsung devices — that is, the parts of the operating system we interact with, such as the keyboard and camera — smart enough to predict what the user wants.

He acknowledges ways in which Galaxy phones are already doing this, such as the keyboard, which now includes built-in tools for text translation and rewriting messages. Other aspects of the operating system, like widgets, notifications, the lock screen and settings menu on Samsung phones, are next.

Chomet sees a future in which you might not even have to think about what to do next on your phone.

“[You’d] never have to go to the settings, or you never have to look for the next action,” he said. “You may not need to open [an] app.”

It’s a problem that companies beyond OpenAI, Google, Samsung and Apple are trying to solve. Startup Brain.ai, for example, has created smartphone software that can assemble an interface based on the task at hand rather than ping-ponging between apps.

My colleague Katie Collins saw the technology in action earlier this year at Mobile World Congress during a demo in which Brain.ai CEO Jerry Yue simply asked the phone to book a flight for two people in first class. The phone conjured up the necessary information for the flight selection, booking and payment process, all without having to open and close different apps and windows.

Then there’s Rabbit, the buzzy AI startup that garnered plenty of attention at the CES tech conference in January for its handheld AI voice assistant, the Rabbit R1. When the device launched in April, however, reviewers (including CNET) criticized the device for buggy performance and limited functionality. 

But Jesse Lyu, the company’s founder and CEO, is still convinced the R1 represents a step toward a future in which AI can handle everything for us simply by asking. Next week, the Rabbit R1 is getting a new feature called LAM Playground, which Lyu claims will enable it to answer complex web-based requests that involve stringing multiple ideas together, such as: “Go to Reddit, search for the best recommendations for TVs in 2024, and then go to Best Buy and order it.”

It’s not the smartphone itself that he has a problem with but its app-centric operating system that feels old-fashioned. 

“We’re not saying, ‘Hey, R1 on day one is better [than the] iPhone,’” Lyu said in an interview with CNET. “We think it’s really, really wrong to say that, but we firmly believe this app based system is going to disappear in the future.”

He also hasn’t ruled out the possibility of eventually building an actual Rabbit phone powered by AI, although he didn’t say the company was working on one either.

“It’s definitely possible,” he said when asked whether we might see an app-less Rabbit phone in the future, adding that he has “zero regrets on the strategy of the R1.”

AI has big potential… and big problems

But getting to that future won’t be easy. Generative AI is already raising serious questions about whether we’re ready for a world in which tools let you manipulate and create images with the press of a button — like those already available on the latest phones from Google and Samsung.

The Verge and Digital Trends, for example, have shown how features like Google’s Reimagine (for adding objects to images that weren’t there when the photo was taken) and Pixel Studio (for generating images based on a prompt) can be used to create offensive or misleading content. It might be easy to spot an AI-generated image today, but as the technology improves that likely won’t always be the case.

In a comment to CNET, Google said these tools are meant for creativity and are designed to “respect the intent of user prompts,” meaning they may create “content that may offend when instructed by the user to do so.”

“That said, it’s not anything goes,” the statement said. “We have clear policies and Terms of Service on what kinds of content we allow and don’t allow, and build guardrails to prevent abuse. At times, some prompts can challenge these tools’ guardrails, and we remain committed to continually enhancing and refining the safeguards we have in place.”

That’s just one potential issue. There’s also the question of whether the companies behind these AI chatbots and assistants are infringing on copyrights by using web content to train their models. Plus, large language models — the underlying models that power generative AI chatbots — tend to just spew out false information from time to time, making it difficult if not impossible to trust them. During my time testing Gemini Live on the Pixel 9, Google’s chatbot provided wrong answers on more than one occasion.

“It generalizes or makes an inference based on what it knows about language, what it knows about the occurrence of words in different contexts,” Swabha Swayamdipta, assistant professor of computer science at the USC Viterbi School of Engineering, who also leads its Datasets, Interpretability, Language and Learning Lab, said in a previous interview with CNET. “This is why these language models produce facts which kind of seem plausible but are not quite true because they’re not trained to just produce exactly what they have seen before.”

Turley says progress is being made in this area, especially as ChatGPT learns to use external information to handle queries rather than just relying on its own knowledge. But until these models are 100% reliable — and it’s unclear when and if that will happen — he says users should fact-check ChatGPT’s answers on sensitive topics.

“Because even 90% reliability, and we have made a lot of progress on this topic with each model generation, still doesn’t mean that you should blindly trust the AI,” Turley said. 

Then there’s the question of whether the general public even cares about new AI features. Data suggests most people are just fine with the way their phones work today.

A quarter of respondents in CNET’s survey said they don’t find AI features helpful and don’t want to see more integrated into their mobile phone, while 45% said they’re not willing to pay a subscription for AI tools. 

Thirty-four percent said they’re concerned about privacy when it comes to using AI on mobile devices, despite efforts from companies like Apple, Google and Samsung to preserve privacy by running certain AI features locally on the device without sending information to the cloud. For requests that are too demanding to be handled on the device itself, Apple uses a system called Private Cloud Compute, which it claims will boost privacy by only sending data relevant to the specific task at hand to Apple’s servers. Samsung phones also have a switch in the settings menu that allows you to turn off cloud-based processing for its Galaxy AI features. 

Whether new generative AI features are a hit with consumers will depend on how phone makers use the technology to dream up new ways to make the sea of information stored on our devices — from location data to messages — more palatable and useful. Doing so could make the experience more personal and individualized to the specific user, potentially distinguishing AI features on new phones from cloud-based AI models that can be accessed on any device.

“You really have to show why you want to do these things on [the] device,” said Jon Erensen, a senior director and analyst for market research firm Gartner.

Devices of the future

Barring the unexpected — like the era-defining debut of ChatGPT — futurists and tech executives have a sense of the general direction smartphone evolution will take. While their theories differ, there is a common thread. As phones and peripheral devices get smarter and better at understanding our intentions, we’ll find ourselves relying on screens less. And AI, whether it provides new types of interfaces or serves as the connective tissue between our phones and devices of the future, will have a big role to play.

Webb says she’s seeing an “extraordinary number” of new devices, patents and funding rounds involving new tech devices without screens, adding that the phone as we know it may eventually “fade into the background.”

“At the moment, we are at the beginning of a Cambrian explosion of devices and sensors,” she says.

Over the past decade, smartphones have already shifted toward serving as hubs for the myriad connected devices around us, like smartwatches, wireless earbuds, smart rings and connected glasses. That web of devices is a key part of this shift that Chomet and other tech leaders see occurring, in which we no longer need to operate our phones manually in the same way anymore.

In the not-too-distant future, virtual assistants may ambiently linger between devices and answer your request on whatever gadget makes sense. It’s kind of the same idea behind The Expanse’s hand terminals, which are primarily designed to be touchpoints for other sensors and devices in the user’s environment.

“It’s not linked to a device,” Chomet said. “The intelligent agents can pick up your intention, whether you speak it via voice, or I could type the same thing.”

That may not sound too different from today’s earbuds, which are already equipped with virtual helpers like Siri and Google Gemini. But the scenario Chomet describes involves just speaking freely rather than intentionally thinking about which gadget you’re talking to. Large language models would make it possible to simply say, “What’s that?” when you hear a song playing in a coffee shop instead of having to say something like, “Hey Google, what song is this?” Chomet says.

Ian Khan, a tech consultant and host of the Amazon Prime video series The Futurist, also thinks we’ll increasingly be surrounded by more smart devices, like connected glasses or even smart jewelry. Smart glasses in particular are already starting to show promise, especially Meta’s second-generation Ray-Bans, which my colleague Scott Stein called “the best AI companion.” Google’s Project Astra demo also raised questions about whether it’s time for Google Glass, the company’s camera- and mic-equipped glasses from 2012, to make a comeback.

“It’s funny, because it’s like the perfect hardware,” Google co-founder Sergey Brin said on the subject when speaking to a group of reporters that included CNET at this year’s I/O conference. “It’s like the killer app now, 10 years later.”

But if 2024 has taught us anything, it’s that we’re not quite ready for a world completely beyond smartphones just yet. Attempts to create new, voice-first devices built around AI, such as the Rabbit R1 and Humane AI Pin, products that were seemingly inspired by some of the most beloved fictional gadgets like the Pokédex and the Star Trek communicator badge, were widely panned for not living up to expectations at launch, although they’ve both been updated significantly since then. While futurists like Khan and Webb see a world in which we’re less glued to glowing rectangles, the traditional smartphone likely isn’t going away anytime soon.

“A lot of those changes going forward are going to be about what’s on the inside,” Webb said.

However, a world in which our phones better understand our intentions and prevent us from ping-ponging back and forth between apps? That future is almost already here, according to Chomet, who predicts that shift could occur in the next one to three years. At that point, generative AI might not even seem as novel as it is today, and will instead feel more like a basic yet essential utility.

“You don’t say, ‘my phone has internet,’ or ‘my computer is internet-powered,’” Chomet said. “So I think within a year, AI will be like that.”


Visual Designers | Zooey Liao, Cole Kan

Senior Motion Designer | Jeffrey Hazelwood

Creative Director | Viva Tung

Video Producer | Jesse Orrall

Video Executive Producer | Andy Altman

Project Manager | Danielle Ramirez

Director of Content | Jonathan Skillings

Editor | Corinne Reichert

