Thursday, November 14, 2024

Google begins rolling out voice capabilities in Gemini with Gemini Live

Google is trying to make its AI assistant Gemini more useful by adding a conversation mode called Gemini Live, similar to how conversations in ChatGPT work.

Gemini Live has a voice mode, so that users can speak their questions out loud rather than typing. This voice mode works even when the app is in the background or the phone is locked, which allows conversations to happen even when the user isn’t directly interacting with the Gemini app. 

According to Google, users can also interrupt Gemini while it is reading out its response in order to ask follow-up questions.

“For years, we’ve relied on digital assistants to set timers, play music or control our smart homes. This technology has made it easier to get things done and saved valuable minutes each day. Now with generative AI, we can provide a whole new type of help for complex tasks that can save you hours. With Gemini, we’re reimagining what it means for a personal assistant to be truly helpful. Gemini is evolving to provide AI-powered mobile assistance that will offer a new level of help — all while being more natural, conversational and intuitive,” Sissie Hsiao, vice president and general manager of Gemini experiences and Google Assistant, wrote in a blog post.

Users can select from 10 different voices with different styles and tones, such as calm, bright, or engaged. 

Gemini Live has begun rolling out in English to Gemini Advanced subscribers on Android. Gemini Advanced is a subscription that costs $19.99 per month, though Google does offer a one-month trial. The company said that within the next few weeks Gemini Live will roll out to other languages and to iOS as well.

In addition, Google said that Gemini will be the default assistant on Pixel 9 phones, which were also announced yesterday.

“While AI unlocks powerful new capabilities, it also presents new challenges,” Hsiao wrote. “Ironically, using large language models that can better interpret natural language and handle complex tasks often means simple tasks take a moment longer to complete. And while generative AI is flexible enough to complete a wide array of tasks, it can sometimes behave in unexpected ways or provide inaccurate information … Today, we’ve arrived at an inflection point where we believe the helpfulness of an AI-powered assistant far outweighs its challenges.”

Google also revealed that in the next couple of weeks it will introduce new Gemini extensions for Keep, Tasks, and Utilities, as well as advanced YouTube Music features.

“Let’s say you’re hosting a dinner party: Have Gemini dig out that lasagna recipe Jenny sent you in your Gmail, and ask it to add the ingredients to your shopping list in Keep. And since your guests are your college friends, ask Gemini to ‘make a playlist of songs that remind me of the late ’90s.’ Without needing too many details, Gemini gets the gist of what you want and delivers,” Hsiao wrote.

