- OpenAI’s flagship chatbot, ChatGPT, is getting a major tweak that will transform how users interact with it.
- Reportedly, the world-famous chatbot is getting a brand-new text-to-speech model that can quickly generate a human-like voice from a brief speech sample.
- It will now be able to see, hear, and speak, becoming more intuitive.
- The voice feature is currently available only in English.
OpenAI made history last November when it launched its flagship chatbot, ChatGPT. The product took the internet by storm, making text generation and query-solving hassle-free.
The AI firm has been steadily improving the chatbot's features, and it has now announced voice conversation capabilities, with speech recognition handled by Whisper, its open-source speech-to-text system. Speaking with ChatGPT, users can do things like settle a debate or request a bedtime story. OpenAI has also upgraded the app to apply its reasoning skills to queries about images.
For example, it can offer insight into various image types, including photos, screenshots, and documents containing pictures or text. It can also help troubleshoot problems, such as why a grill isn't working, or assist with planning a meal. Moreover, the chatbot can analyze complex graphs.
“You can now use voice to engage in a back-and-forth conversation with your assistant,” OpenAI explained when announcing the new iOS and Android upgrade.
ChatGPT's new text-to-speech model can generate human-like audio from text after hearing only a few seconds of sample speech. This means the world's favorite chatbot is about to become a lot chattier.
Users can send audio queries directly, and the app will respond in a synthesized voice. They can choose among five different voices, which OpenAI created in collaboration with professional voice artists.
In addition, the new version will feature visual smarts: users can snap or upload a photo within ChatGPT, and it will quickly generate a description of the image, much like Google Lens. Using a newly added drawing tool, users can also focus the conversation on a selected image or part of an image, and interact with the app through voice commands. OpenAI has fed audio and visual data into the chatbot's machine-learning models to make it more perceptive and human-like.
Use your voice to engage in a back-and-forth conversation with ChatGPT. Speak with it on the go, request a bedtime story, or settle a dinner table debate. Sound on 🔊 pic.twitter.com/3tuWzX0wtS
— OpenAI (@OpenAI) September 25, 2023
The addition of such a wide range of speech and image-analysis features shows that OpenAI is constantly refining its AI models to make them more intuitive and interactive. With these capabilities, ChatGPT can compete with Amazon's Alexa or Apple's Siri, and may even prove more attractive to consumers than other AI chatbots.
Because ChatGPT's new voice generation technology was built in-house, OpenAI can also license it to other companies. For instance, Spotify can use OpenAI's speech synthesis algorithms to offer a podcast translation feature, delivering content in different languages in the original podcaster's voice.
The new features are initially available only on the subscription-based version of ChatGPT, which costs $20 per month. They will roll out in all markets where ChatGPT is available, but for now the voice feature works only in English.