OpenAI's ChatGPT Evolves into a Verbal Conversationalist


OpenAI, a leading force in artificial intelligence, has announced new voice and image capabilities for its AI assistant, ChatGPT. The update moves ChatGPT from a text-only chatbot toward an interactive assistant that can listen, speak, and interpret images, and it marks a significant step for generative AI.

The future is now: OpenAI’s ChatGPT can directly engage in verbal conversations and respond to image-based prompts.

ChatGPT: The Evolution

ChatGPT has been a technological sensation since its launch in November 2022, allowing users to generate essays, poems, and summaries from simple text prompts. OpenAI has now raised the bar by introducing voice conversations and image understanding, making ChatGPT more interactive.

Users can now hold voice conversations with the chatbot, asking it to create bedtime stories, answer questions, and even explain what an uploaded image shows. This advancement results from OpenAI integrating its large language models (LLMs) with voice-based assistant technology.

The Technology Behind the Voice Feature

The voice feature of ChatGPT is powered by a new text-to-speech model that can produce human-like voices from text and a few seconds of sampled speech. OpenAI collaborated with professional voice actors to create five voices. In the other direction, it uses its open-source Whisper speech recognition system to transcribe users' spoken words into text.
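
The flow described above (speech recognition, then a language model, then text-to-speech) can be pictured as a three-stage pipeline. The following Python sketch is illustrative only and is not OpenAI's implementation: the names run_voice_turn, VoiceTurn, and "voice_1", along with the three stand-in stage functions, are hypothetical placeholders that do no real audio or model processing.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One round trip of a voice conversation (hypothetical structure)."""
    transcript: str     # what the speech recognizer heard
    reply_text: str     # what the language model answered
    reply_audio: bytes  # the answer rendered as speech


def run_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate_reply: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
    voice: str = "voice_1",  # placeholder label, not a real voice name
) -> VoiceTurn:
    """Chain the three stages: speech -> text -> reply text -> speech."""
    transcript = transcribe(audio_in)            # speech recognition stage
    reply_text = generate_reply(transcript)      # language model stage
    reply_audio = synthesize(reply_text, voice)  # text-to-speech stage
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stages so the sketch runs without any model or audio library.
def fake_transcribe(audio: bytes) -> str:
    # Pretends the audio bytes are already UTF-8 text.
    return audio.decode("utf-8")


def fake_generate_reply(prompt: str) -> str:
    # Echoes the prompt instead of calling a language model.
    return f"You said: {prompt}"


def fake_synthesize(text: str, voice: str) -> bytes:
    # Tags the text with the voice label instead of producing real audio.
    return f"[{voice}] {text}".encode("utf-8")


turn = run_voice_turn(
    b"tell me a bedtime story",
    fake_transcribe,
    fake_generate_reply,
    fake_synthesize,
)
print(turn.reply_text)  # prints "You said: tell me a bedtime story"
```

Passing the stages in as functions keeps the sketch independent of any particular speech or language model; in a real system each stand-in would be replaced by a call to the corresponding service.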

Spotify has been announced as a launch partner for the technology. With it, podcasters can sample their voice and have their shows translated from English into Spanish, French, or German while keeping their original voice. The feature is being introduced gradually, with only a selected group of podcasters taking part at first, including Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett.

Potential Risks and Precautions

While this new voice technology opens doors to many creative and accessibility-focused applications, it also presents risks, such as misuse by malicious actors to impersonate public figures or commit fraud. OpenAI, aware of these implications, has taken a cautious approach to avoid attracting criticism.

Rollout and Accessibility

The new features will be accessible to paying Plus and Enterprise subscribers within two weeks. To activate voice features, users must open the app’s “Settings” menu, go to “New Features,” and opt in to voice conversations. They can then select their preferred voice by tapping the headphone button in the top-right corner.

Initially, the voice feature will be available as an opt-in beta on the ChatGPT Android and iOS apps, while the image feature will be available on all platforms by default.


