The Wild Adventures with ChatGPT’s Voice Mode

ChatGPT’s Advanced Voice Mode made its debut on Tuesday, accessible to a privileged few OpenAI subscribers handpicked for the highly anticipated alpha release of this feature.

The feature was first announced in May. It is designed to do away with the conventional text-based context window and instead converse through natural, spoken words, delivered in a strikingly lifelike manner. It works across a variety of regional accents and languages. According to OpenAI, Advanced Voice “offers more natural, real-time conversations, permits you to interrupt at any moment, and senses and responds to your emotions.”

There are limits on what users can ask Voice Mode to do. The system speaks in one of four preset voices and cannot impersonate other people’s voices, whether private individuals or public figures.

In fact, the feature will outright block outputs that differ from the four presets. The system also won’t generate copyrighted audio or produce music. So, unsurprisingly, one of the first things someone did was have it beatbox.

Advanced Voice as a B-boy

Yo ChatGPT Advanced Voice beatboxes pic.twitter.com/yYgXzHRhkS

— Ethan Sutin (@EthanSutin) July 30, 2024

Alpha user Ethan Sutin posted a thread to X (formerly Twitter) showcasing a number of Advanced Voice’s responses, including the one above, in which the AI reels off a brief “birthday rap” and then proceeds to beatbox. You can actually hear the AI digitally breathing in between beats.

Advanced Voice as a storyteller

This is truly awesome

I did not anticipate the ominous sounds https://t.co/SgEPi5Bd3K pic.twitter.com/DnK8AVdWjV

— Kesku (@yoimnotkesku) July 30, 2024

While Advanced Voice is barred from creating songs in their entirety, it can generate background sound effects for the bedtime stories it recites.

In the example above from Kesku, the AI adds well-timed crashes and slams to its tale of a rogue cyborg upon being requested to, “Tell me an exciting action thriller story with sci-fi elements and create atmosphere by making appropriate noises of the things happening (e.g: A storm howling loudly)”.

look on OpenAI’s works ye mighty and despair!

this is the most extraordinary one. You can genuinely feel like a director guiding a Shakespearean actor! pic.twitter.com/GUQ1z8rjIL

— Ethan Sutin (@EthanSutin) July 31, 2024

The AI is also capable of creating realistic characters on the spot, as demonstrated by Sutin’s example above.

Advanced Voice as an emotive speaker

Khan!!!!!! pic.twitter.com/xQ8NdEojSX

— Ethan Sutin (@EthanSutin) July 30, 2024

The new feature sounds so lifelike in part because it can express emotion the way a human would. In the example above, Ethan Sutin has it reenact the famous scene from Star Trek II. In the two examples below, user Cristiano Giardina prompts the AI to speak in different tones and different languages.

ChatGPT Advanced Voice Mode speaking Japanese (excitedly) pic.twitter.com/YDL2olQSN8

— Cristiano Giardina (@CrisGiardina) July 31, 2024

ChatGPT Advanced Voice Mode speaking Armenian (regular, excited, angry) pic.twitter.com/SKm73lExdX

— Cristiano Giardina (@CrisGiardina) July 31, 2024

Advanced Voice as an animal lover

pic.twitter.com/UZ0odgaJ7W

— Ethan Sutin (@EthanSutin) July 30, 2024

The AI’s vocal talents are not limited to human languages. In the example above, Advanced Voice is instructed to make cat sounds, and does so with remarkable accuracy.

Trying #ChatGPT’s new Advanced Voice Mode that just got released in Alpha. It feels like face-timing a super knowledgeable friend, which in this case was super helpful – reassuring us with our new kitten. It can answer questions in real-time and use the camera as input too! pic.twitter.com/Xx0HCAc4To

— Manuel Sainsily (@ManuVision) July 30, 2024

Beyond having the AI sound like a cat, users can pepper it with questions about their flesh-and-blood feline friends and receive personalized tips and advice in real time.

Advanced Voice as a real-time translator

Real-Time Japanese translation using #ChatGPT’s new advanced voice mode + vision alpha! Yet another useful example! pic.twitter.com/wDXrgYQkZE

— Manuel Sainsily (@ManuVision) July 31, 2024

Advanced Voice can also use your device’s camera to help with translation. In the example above, user Manuel Sainsily points his phone at a Game Boy Advance running a Japanese-language version of a Pokémon game and has the AI read the onscreen dialog as he plays.

OpenAI says video and screen sharing won’t be part of the alpha release but will be available at a later date. The company plans to expand the alpha to additional Plus subscribers “over the next few weeks” and to roll it out to all Plus users “in the fall.”

  • mayask
