ChatGPT Goes Visual: New Video and Screensharing Features Unveiled!

ChatGPT has recently enhanced its Advanced Voice Mode by incorporating video and screensharing capabilities. Advanced Voice Mode, initially introduced in May with the rollout of GPT-4o, had previously focused solely on audio interactions. Now, users can engage with ChatGPT using their phone cameras, allowing the model to “see” the same view as the user.

In a live demonstration, OpenAI’s Chief Product Officer, Kevin Weil, along with team members, showcased ChatGPT’s ability to assist in making pour-over coffee. By directing the camera towards the brewing process, the system demonstrated its understanding of coffee-making principles, guiding the team through the preparation steps. Additionally, the livestream highlighted ChatGPT’s screensharing function, which effectively recognized and responded to open messages on a smartphone.

This eagerly awaited update follows Google’s announcement of its latest model, Gemini 2.0, which also offers advanced visual and audio processing capabilities. Gemini 2.0 can execute multi-step tasks and is being applied in several separately named projects, each targeting a different application.

Notably, OpenAI’s demo emphasized the visual modality of ChatGPT, which accurately identified objects and could be interrupted mid-response. A whimsical addition was a Santa voice option in Voice Mode, featuring a cheerful tone complete with festive “ho-ho-hos.” Users can interact with this Santa feature by tapping a snowflake icon in the app, although a user advisory notes that the voice option is intended for those aged 13 and older.

As of today, the video and screensharing features are available to ChatGPT Plus and Pro subscribers, with plans for these capabilities to extend to Enterprise and Edu users in January.

This development represents a significant step forward in making AI interactions more integrated and engaging, paving the way for new applications in daily tasks and learning. The introduction of such features enhances accessibility and offers users a more immersive experience with AI technology.

In summary, the addition of video and screensharing functionalities to ChatGPT’s Advanced Voice Mode highlights the ongoing evolution of AI capabilities, making it a more valuable tool for users. As technology progresses, these advancements promise to facilitate more interactive and supportive environments in various fields, from education to everyday tasks.
