OpenAI Unveils Sora 2: New AI Video Generator Focuses on Audio and Personal Avatars

[22:20 Tue,30.September 2025 by Rudi Schmidts]

Just a year after introducing its first video generator, OpenAI has unveiled the successor, "Sora 2," in a video stream. The company describes it as the "most powerful imagination engine ever built." The innovations go well beyond higher-quality video and include, among other things, integrated audio generation and a personalized "Cameo" feature.

According to Bill Peebles, Head of Sora, the new model marks a "huge leap forward in terms of realism." The model is State-of-the-Art (SOTA) in motion, physics, and body mechanics. Complex dynamic scenes, such as an Olympic gymnastics routine or a backflip on a wakeboard, which posed a significant challenge for earlier models, can now be rendered much more robustly and naturally.

Another major advance is improved controllability. While previous iterations often required a shot-by-shot approach, Sora 2 can now generate longer, coherent stories with multiple shots in a single pass.

One of the most striking new features is integrated audio generation. Sora 2 is the company's first model to generate video and sound simultaneously (Google was the pioneer in this field a few months ago with Veo 3). The system is versatile and can produce dialogues in various languages with multiple speakers, sound effects, and even complete soundscapes.

However, perhaps the most groundbreaking new feature is "Cameo." It allows users to insert themselves or other individuals into any Sora-generated scene. To do this, a short video recording of a person (or even a pet) is analyzed. The model learns their appearance, including voice (!!), and can then integrate it into any desired scene, invoked like a command word. The term "appearance" is used because not only people but also animals or other objects can be inserted into Sora 2 scenes.

Sam Altman is often featured in the Sora 2 presentation, but always as a Cameo.

OpenAI emphasizes security and control: To create a Cameo, users must undergo an identity check that includes a live recording with a voice sample and motion analysis. Users then have full control over who can use their likeness – from "no one but me" to "mutual connections" to "everyone."

To make the new model's capabilities tangible, OpenAI is first launching a "Sora App." This app resembles a social network like TikTok, where the entire feed consists of AI-generated videos created and shared by members.

The interface resembles established social media apps, featuring a feed, profiles, and the ability to follow others. A "Remix" function allows users to immediately pick up on trends and storylines from other users and create their own variations. According to Thomas from the engineering team, the app is intended to be a "new form of communication" that focuses on playful creativity and connecting with friends, rather than passive consumption. That certainly sounds like TikTok meets BeReal. BeUnreal!

OpenAI emphasizes that security and transparency are priorities. All videos leaving the app will be marked with a Sora watermark. Additionally, provenance proofs (C2PA) and internal tracking systems are employed. Specifically for the Cameo feature, strict moderation tools have been implemented to prevent the creation of inappropriate content.

The Sora App will initially launch as an invitation-only version in the iOS App Store in the USA and Canada. Each user will receive four invitation codes to invite friends. The Sora 2 model will also be available via the web interface sora.com and soon via API for developers.
