Competition for Veo and Sora? Kling Video O1: Multimodal Model for Generating and Editing Video

[12:53 Wed, 3 December 2025 by Rudi Schmidts]

The Chinese AI company Kling AI has introduced Video O1, a new model that is intended to be the world's first unified system for video generation and editing. It combines functions that previously often required separate tools into a single architecture. For example, a user can add a main character, change the background, and adjust the visual style of the video in a single command.

Video O1 is designed to interpret various input types in parallel. It can combine up to seven different elements, such as reference images, video clips, isolated subjects, and text instructions, to generate a consistent result. For editing existing videos, this means users can make changes with commands like "Remove the passersby" or "Switch from day to night shot" without manually masking objects or setting keyframes. Changing camera perspectives also appears to work well.

The following demo video gives an impression of how this works in practice:

The system is trained to understand the uploaded elements—be they characters, objects, or entire scenes—and maintain consistency throughout the entire video length. These elements can then be referenced via tags in the prompt.
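To make the idea of tagged references more concrete, here is a minimal Python sketch of how such a request could be checked before submission. It is purely illustrative and not Kling AI's API: the `EditRequest` class, the `@name` tag syntax, and the file names are all assumptions, and the limit of seven is taken from the figure above without knowing exactly how Kling counts text instructions against it.

```python
import re
from dataclasses import dataclass, field

# Limit taken from the article; how Kling actually counts elements is an assumption.
MAX_ELEMENTS = 7


@dataclass
class EditRequest:
    """Hypothetical request: a prompt plus named reference elements."""
    prompt: str
    # Maps a tag name to an uploaded file (hypothetical names and paths).
    elements: dict[str, str] = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the request is consistent."""
        problems = []
        if len(self.elements) > MAX_ELEMENTS:
            problems.append(
                f"too many elements: {len(self.elements)} > {MAX_ELEMENTS}"
            )
        # Assumed tag syntax: "@" followed by a word, e.g. "@hero".
        used = set(re.findall(r"@(\w+)", self.prompt))
        known = set(self.elements)
        for tag in sorted(used - known):
            problems.append(f"unknown tag: @{tag}")
        for tag in sorted(known - used):
            problems.append(f"unused element: @{tag}")
        return problems


request = EditRequest(
    prompt="Place @hero in front of @beach and switch from day to night",
    elements={"hero": "hero.png", "beach": "beach.mp4"},
)
print(request.validate())
```

The point of the sketch is only the bookkeeping: every tag in the prompt should correspond to an uploaded element, and the number of elements should stay within the stated limit.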

According to Kling AI, the technology is based on a Transformer architecture and a proprietary "Multimodal Visual Language" (MVL), which serves as the interface between textual and visual signals. By using reasoning chains, the model is intended to be capable of logically deducing events, thus going beyond mere pattern recognition.

In its own tests, Kling AI compared Video O1 with two prominent competing products: the model reportedly significantly outperformed Google Veo 3.1 in generating videos from image references. In another series of tests for video transformations, it was preferred over Runway Aleph's solution in the majority of cases. However, these positive results stem exclusively from the company's internal evaluations and have not yet been confirmed or refuted by independent, external benchmarks.

The cherry-picked demos certainly look good:

Video O1 is already available via Kling AI's web platform. Almost simultaneously, Runway also introduced a powerful successor model with "Gen-4.5".

More information at app.klingai.com
