New open source video AI: Genmo Mochi 1

[16:59 Fri,25.October 2024 by Thomas Richter]

And once again, a promising new open-source video AI has been released: Mochi 1 is the name of the new SOTA (State-of-the-Art) video model from the startup Genmo, available under the free Apache 2.0 license along with weights for download, and it can also be tested online for free at www.genmo.ai/play.

According to Genmo, Mochi 1 is set to outperform current top video AIs such as Runway Gen 3, Kling, or Luma's Dream Machine in terms of motion coherence and prompt interpretation. The current version is still a preview, limited to a resolution of 480p (640 x 480 pixels) with a (very high) frame rate of 30 fps and a duration of 5.4 seconds, but an HD version is expected later in the year.

Introducing Mochi 1 preview. A new SOTA in open-source video generation. Apache 2.0.

magnet:?xt=urn:btih:441da1af7a16bcaa4f556964f8028d7113d21cbb&dn=weights&tr=udp://tracker.opentrackr.org:1337/announce pic.twitter.com/YzmLQ9g103
— Genmo (@genmoai) October 22, 2024

The new model, with 10 billion parameters in the new Asymmetric Diffusion Transformer (AsymmDiT) architecture, is particularly good at producing coherent and realistic depictions of motion, both for individual objects (like humans and animals) and for complex physical simulations, such as fluid dynamics, fur, or hair movement. Like all good video AIs, Mochi 1 takes a step toward becoming a world simulator capable of realistically depicting anything imaginable.

The 30 fps frame rate, which is very high for a video AI, also ensures smooth motion representation. Another strength is its precise prompt interpretation, meaning how well Mochi 1 can translate a text prompt into moving images, considering the subject, background, and desired actions. Here’s a comparison with leading open-source and commercial (closed-source) video AIs in terms of motion coherence and prompt interpretation:

In principle, Mochi 1 requires four of Nvidia's specialized H100 GPUs with 80 GB VRAM each, but thanks to the open-source code, modified versions created by users that require less than 20 GB VRAM appeared just hours after release.
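
To put the VRAM figures in perspective, here is a rough back-of-envelope sketch of how much memory the model weights alone would occupy at different storage precisions. The parameter count comes from the article; the precisions listed and the bytes-per-parameter values are illustrative assumptions, and the estimate deliberately ignores activations, the text encoder, and the video decoder, so real requirements will be higher.

```python
# Back-of-envelope estimate of weight memory only.
# Assumption: activations, text encoder, and video decoder are NOT included.
PARAMS = 10_000_000_000  # "10 billion parameters", as stated in the article

# Illustrative storage precisions (assumed), in bytes per parameter.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_gb(params: int, bytes_per_param: int) -> float:
    """Return the weight size in decimal gigabytes (1 GB = 10**9 bytes)."""
    return params * bytes_per_param / 1e9


estimates = {name: weight_gb(PARAMS, b) for name, b in BYTES_PER_PARAM.items()}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.0f} GB for weights alone")
```

Under these assumptions, 16-bit weights alone already come to roughly 20 GB, which suggests that setups running in under 20 GB rely on additional measures such as lower precision or offloading; the article does not say which techniques the community versions actually use.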

Mochi 1 is definitely an interesting addition to the current open-source video AIs like Pyramid Flow and CogVideoX.
