KlingAI

Kling 1.0

Released Jun 2024

Kling 1.0 is a generative video model developed by Kuaishou Technology that creates high-fidelity video content from text prompts and static images. Designed to simulate complex physical world motions, the model is capable of generating videos up to two minutes in length with a resolution of 1080p at 30 frames per second. It is part of the first wave of highly capable video models released to compete with systems like OpenAI's Sora.

The model is built on a Diffusion Transformer (DiT) architecture, which uses a transformer as the denoising backbone in place of the convolutional U-Net common in earlier diffusion models. This combines diffusion-based generation with the scaling behavior and attention mechanisms of transformers. Kuaishou pairs this backbone with a proprietary 3D Variational Autoencoder (VAE) that compresses video along the spatial and temporal dimensions at the same time, preserving visual quality while keeping training efficient on high-resolution video data.
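To give a rough sense of why joint spatial and temporal compression matters, the sketch below works through the shape arithmetic for a clip at the headline specification (two minutes, 30 frames per second, 1080p). Kuaishou has not published the compression factors of its 3D VAE, so the values used here (4x in time, 8x in each spatial dimension) are assumptions chosen only for illustration, as is the 1920x1080 reading of "1080p". The names `VideoShape` and `compress` are hypothetical helpers, not part of any Kling API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoShape:
    """Size of a video (or latent) grid: frames x height x width."""
    frames: int
    height: int
    width: int

    @property
    def positions(self) -> int:
        # Total number of grid positions across space and time.
        return self.frames * self.height * self.width


def compress(video: VideoShape, t_factor: int, s_factor: int) -> VideoShape:
    """Latent grid after downsampling time by t_factor and space by s_factor.

    Ceil division is used so a partial trailing block still gets a latent cell.
    """
    return VideoShape(
        frames=math.ceil(video.frames / t_factor),
        height=math.ceil(video.height / s_factor),
        width=math.ceil(video.width / s_factor),
    )


# Two minutes at 30 fps; 1080p assumed to mean 1920x1080.
clip = VideoShape(frames=120 * 30, height=1080, width=1920)

# ASSUMED factors, not published values for Kling's 3D VAE.
latent = compress(clip, t_factor=4, s_factor=8)

print(clip.positions)                     # pixel positions in the raw clip
print(latent)                             # latent grid shape
print(clip.positions / latent.positions)  # reduction in grid positions
```

Under these assumed factors the latent grid has 4 x 8 x 8 = 256 times fewer positions than the raw clip. This is only shape arithmetic: it does not model latent channel count, and it does not imply that the model denoises a full two-minute clip in a single pass.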

Key features of Kling 1.0 include 3D reconstruction technology for realistic character movement and human-environment interaction. Its spatiotemporal modeling module uses a full-attention mechanism across space and time to model motion that follows physical laws, which allows fast-moving objects and dramatic scene changes to be depicted consistently. The model also supports precise camera control, including panning, tilting, and rolling, as well as start-and-end frame conditioning for seamless transitions.
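The sketch below illustrates what full attention over space and time means in practice. It contrasts it with factorized attention (separate spatial and temporal passes), a common cheaper alternative that is brought in here purely for comparison; the source does not say what Kling was compared against. The grid size and all function names are hypothetical, and the counts are query-key pairs per layer, not measured costs.

```python
from typing import Tuple

Token = Tuple[int, int, int]  # (frame, row, column) in the latent grid


def full_attention_pairs(t: int, h: int, w: int) -> int:
    """Every token attends to every other token across space and time."""
    n = t * h * w
    return n * n


def factorized_attention_pairs(t: int, h: int, w: int) -> int:
    """Spatial attention within each frame plus temporal attention per position."""
    spatial = t * (h * w) ** 2   # each frame attends within itself
    temporal = (h * w) * t ** 2  # each spatial position attends across frames
    return spatial + temporal


def directly_connected_factorized(a: Token, b: Token) -> bool:
    """In one factorized layer, tokens interact only if they share a frame
    (spatial pass) or share a spatial position (temporal pass)."""
    same_frame = a[0] == b[0]
    same_position = a[1:] == b[1:]
    return same_frame or same_position


# Toy latent grid: 16 frames of 32x32 tokens (illustrative sizes only).
T, H, W = 16, 32, 32
print(full_attention_pairs(T, H, W))
print(factorized_attention_pairs(T, H, W))

# An object that jumps from the top-left corner to (5, 5) between frames:
before: Token = (0, 0, 0)
after: Token = (1, 5, 5)
print(directly_connected_factorized(before, after))
```

On this toy grid full attention evaluates roughly 16 times as many pairs as the factorized variant, but it links the `before` and `after` tokens in a single layer. Factorized attention needs at least two passes to relate them, which is one intuition for why full attention can help with fast motion.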

Rankings & Comparison