Kling 2.6 Pro is a professional-grade video generation model developed by Kuaishou Technology. Released in December 2025, it is the first model in the Kling series with native audio generation: it produces synchronized visuals and sound in a single inference pass. Dialogue, environmental sound effects, and ambient noise are generated already temporally aligned with the video, so no post-production synchronization is needed.
Built on a diffusion-based Transformer architecture, Kling 2.6 Pro uses a proprietary 3D variational autoencoder (VAE) for spatiotemporal compression. The model supports text-to-video and image-to-video workflows at 1080p resolution with frame rates up to 48 FPS. It introduces the "Elements" feature, which keeps characters consistent across multiple shots, and adds motion control tools for directing camera and subject movement.
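To give a sense of why spatiotemporal compression matters at this resolution and frame rate, the following is a minimal sketch of the tensor-size arithmetic. Only the 1080p resolution and 48 FPS ceiling come from the description above. The 16:9 frame width, the 5-second clip length, and the compression factors (4x temporal, 8x spatial, 16 latent channels) are illustrative assumptions typical of published 3D VAEs, not disclosed figures for Kling 2.6 Pro. The names `VideoShape`, `raw_clip_shape`, and `latent_shape` are hypothetical helpers.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class VideoShape:
    """Shape of a video tensor as (frames, height, width, channels)."""
    frames: int
    height: int
    width: int
    channels: int

    def numel(self) -> int:
        # Total number of scalar values in the tensor.
        return self.frames * self.height * self.width * self.channels


def raw_clip_shape(seconds: float, fps: int = 48,
                   height: int = 1080, width: int = 1920) -> VideoShape:
    # 1080p and 48 FPS are taken from the text; 1920 assumes a 16:9 frame.
    return VideoShape(round(seconds * fps), height, width, channels=3)


def latent_shape(clip: VideoShape, t_factor: int = 4, s_factor: int = 8,
                 latent_channels: int = 16) -> VideoShape:
    # ASSUMPTION: the factors are placeholders, not Kling's actual values.
    return VideoShape(
        frames=ceil(clip.frames / t_factor),
        height=ceil(clip.height / s_factor),
        width=ceil(clip.width / s_factor),
        channels=latent_channels,
    )


clip = raw_clip_shape(seconds=5)   # hypothetical 5-second clip
latent = latent_shape(clip)
print(clip)
print(latent)
print(f"compression ratio: {clip.numel() / latent.numel():.1f}x")
```

Under these assumed factors, a 5-second clip of 240 RGB frames maps to a 60 x 135 x 240 latent grid with 16 channels, about 48 times fewer values for the diffusion Transformer to process. The actual ratio depends on the undisclosed VAE design.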
Key technical improvements in this version include enhanced physical realism for cloth and hair simulation, as well as more accurate object interactions. Its Omni One architecture is designed to improve temporal coherence and grounded motion, so that objects respect physics-based constraints such as gravity and collision, and to reduce the visual artifacts common in AI-generated video.