JT-35B-Flash is a large language model developed by China Mobile as part of its Jiutian (JT) series. Designed for high efficiency and large-scale deployment, it belongs to the "Flash" tier of models optimized for rapid inference and lower latency. The model features 35 billion parameters and is positioned as a benchmark for independent AI development by Chinese central state-owned enterprises (SOEs).
The model supports a 256,000-token context window, which allows it to process long documents and extended conversations in a single pass. Evaluations have highlighted its performance in complex reasoning, with reports indicating an 82.9% score on the GPQA benchmark, a set of graduate-level questions in biology, physics, and chemistry. The model is also oriented toward coding and data analysis tasks.
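To make the context-window figure concrete, the following is a minimal sketch of how a caller might check whether a document fits within 256,000 tokens and split it if it does not. It rests on assumptions not found in the source: the ratio of roughly four characters per token is a common heuristic for English text rather than the behavior of the model's actual tokenizer (Chinese text typically yields a different ratio), and the function names and the output reservation of 4,096 tokens are hypothetical.

```python
import math

# Context window size as stated in the article.
CONTEXT_WINDOW_TOKENS = 256_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count of `text`.

    ASSUMPTION: `chars_per_token` is a generic heuristic, not the
    model's real tokenizer. Use the real tokenizer when available.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    text: str,
    reserved_for_output: int = 4_096,  # hypothetical reply budget
    chars_per_token: float = 4.0,
) -> bool:
    """Return True if the estimated prompt plus reply budget fits."""
    used = estimate_tokens(text, chars_per_token) + reserved_for_output
    return used <= CONTEXT_WINDOW_TOKENS


def split_to_fit(
    text: str,
    reserved_for_output: int = 4_096,
    chars_per_token: float = 4.0,
) -> list[str]:
    """Split `text` into consecutive chunks that each fit the window."""
    budget_tokens = CONTEXT_WINDOW_TOKENS - reserved_for_output
    if budget_tokens <= 0:
        raise ValueError("reserved_for_output leaves no room for input")
    max_chars = int(budget_tokens * chars_per_token)
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Splitting on raw character offsets keeps the sketch short; a practical version would likely break on paragraph or sentence boundaries so that no chunk ends mid-thought.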
Architecture and Ecosystem
The model is integrated into China Mobile's "AI Capability United Fleet," a collaborative ecosystem designed to pair domestic software with domestic hardware. It has been extensively adapted for Chinese-made GPU architectures, including those from Biren Technology and Moore Threads. These optimizations are intended to let the model run efficiently on domestic infrastructure, with high throughput and stable performance in industry-specific scenarios.
Although it is used mainly in enterprise and industrial applications in China, the model has gained attention on global benchmarks for its ratio of intelligence to price. As a general-purpose large model, it is aimed at scenarios that need to balance reasoning depth against operating cost.