ERNIE 4.5 300B A47B is a flagship large language model developed by Baidu as part of the ERNIE (Enhanced Representation through Knowledge Integration) 4.5 series. It is built on a Mixture-of-Experts (MoE) architecture, with 300 billion total parameters of which 47 billion are active per token. The model is designed for text understanding, reasoning, and generation in both Chinese and English.
Architecture and Design
The model uses a fine-grained MoE backbone with 64 experts, of which 8 are activated per token. It consists of 54 layers and employs Grouped Query Attention (GQA) to improve inference efficiency. The architecture supports an extended context window of 131,072 tokens (128K) and was trained with a heterogeneous hybrid parallelism strategy on Baidu's PaddlePaddle framework. It is further optimized for low-latency deployment through support for 4-bit and 2-bit quantization.
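The per-token expert selection described above can be sketched as a top-k routing step: a gating network scores all 64 experts and only the top 8 process the token. The sketch below is illustrative only; the softmax gate, toy hidden size, and function names are assumptions and do not reflect ERNIE's actual router implementation.

```python
# Minimal sketch of fine-grained MoE top-k routing: 64 experts, 8 activated
# per token (matching the counts cited above). The gating mechanism shown
# here (softmax over gate logits, renormalized over the selected experts)
# is a common pattern, assumed for illustration, not ERNIE's exact design.
import numpy as np

NUM_EXPERTS = 64   # total routed experts
TOP_K = 8          # experts activated per token
HIDDEN = 32        # toy hidden size for the sketch

def route(token_hidden: np.ndarray, gate_weights: np.ndarray):
    """Return indices and normalized mixing weights of the top-k experts."""
    logits = token_hidden @ gate_weights              # (NUM_EXPERTS,) gate scores
    top = np.argsort(logits)[-TOP_K:]                 # indices of the top-8 experts
    probs = np.exp(logits[top] - logits[top].max())   # stable softmax over selected
    probs /= probs.sum()                              # mixing weights sum to 1
    return top, probs

rng = np.random.default_rng(0)
gate = rng.normal(size=(HIDDEN, NUM_EXPERTS))
token = rng.normal(size=HIDDEN)
experts, weights = route(token, gate)
print(len(experts))   # 8 experts selected for this token
```

Because only 8 of 64 experts run per token, the compute cost per token tracks the 47B active parameters rather than the 300B total, which is the key efficiency property of the MoE design.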
Capabilities and Release
ERNIE 4.5 300B A47B is engineered for complex tasks including mathematical reasoning, code generation, and instruction following. According to technical evaluations, the model demonstrates competitive performance against other leading open- and closed-source models on benchmarks such as IFEval, SimpleQA, and ChineseSimpleQA. It was released under the Apache License 2.0, providing open access for research and commercial use.