Command A+ by Cohere: LLM Benchmarks, Rankings & Specs

Command A+ is a large language model developed by Cohere, released in May 2026 as the flagship of the Command A family. It is a sparse Mixture-of-Experts (MoE) model with 218 billion total parameters, of which 25 billion are active per token. The model is designed for enterprise applications, targeting complex agentic workflows, long-context reasoning, and multimodal document processing, and it is released under the permissive Apache 2.0 license.

Architecture and Performance

The model is a decoder-only transformer with 128 routed experts. A "dropless" token-choice router selects 8 active experts per token, and a single shared expert is applied to every token. Its attention layers interleave sliding-window and global attention in a 3:1 ratio, a layout first used in this model family by the base Command A model. The design is optimized for hardware efficiency: the model can be deployed on as few as one NVIDIA B200 or two H100 GPUs while maintaining high throughput and low latency.
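The routing and layer layout described above can be illustrated with a minimal pure-Python sketch. It is not Cohere's implementation: the softmax renormalization over the selected experts, the scalar toy experts, and the position of the global layer within each group of four are assumptions made for illustration, and the names `route_token`, `moe_layer`, and `attention_pattern` are hypothetical.

```python
import math
import random

NUM_EXPERTS = 128  # routed experts (from the text)
TOP_K = 8          # experts selected per token (from the text)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, k=TOP_K):
    """Token-choice routing: the token picks its k highest-scoring experts.

    Returns (expert_index, weight) pairs. Renormalizing the weights over the
    selected experts is an assumption, not a documented detail.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(x, router_logits, experts, shared_expert):
    """Output for one token: shared expert plus weighted routed experts.

    "Dropless" here means no capacity limit: every selected expert always
    processes the token.
    """
    out = shared_expert(x)
    for idx, weight in route_token(router_logits):
        out += weight * experts[idx](x)
    return out


def attention_pattern(num_layers, ratio=3):
    """Layer types with `ratio` sliding-window layers per global layer.

    Placing the global layer last in each group is an assumption.
    """
    return ["global" if (i + 1) % (ratio + 1) == 0 else "sliding"
            for i in range(num_layers)]


# Toy demo: scalar "experts" that scale their input.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
experts = [lambda x, s=(i + 1) / NUM_EXPERTS: s * x for i in range(NUM_EXPERTS)]
print(route_token(logits))
print(moe_layer(1.0, logits, experts, lambda x: 0.5 * x))
print(attention_pattern(8))
```

Because only 8 routed experts and the shared expert run for each token, the compute per token tracks the active parameter count rather than the total.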

Capabilities and Tool Use

Command A+ is natively multimodal, accepting both text and vision inputs for tasks such as image-based reasoning and document analysis. It supports 48 languages, including all official European Union languages, and is built for tool use and agentic reasoning: the model can call external APIs, databases, and search engines to solve multi-step problems. It also has native citation support, embedding special tags in its output that link factual claims directly to source documents, which makes it well suited to retrieval-augmented generation (RAG) environments.
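As an illustration of what a tool description for such a workflow might look like, the sketch below defines one tool with a JSON Schema for its parameters and checks a proposed call against it. The tool name `search_documents`, its fields, and the `name`/`description`/`parameters` wrapper are hypothetical; the exact request format expected by any particular API is not specified in this document. The checker covers only a small subset of JSON Schema (required keys, unknown keys, primitive types).

```python
import json

# Hypothetical tool definition; the parameters follow JSON Schema conventions.
SEARCH_TOOL = {
    "name": "search_documents",
    "description": "Search an internal document index and return matching passages.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Free-text search query."},
            "top_k": {"type": "integer", "description": "Number of passages to return."},
        },
        "required": ["query"],
    },
}

_JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_arguments(tool, arguments):
    """Return a list of problems with a proposed tool call (empty if none)."""
    schema = tool["parameters"]
    errors = []
    for key in schema.get("required", []):
        if key not in arguments:
            errors.append(f"missing required argument: {key}")
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unknown argument: {key}")
            continue
        expected = _JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_stray_bool = isinstance(value, bool) and spec["type"] != "boolean"
        if is_stray_bool or not isinstance(value, expected):
            errors.append(
                f"{key}: expected {spec['type']}, got {type(value).__name__}"
            )
    return errors


# The definition is plain JSON, so it can be serialized into a request body.
print(json.dumps(SEARCH_TOOL, indent=2))
print(check_arguments(SEARCH_TOOL, {"query": "quarterly revenue", "top_k": 3}))
```

Validating the model's proposed arguments before executing a call is a common safeguard in agent loops, since a malformed call can then be returned to the model as an error instead of reaching the external system.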

Implementation Guidance

For best results in agentic tasks, users are encouraged to provide tool descriptions as standard JSON schemas. The model's 128,000-token context window accommodates long enterprise documents and datasets in a single prompt. Developers can use the official weights in several quantization formats, including BF16, FP8, and W4A4, to suit different hardware constraints without significant quality loss.
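A rough back-of-the-envelope calculation shows how the quantization format relates to hardware requirements. The sketch below estimates weight storage only, from the 218 billion total parameters stated earlier; it ignores the KV cache, activations, and quantization scale overhead, so real requirements are higher. The per-GPU memory figures (80 GB for an H100, 192 GB for a B200) are assumptions taken from commonly quoted specifications, and the helper names are hypothetical.

```python
TOTAL_PARAMS = 218e9   # total parameters (from the text)
ACTIVE_PARAMS = 25e9   # active parameters per token (from the text)

# Bits used to store each weight in the formats named in the text.
BITS_PER_WEIGHT = {"BF16": 16, "FP8": 8, "W4A4": 4}

# Assumed per-GPU memory in GB; check the datasheet for a specific SKU.
GPU_MEMORY_GB = {"H100": 80, "B200": 192}


def weight_gigabytes(params, bits):
    """Storage for `params` weights at `bits` bits each, in decimal GB."""
    return params * bits / 8 / 1e9


def weights_fit(fmt, gpu, count):
    """True if the weights alone fit in the combined memory of `count` GPUs."""
    needed = weight_gigabytes(TOTAL_PARAMS, BITS_PER_WEIGHT[fmt])
    return needed <= GPU_MEMORY_GB[gpu] * count


for fmt, bits in BITS_PER_WEIGHT.items():
    size = weight_gigabytes(TOTAL_PARAMS, bits)
    print(f"{fmt}: {size:.0f} GB of weights, "
          f"fits 1x B200: {weights_fit(fmt, 'B200', 1)}, "
          f"fits 2x H100: {weights_fit(fmt, 'H100', 2)}")

print(f"active fraction per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
```

Under these assumptions the weights occupy roughly 436 GB in BF16, 218 GB in FP8, and 109 GB at 4 bits, so only the 4-bit format fits the smallest deployments mentioned above, while the higher-precision formats need more GPUs.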
