IBM
Open Weights

Granite 4.0 350M

Released Oct 2025

Intelligence: #462
Coding: #375
Math: #266
Context: 33K
Parameters: 350M

Granite 4.0 350M is a compact, open-weight language model developed by IBM as part of the Granite 4.0 "Nano" family. Designed for edge computing, on-device deployment, and other resource-constrained environments, it aims to balance low latency with enterprise-grade accuracy. It is optimized for tasks such as Retrieval-Augmented Generation (RAG), text summarization, and structured data extraction.

Architecture and Training

This model is built on a decoder-only dense transformer architecture, distinguishing it from the hybrid Mamba-2 variants in the same family. It was trained on 15 trillion tokens of diverse data, including a high concentration of code and mathematical content to improve reasoning capabilities. The model supports a 128K token context window, allowing it to process extensive documents locally without requiring high-end server hardware.

Capabilities

Granite 4.0 350M supports 12 languages, including English, German, Spanish, French, Japanese, and Chinese. Its instruction-following and tool-calling performance is significantly improved over previous Granite generations, making it suitable for agentic workflows and function-calling tasks. It also supports Fill-in-the-Middle (FIM) code completion, which allows integration into developer productivity tools.
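As a rough illustration of Fill-in-the-Middle completion, the sketch below assembles a FIM prompt from the code before and after the gap. The sentinel strings (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) and the helper name `build_fim_prompt` are assumptions chosen for illustration; the exact special tokens should be taken from the model's tokenizer configuration.

```python
def build_fim_prompt(
    prefix: str,
    suffix: str,
    prefix_token: str = "<fim_prefix>",   # assumed sentinel, verify against the tokenizer
    suffix_token: str = "<fim_suffix>",   # assumed sentinel, verify against the tokenizer
    middle_token: str = "<fim_middle>",   # assumed sentinel, verify against the tokenizer
) -> str:
    """Arrange prefix and suffix so the model generates the missing middle.

    The model is expected to continue after the final sentinel with the
    code that belongs between `prefix` and `suffix`.
    """
    return f"{prefix_token}{prefix}{suffix_token}{suffix}{middle_token}"


# Hypothetical editor state: the cursor sits inside the function body.
before_cursor = "def add(a, b):\n    "
after_cursor = "\n\nprint(add(1, 2))\n"
prompt = build_fim_prompt(before_cursor, after_cursor)
```

The resulting string would be passed to the model as a plain completion prompt, and the generated text inserted at the cursor position.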

Released under the Apache 2.0 license, the model is among the first to receive ISO 42001 certification, adhering to international standards for security, governance, and transparency. Its efficiency allows it to run in environments with as little as 8 GB of RAM, enabling private and offline AI applications on consumer-grade hardware.
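To put the memory figure in context, a back-of-envelope estimate of weight storage can be sketched as follows. It assumes the nominal 350M parameter count and counts weights only; the KV cache, activations, and runtime overhead are not included, so actual usage will be higher. The function name and the precision table are illustrative.

```python
def weight_memory_gib(n_params: int, bytes_per_param: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


N_PARAMS = 350_000_000  # nominal parameter count from the model name

# Common storage precisions and their size per parameter in bytes.
PRECISIONS = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

for name, nbytes in PRECISIONS.items():
    print(f"{name}: {weight_memory_gib(N_PARAMS, nbytes):.2f} GiB")
```

Even at full 32-bit precision the weights occupy well under 2 GiB, which is consistent with the model fitting on consumer-grade hardware.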

Rankings & Comparison