IBM
Open Weights

Granite 4.0 H 1B

Released Oct 2025

Intelligence: #434
Coding: #352
Math: #243
Context: 128K
Parameters: 1B

Granite 4.0 H 1B is a lightweight language model developed by IBM as part of the Granite 4.0 "Nano" family. It is designed for high efficiency on edge devices, mobile hardware, and local environments where low latency and a small memory footprint are critical. The model belongs to a series that emphasizes enterprise-ready capabilities in a compact form factor, suitable for privacy-preserving and offline applications.

The model features a hybrid architecture that combines the linear-scaling efficiency of Mamba-2 state-space modeling with the selective precision of traditional Transformer attention blocks. Specifically, it uses a dense decoder-only structure incorporating Grouped Query Attention (GQA), RMSNorm, and SwiGLU activations. This hybrid approach is engineered to significantly reduce memory usage compared to conventional transformer-only models while maintaining strong performance on local reasoning tasks.
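The memory argument behind the hybrid design can be sketched with simple arithmetic: attention layers need a KV cache that grows linearly with sequence length, while a Mamba-2 layer keeps a fixed-size state. The layer counts and dimensions below are illustrative assumptions, not Granite 4.0 H 1B's actual configuration.

```python
# Illustrative comparison of inference-time cache memory for a pure-attention
# stack vs. a hybrid stack where most layers are Mamba-2 (fixed-size state).
# All dimensions are hypothetical, not the model's real hyperparameters.

def attention_kv_bytes(seq_len, n_layers, n_kv_heads, head_dim, dtype_bytes=2):
    """KV cache: keys + values, grows linearly with sequence length."""
    return 2 * seq_len * n_layers * n_kv_heads * head_dim * dtype_bytes

def ssm_state_bytes(n_layers, d_inner, d_state, dtype_bytes=2):
    """Mamba-2 recurrent state: constant regardless of sequence length."""
    return n_layers * d_inner * d_state * dtype_bytes

seq_len = 128_000  # the model's 128K context window
pure = attention_kv_bytes(seq_len, n_layers=24, n_kv_heads=4, head_dim=64)
hybrid = (attention_kv_bytes(seq_len, n_layers=4, n_kv_heads=4, head_dim=64)
          + ssm_state_bytes(n_layers=20, d_inner=2048, d_state=128))

print(f"pure attention cache: {pure / 1e6:.0f} MB")   # ~3146 MB
print(f"hybrid cache + state: {hybrid / 1e6:.0f} MB") # ~535 MB
```

Even with made-up dimensions, the shape of the result holds: at long contexts, the few attention layers dominate the cache, and replacing most of them with state-space layers cuts cache memory roughly in proportion.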

Despite its 1-billion-parameter size, the model supports advanced enterprise workflows including tool calling, function calling, and Retrieval-Augmented Generation (RAG). It is trained on approximately 15 trillion tokens and natively supports 12 languages, including English, German, Spanish, French, Japanese, and Chinese. Additionally, the model includes Fill-In-the-Middle (FIM) capabilities for code completion and is released under the permissive Apache 2.0 license.
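As a minimal sketch of the RAG workflow such a model supports: retrieve the most relevant documents, then pack them into a grounded prompt for the model. The scoring function and prompt template here are toy assumptions for illustration, not IBM's actual retrieval stack or prompt format.

```python
# Toy retrieval-augmented generation (RAG) pipeline. The relevance scoring
# and prompt template are illustrative assumptions only.

def score(query: str, doc: str) -> int:
    """Toy relevance score: count of shared word tokens (punctuation stripped)."""
    tok = lambda s: {w.strip(".,?!") for w in s.lower().split()}
    return len(tok(query) & tok(doc))

def build_rag_prompt(query: str, corpus: list[str], k: int = 2) -> str:
    """Retrieve the top-k documents and pack them into a grounded prompt."""
    top = sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]
    context = "\n".join(f"- {d}" for d in top)
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {query}")

corpus = [
    "Granite 4.0 models are released under the Apache 2.0 license.",
    "The Granite 4.0 Nano family targets edge and mobile devices.",
    "Bananas are a good source of potassium.",
]
print(build_rag_prompt("Which license covers the Granite 4.0 Nano models?", corpus))
```

In a real deployment the lexical scorer would be replaced by an embedding-based retriever, and the built prompt would be passed to the model's chat endpoint.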

Rankings & Comparison