gpt-oss-120b is a large-scale, open-weight language model developed by OpenAI and released as part of the GPT-OSS series. It uses a Mixture-of-Experts (MoE) architecture with approximately 117 billion total parameters, of which about 5.1 billion are active per token. The model is designed for reasoning-heavy, agentic tasks and production use cases, offering performance comparable to proprietary reasoning models while being available under the permissive Apache 2.0 license.
Architecture and Design
The model consists of 36 layers with 128 experts per layer, employing a top-4 routing mechanism. It features alternating dense and locally banded sparse attention patterns and uses grouped-query attention (GQA) with a group size of 8 for inference efficiency. It supports a native context length of 128,000 tokens and uses the o200k_harmony tokenizer, which is optimized for STEM, coding, and multilingual data.
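The top-4-of-128 routing described above can be sketched as follows. This is a minimal NumPy illustration of top-k expert selection, not the production kernel; in particular, whether the mixing weights are normalized over only the selected experts (as here) or over all 128 router logits is an assumption of this sketch.

```python
import numpy as np

def topk_route(hidden, router_w, k=4):
    """Pick k experts per token and compute their mixing weights.

    hidden:   (tokens, d_model) token activations
    router_w: (d_model, n_experts) router projection
    Returns (indices, weights) of shape (tokens, k) each.
    """
    logits = hidden @ router_w                        # (tokens, n_experts)
    idx = np.argsort(logits, axis=-1)[:, -k:]         # top-k expert ids per token
    top = np.take_along_axis(logits, idx, axis=-1)    # their logits
    top = top - top.max(axis=-1, keepdims=True)       # numerically stable softmax
    w = np.exp(top)
    w /= w.sum(axis=-1, keepdims=True)                # weights over the k chosen experts
    return idx, w

rng = np.random.default_rng(0)
# Toy dimensions: 3 tokens, d_model=16, 128 experts as in gpt-oss-120b.
idx, w = topk_route(rng.standard_normal((3, 16)), rng.standard_normal((16, 128)))
print(idx.shape, w.shape)  # (3, 4) (3, 4)
```

Each token's output is then the weighted sum of the outputs of its 4 selected experts, which is why only ~5.1B of the 117B parameters are active per token.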
Key Capabilities
As a reasoning-focused model, gpt-oss-120b supports a configurable reasoning effort setting (low, medium, or high), allowing users to scale the depth of its internal chain-of-thought (CoT). This reasoning capability enables the model to excel in competition-level coding, advanced mathematics, and complex multi-step tool use. The model is post-trained using reinforcement learning techniques similar to those used in OpenAI's frontier reasoning systems, such as the o-series.
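In the Harmony format, the reasoning effort is conveyed as a line in the system message. The sketch below shows one simplified way such a message might be assembled; the exact header fields and surrounding structure are assumptions here, not a reproduction of the official spec (the `<|start|>` / `<|message|>` / `<|end|>` delimiters do appear in Harmony, but real renderers handle far more detail).

```python
VALID_EFFORTS = ("low", "medium", "high")

def build_system_message(reasoning_effort="medium"):
    """Render a simplified Harmony-style system message with a reasoning level."""
    if reasoning_effort not in VALID_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {VALID_EFFORTS}")
    return (
        "<|start|>system<|message|>"
        "You are a helpful assistant.\n"
        f"Reasoning: {reasoning_effort}"
        "<|end|>"
    )

msg = build_system_message("high")
print(msg)
```

Raising the effort level lengthens the model's internal chain-of-thought, trading latency and token cost for deeper reasoning.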
Efficiency and Deployment
To facilitate deployment on consumer-grade and enterprise hardware, the model was released with native MXFP4 quantization. This optimization allows the 117B-parameter model to fit within the 80 GB of memory provided by a single NVIDIA H100 or AMD MI300X GPU. It is compatible with standard inference libraries and follows the Harmony prompt format for structured interactions and tool calling.