NVIDIA llama2-70b-steerlm-chat is a 70-billion-parameter generative language model built on the Llama 2 architecture and fine-tuned using NVIDIA's SteerLM technique. It serves as a steerable alternative to models aligned via Reinforcement Learning from Human Feedback (RLHF), providing more flexible control over model behavior at inference time.
The model is characterized by its use of attribute-conditioned supervised fine-tuning. This method enables users to define specific response characteristics—such as helpfulness, correctness, coherence, and complexity—dynamically at inference time. By specifying desired values for these attributes in the prompt, the model can adjust its output style and quality without requiring separate fine-tuned versions for different personas or use cases.
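As a rough illustration, attribute-conditioned prompting can be sketched as building a prompt that carries the desired attribute values alongside the user message. The `build_steerlm_prompt` helper and the exact prompt template below are hypothetical (the real NeMo inference format differs), but the attribute names and the 0-4 value scale follow the HelpSteer annotation schema:

```python
# Hypothetical sketch of attribute-conditioned prompting for a SteerLM model.
# The prompt template here is an assumption, not the official NeMo format;
# attribute names and the 0-4 scale follow the HelpSteer dataset.

def build_steerlm_prompt(user_message: str, attributes: dict) -> str:
    """Embed desired attribute values (0-4) into the prompt string."""
    attr_str = ",".join(f"{name}:{value}" for name, value in attributes.items())
    return f"User: {user_message}\nAttributes: {attr_str}\nAssistant:"

# Request a maximally helpful, correct, and coherent answer that stays
# moderately complex and terse.
prompt = build_steerlm_prompt(
    "Explain model steering in one paragraph.",
    {"helpfulness": 4, "correctness": 4, "coherence": 4,
     "complexity": 2, "verbosity": 1},
)
```

Changing only the attribute values in this string (for example, raising `verbosity`) would steer the same checkpoint toward a different response style, which is the behavior the attribute-conditioned fine-tuning is meant to enable.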
Developed within the NVIDIA NeMo framework, llama2-70b-steerlm-chat was trained on the HelpSteer dataset, which provides multi-attribute helpfulness annotations. At its release, the model achieved a score of 7.54 on the MT-bench leaderboard, positioning it as a competitive open-weights model for commercial applications. It maintains the standard Llama 2 context window of 4096 tokens.