OpenChat

openchat-3.5-0106

Open Weights · Released Jan 2024

Arena AI rank: #223
Context: 8K tokens
Parameters: 7B

OpenChat 3.5-0106 is an open-source language model based on the Mistral-7B-v0.1 architecture. Released in January 2024, it is an iterative update to the OpenChat 3.5 series, designed to improve the model's capabilities in coding, mathematical reasoning, and general-purpose conversation. It remains one of the most prominent examples of a high-performing small language model (SLM) that leverages efficient fine-tuning techniques.

The model is distinguished by its use of Conditioned Reinforcement Learning Fine-Tuning (C-RLFT). Inspired by offline reinforcement learning, C-RLFT allows the model to learn from diverse, mixed-quality datasets without requiring explicit preference labels. By assigning different condition tokens to data of varying quality, the model can be guided at inference time to follow the patterns of its high-quality instruction data.
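The core mechanism can be sketched as a class-conditioned, weighted fine-tuning objective: each sample carries a condition token identifying its source quality, and higher-quality sources receive larger loss weights. The snippet below is a minimal illustration of that idea; the weight values, function names, and exact token format are assumptions, not the authors' implementation.

```python
# Hypothetical quality weights per condition token (illustrative values).
# In C-RLFT, expert data is weighted more heavily than sub-optimal data.
CONDITION_WEIGHTS = {
    "GPT4 Correct": 1.0,   # expert (high-quality) source
    "GPT3 Correct": 0.1,   # sub-optimal (mixed-quality) source
}

def condition_turn(role: str, message: str, condition: str) -> str:
    """Prefix a conversation turn with its condition token (assumed format)."""
    return f"{condition} {role}: {message}<|end_of_turn|>"

def weighted_sft_loss(samples):
    """Class-conditioned weighted fine-tuning objective (sketch).

    `samples` is a list of (condition, per_token_losses) pairs; the
    condition token selects each sample's weight, so the objective is
    dominated by high-quality data without any preference labels.
    """
    total, norm = 0.0, 0.0
    for condition, token_losses in samples:
        w = CONDITION_WEIGHTS[condition]
        total += w * sum(token_losses) / len(token_losses)
        norm += w
    return total / norm

# Toy example: the expert sample dominates the weighted objective.
loss = weighted_sft_loss([
    ("GPT4 Correct", [0.5, 0.3]),   # mean 0.4, weight 1.0
    ("GPT3 Correct", [2.0, 1.0]),   # mean 1.5, weight 0.1
])
```

At inference, prompting with the high-quality condition token steers generation toward the expert-data behavior the model learned to associate with it.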

OpenChat 3.5-0106 exposes distinct operational modes through its conversation templates, including a default generalist mode and a Mathematical Reasoning Mode. Despite its 7-billion-parameter size, the model has demonstrated competitive performance against significantly larger models on benchmarks such as MT-Bench and HumanEval. It supports a standard context window of 8,192 tokens.

Rankings & Comparison