UC Berkeley
Open Weights

starling-lm-7b-alpha

Released Nov 2023

Arena AI rank: #234
Context: 8K
Parameters: 7B

Starling-LM-7B-alpha is an open-source large language model developed by the Berkeley NEST team at UC Berkeley. It is a 7-billion-parameter model fine-tuned with Reinforcement Learning from AI Feedback (RLAIF) to improve its helpfulness and safety in conversational settings. The model is built upon OpenChat 3.5, which is itself a derivative of the Mistral-7B architecture.
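Because Starling inherits OpenChat 3.5's chat format, prompts must be assembled in that template rather than as raw text. The sketch below assumes the "GPT4 Correct User"/"GPT4 Correct Assistant" turn markers documented for OpenChat 3.5; verify against the model card before relying on it.

```python
def build_prompt(user_message, history=()):
    """Assemble a prompt in the OpenChat 3.5 chat template that
    Starling-LM-7B-alpha inherits (turn markers assumed from
    OpenChat's documentation, not verified here)."""
    parts = []
    for user_turn, assistant_turn in history:
        parts.append(f"GPT4 Correct User: {user_turn}<|end_of_turn|>")
        parts.append(f"GPT4 Correct Assistant: {assistant_turn}<|end_of_turn|>")
    # The final assistant marker is left open so the model continues from it.
    parts.append(f"GPT4 Correct User: {user_message}<|end_of_turn|>")
    parts.append("GPT4 Correct Assistant:")
    return "".join(parts)
```

In practice, Hugging Face's `tokenizer.apply_chat_template` can produce this string from the model's bundled template, which avoids hand-maintaining the markers.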

Training and Methodology

The model's training process centers on a novel reward-training and policy-tuning pipeline. It uses Starling-RM-7B-alpha, a reward model trained on high-quality GPT-4 ranking data to act as a proxy for human preferences. The final policy optimization was performed with Advantage-Induced Policy Alignment (APA), a method designed to align the model's outputs more effectively than standard supervised fine-tuning alone.
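The core idea of APA can be sketched as a squared-error regression: push the log-probability ratio between the tuned policy and the initial policy toward the (advantage-scaled) reward signal. This is a simplified, hypothetical form for illustration, not the paper's exact objective; the function name and the scalar `lam` regularizer are assumptions.

```python
def apa_loss(logp_policy, logp_init, advantages, lam=1.0):
    """Simplified APA-style objective (sketch, not the exact paper loss):
    for each sampled action, regress log(pi/pi_init) onto advantage/lam.
    Minimizing this pulls the policy toward high-advantage actions while
    the squared form keeps it anchored near the initial policy."""
    losses = [
        (lp - li - a / lam) ** 2
        for lp, li, a in zip(logp_policy, logp_init, advantages)
    ]
    return sum(losses) / len(losses)
```

Note the contrast with plain supervised fine-tuning: the loss is zero not when the policy imitates a reference answer, but when its log-ratio to the frozen initial policy matches the scaled advantage estimate.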

Dataset and Performance

Starling-LM-7B-alpha was trained on the Nectar dataset, which consists of 183,000 chat prompts with 3.8 million pairwise comparisons labeled by GPT-4. At release, the model showed strong performance for its size, scoring 8.09 on MT-Bench, which let it rival much larger models on conversational benchmarks covering reasoning, coding, and creative writing.
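The 3.8 million figure is consistent with K-wise rankings expanded into pairs: if each prompt carries seven GPT-4-ranked responses (an assumption here, though it matches the reported totals), each ranking yields C(7, 2) = 21 ordered preference pairs.

```python
from math import comb

def nectar_pairwise_count(num_prompts=183_000, responses_per_prompt=7):
    """Pairwise comparisons implied by K-wise rankings: each prompt's K
    ranked responses expand into C(K, 2) preference pairs.
    responses_per_prompt=7 is an assumption consistent with the ~3.8M
    pairs reported for Nectar."""
    return num_prompts * comb(responses_per_prompt, 2)

# 183,000 prompts x C(7, 2) = 183,000 x 21 = 3,843,000, i.e. ~3.8 million
```

This expansion is why K-wise labeling is data-efficient for reward modeling: one GPT-4 ranking pass over seven responses produces 21 training pairs.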

Rankings & Comparison