LMSYS
Open Weights

vicuna-7b

Released Mar 2023

Arena AI
#258
Parameters: 7B

Vicuna-7B is an open-source conversational large language model developed by LMSYS Org, a research organization with collaborators from UC Berkeley, CMU, Stanford, and UC San Diego. It was created by fine-tuning the LLaMA base model (and later Llama 2 in version 1.5) on approximately 70,000 user-shared conversations collected from ShareGPT. The model was designed to demonstrate that a relatively small, efficiently tuned open-source model could achieve conversational quality comparable to proprietary systems.

The training process for Vicuna-7B used supervised fine-tuning (SFT) and was optimized with techniques such as DeepSpeed ZeRO to manage memory requirements. These optimizations allowed the team to train the model on a multi-turn conversation dataset, improving its ability to handle context and follow complex instructions. Upon its release, Vicuna was noted for achieving approximately 90% of the response quality of ChatGPT in initial qualitative assessments.

Vicuna-7B also served as the foundation for new evaluation methodologies in the AI community. The developers introduced a framework using GPT-4 as a judge to automatically score chatbot responses, a precursor to the widely used Chatbot Arena leaderboard. This approach helped standardize the evaluation of instruction-following models by providing a scalable alternative to manual human review.
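The multi-turn SFT setup described above can be illustrated with a minimal sketch of how a ShareGPT-style conversation is flattened into a single training string. The system prompt and separators below approximate the Vicuna v1.1 conversation template; treat the exact wording and the `format_conversation` helper as illustrative assumptions rather than the project's actual code.

```python
# Sketch: turning a ShareGPT-style multi-turn conversation into one
# supervised fine-tuning (SFT) training string. The system prompt and
# separators approximate the Vicuna v1.1 template and are assumptions
# for illustration, not the exact training code.

SYSTEM = ("A chat between a curious user and an artificial intelligence "
          "assistant. The assistant gives helpful, detailed, and polite "
          "answers to the user's questions.")

def format_conversation(turns):
    """turns: list of {"from": "human"|"gpt", "value": str} dicts,
    the shape commonly found in ShareGPT exports."""
    parts = [SYSTEM]
    for turn in turns:
        if turn["from"] == "human":
            parts.append(f" USER: {turn['value']}")
        else:
            # An end-of-sequence marker closes each assistant reply so
            # the loss and generation stopping align with model turns.
            parts.append(f" ASSISTANT: {turn['value']}</s>")
    return "".join(parts)

convo = [
    {"from": "human", "value": "What is Vicuna?"},
    {"from": "gpt", "value": "A chat model fine-tuned from LLaMA."},
]
print(format_conversation(convo))
```

In practice the flattened strings are tokenized and, in FastChat's training recipe, the loss is masked so only the assistant tokens contribute, which is what lets a multi-turn dataset teach instruction following across context.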

Rankings & Comparison