LMSYS
Open Weights

fastchat-t5-3b

Released May 2023

Arena AI
#274
Parameters: 3B

FastChat-T5-3B is an open-source chatbot model based on the T5 (Text-to-Text Transfer Transformer) architecture. Developed by the Large Model Systems Organization (LMSYS), it was fine-tuned from the FLAN-T5-XL base model on approximately 70,000 user-shared conversations collected from ShareGPT. Unlike many contemporary chatbot models, which are decoder-only, FastChat-T5-3B uses an encoder-decoder structure: the encoder reads the input and the decoder generates the response, which lets the model process and generate text efficiently relative to its size.

The model was designed as a more compact and commercially viable alternative to larger models such as Vicuna. Because it starts from the instruction-tuned weights of FLAN-T5 and is further fine-tuned on diverse dialogue data, it can follow complex instructions and maintain multi-turn conversations. Despite having only 3 billion parameters, it has shown competitive performance against significantly larger models in tasks such as commonsense reasoning and role-play.
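To put the 3-billion-parameter size in concrete terms, the following is a rough sketch of the memory needed to hold the weights alone at a few common numeric precisions. It assumes the parameter count is exactly the nominal 3 billion (the true count may differ slightly) and ignores activations, caches, and framework overhead. The function name `weight_memory_gib` is illustrative and not part of any library.

```python
# Back-of-the-envelope weight-memory estimate for a nominal 3B-parameter model.
# Assumption: the parameter count is rounded to exactly 3e9.
NOMINAL_PARAMS = 3_000_000_000

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gib(NOMINAL_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the weights take roughly 11.2 GiB in float32, 5.6 GiB in float16, and 2.8 GiB in int8, which is the practical sense in which the model is compact next to larger chat models.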

Rankings & Comparison