MosaicML
Open Weights

mpt-7b-chat

Released May 2023

Arena AI: #268
Context: 2K
Parameters: 6.7B

MPT-7B-Chat is a 6.7-billion-parameter decoder-only transformer fine-tuned for dialogue, developed by MosaicML as part of the MPT (MosaicML Pretrained Transformer) family. The architecture uses FlashAttention for computational efficiency and ALiBi (Attention with Linear Biases) in place of positional embeddings, which lets the model extrapolate at inference time to sequences longer than the 2,048 tokens it was trained on. The base model was pretrained on 1 trillion tokens of text and code and then fine-tuned on a collection of conversational datasets including ShareGPT-Vicuna, HC3, Alpaca, HH-RLHF, and Evol-Instruct. Because of the licenses attached to those fine-tuning datasets, MPT-7B-Chat is released under the non-commercial CC BY-NC-SA 4.0 license.
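
To make the ALiBi mechanism concrete, here is a minimal PyTorch sketch of how the linear attention biases are constructed. The function names and the power-of-two head count are illustrative assumptions, not MosaicML's actual implementation:

```python
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Geometric sequence of per-head slopes from the ALiBi paper;
    # assumes n_heads is a power of two for simplicity.
    start = 2 ** (-8.0 / n_heads)
    return torch.tensor([start ** (k + 1) for k in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # rel[i, j] = j - i: zero on the diagonal, negative for earlier keys,
    # so the penalty grows linearly with query-key distance.
    pos = torch.arange(seq_len)
    rel = pos[None, :] - pos[:, None]
    # One bias matrix per head, scaled by that head's slope. Entries above
    # the diagonal are removed by the causal mask during attention.
    return alibi_slopes(n_heads)[:, None, None] * rel

# The bias is added to the attention logits before the softmax:
#   scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim) + alibi_bias(...)
# Because it depends only on relative distance, the same slopes apply
# unchanged to sequences longer than the 2K training context.
bias = alibi_bias(n_heads=8, seq_len=16)
print(bias.shape)  # torch.Size([8, 16, 16])
```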
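
The open weights are hosted on the Hugging Face Hub as mosaicml/mpt-7b-chat. A minimal generation sketch with the transformers library follows; the prompt is purely illustrative, and `trust_remote_code=True` is needed because MPT ships custom modeling code with the checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,   # half precision to fit the 6.7B weights in less memory
    trust_remote_code=True,       # MPT uses custom modeling code bundled with the checkpoint
)

prompt = "What is ALiBi and why does it help with long inputs?"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```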

Rankings & Comparison