MPT-7B-Chat is a 6.7 billion parameter decoder-only transformer fine-tuned for dialogue, developed by MosaicML as part of the MPT (MosaicML Pretrained Transformer) family. The architecture uses FlashAttention for computational efficiency and ALiBi (Attention with Linear Biases) in place of positional embeddings, which lets the model handle long inputs and extrapolate beyond the sequence lengths seen during training. The base model was pretrained on 1 trillion tokens of text and code, then fine-tuned on a collection of conversational datasets: ShareGPT-Vicuna, HC3, Alpaca, HH-RLHF, and Evol-Instruct. Because of the licenses attached to those fine-tuning datasets, MPT-7B-Chat is released for non-commercial use under CC-BY-NC-SA-4.0.
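To make the ALiBi mechanism concrete, here is a minimal sketch of how the linear attention biases are computed, following the slope schedule from the ALiBi paper (head-specific slopes forming a geometric sequence when the head count is a power of two). This is an illustrative reimplementation, not MosaicML's actual code; function names are chosen for this example.

```python
def alibi_slopes(n_heads):
    # Geometric slope schedule from the ALiBi paper: for n_heads a power
    # of two, head h (0-indexed) gets slope 2^(-8 * (h + 1) / n_heads).
    return [2 ** (-8 * (h + 1) / n_heads) for h in range(n_heads)]

def alibi_bias(seq_len, slope):
    # Linear bias added to the pre-softmax attention scores: a query at
    # position i attending to a key at position j (j <= i, causal mask)
    # receives -slope * (i - j), so distant keys are penalized more.
    # Because the bias depends only on relative distance, it extrapolates
    # to sequence lengths longer than those seen in training.
    return [[-slope * (i - j) for j in range(i + 1)] for i in range(seq_len)]

slopes = alibi_slopes(8)          # e.g. [0.5, 0.25, ..., 2^-8]
bias = alibi_bias(4, slopes[0])   # lower-triangular bias for head 0
```

Because no learned positional parameters are involved, the same bias formula can be evaluated at inference time for any sequence length, which is what enables the extrapolation behavior described above.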