MosaicML
Open Weights

mpt-7b-chat

Released May 2023

Arena AI: #268
Context: 2K
Parameters: 6.7B

MPT-7B-Chat is a 6.7-billion-parameter decoder-only transformer fine-tuned for dialogue, developed by MosaicML as part of the MPT (MosaicML Pretrained Transformer) family. The architecture uses FlashAttention for computational efficiency and ALiBi (Attention with Linear Biases) in place of positional embeddings, which lets the model extrapolate at inference time to sequences longer than the 2,048 tokens it was trained on. The base model was pretrained on 1 trillion tokens of text and code and then fine-tuned on a collection of conversational datasets including ShareGPT-Vicuna, HC3, Alpaca, HH-RLHF, and Evol-Instruct. Because of the licenses attached to those fine-tuning datasets, MPT-7B-Chat is released under the non-commercial CC BY-NC-SA 4.0 license.
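
To make the ALiBi mechanism concrete, here is a minimal PyTorch sketch of how the linear attention biases are constructed. The function names and the power-of-two head count are illustrative assumptions, not MosaicML's actual implementation:

```python
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Geometric sequence of per-head slopes from the ALiBi paper;
    # assumes n_heads is a power of two for simplicity.
    start = 2 ** (-8.0 / n_heads)
    return torch.tensor([start ** (k + 1) for k in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # rel[i, j] = j - i: zero on the diagonal, negative for earlier keys,
    # so the penalty grows linearly with query-key distance.
    pos = torch.arange(seq_len)
    rel = pos[None, :] - pos[:, None]
    # One bias matrix per head, scaled by that head's slope. Entries above
    # the diagonal are removed by the causal mask during attention.
    return alibi_slopes(n_heads)[:, None, None] * rel

# The bias is added to the attention logits before the softmax:
#   scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim) + alibi_bias(...)
# Because it depends only on relative distance, the same slopes apply
# unchanged to sequences longer than the 2K training context.
bias = alibi_bias(n_heads=8, seq_len=16)
print(bias.shape)  # torch.Size([8, 16, 16])
```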
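
The open weights are hosted on the Hugging Face Hub as mosaicml/mpt-7b-chat. A minimal generation sketch with the transformers library follows; the prompt is purely illustrative, and `trust_remote_code=True` is needed because MPT ships custom modeling code with the checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,   # half precision to fit the 6.7B weights in less memory
    trust_remote_code=True,       # MPT uses custom modeling code bundled with the checkpoint
)

prompt = "What is ALiBi and why does it help with long inputs?"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```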

Rankings & Comparison