oasst-pythia-12b is an open-source, instruction-tuned language model developed by the OpenAssistant project, an initiative by the non-profit organization LAION. It is based on the Pythia-12B architecture by EleutherAI and was fine-tuned to act as a conversational assistant using the human-annotated OASST1 dataset. The model was designed to provide a transparent and fully reproducible alternative to proprietary chat models, with its weights and training data made publicly available under the Apache 2.0 license.
The model was trained by supervised fine-tuning (SFT) on a dataset of conversation trees containing roughly 161,000 messages in 35 languages (though this specific variant is optimized for English). Because it builds on Pythia-12B, it inherits a decoder-only transformer architecture that EleutherAI originally designed for research into the scaling laws and training behavior of large language models.
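To make the idea of a conversation tree concrete, the sketch below flattens a small tree into root-to-leaf threads, which is one plausible way to turn branching dialogues into linear fine-tuning examples. The record layout (`id`, `parent`, `role`, `text`), the sample messages, and the function name `root_to_leaf_threads` are assumptions made for illustration; they are not taken from the OpenAssistant training code.

```python
from typing import Dict, List, Optional

Message = Dict[str, Optional[str]]

# Hypothetical toy records; the field names are assumed for illustration.
messages: List[Message] = [
    {"id": "m1", "parent": None, "role": "prompter", "text": "What is Pythia?"},
    {"id": "m2", "parent": "m1", "role": "assistant", "text": "A suite of open language models."},
    {"id": "m3", "parent": "m1", "role": "assistant", "text": "A model family from EleutherAI."},
    {"id": "m4", "parent": "m2", "role": "prompter", "text": "How large is the biggest one?"},
]


def root_to_leaf_threads(records: List[Message]) -> List[List[Message]]:
    """Return every root-to-leaf path in the tree as a linear conversation."""
    children: Dict[str, List[Message]] = {}
    roots: List[Message] = []
    for record in records:
        if record["parent"] is None:
            roots.append(record)
        else:
            children.setdefault(record["parent"], []).append(record)

    threads: List[List[Message]] = []

    def walk(node: Message, path: List[Message]) -> None:
        path = path + [node]  # copy so sibling branches do not share state
        replies = children.get(node["id"], [])
        if not replies:
            threads.append(path)  # a leaf closes one complete thread
        for reply in replies:
            walk(reply, path)

    for root in roots:
        walk(root, [])
    return threads


if __name__ == "__main__":
    for thread in root_to_leaf_threads(messages):
        print(" -> ".join(m["id"] for m in thread))
```

In this toy tree, one prompt with two competing replies yields two threads, which shows how a single tree can contribute several training conversations.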
Architecture and Capabilities
oasst-pythia-12b has 12 billion parameters and supports a context window of 2,048 tokens. It uses the special tokens <|prompter|> and <|assistant|> to mark user and assistant turns, respectively, in multi-turn dialogues. Although later superseded within the OpenAssistant ecosystem by models based on the Llama architecture, this variant remains a significant milestone because it was built on a completely open-source base model, which permits both commercial and research use under the terms of the Apache 2.0 license.
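The turn markers can be illustrated with a small prompt builder. This is a minimal sketch and not the project's own code: the helper name `build_prompt` is hypothetical, and terminating each turn with `<|endoftext|>` is an assumption based on the format shown in the published model card, so it should be checked against the tokenizer actually in use.

```python
from typing import List, Tuple

PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
END_OF_TURN = "<|endoftext|>"  # assumed turn terminator; verify for your checkpoint


def build_prompt(turns: List[Tuple[str, str]]) -> str:
    """Serialize (role, text) turns and leave the prompt open for the assistant.

    Roles must be "prompter" (the user) or "assistant".
    """
    parts: List[str] = []
    for role, text in turns:
        if role == "prompter":
            marker = PROMPTER
        elif role == "assistant":
            marker = ASSISTANT
        else:
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"{marker}{text}{END_OF_TURN}")
    # A trailing assistant marker signals that the model should answer next.
    parts.append(ASSISTANT)
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("prompter", "What is a meme?")]))
```

Because the context window is limited to 2,048 tokens, a caller would typically drop the oldest turns before serializing a long conversation; that truncation step is left out of the sketch.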