Solar Mini is a compact large language model developed by Upstage AI, designed to deliver high-quality natural language processing with lower computational requirements than many traditional large-scale models. It is built upon the Solar 10.7B foundation, which gained recognition for its performance-to-size ratio and its ability to achieve results comparable to much larger models on industry benchmarks.
The model's development is centered on a specialized technique called Depth Up-Scaling (DUS). This method takes the 32-layer Mistral 7B base model (which follows the Llama 2 architecture), duplicates its layer stack, trims the overlapping layers from each copy, and concatenates the copies into an expanded 48-layer structure, which is then refined through continued pre-training. This approach allows the model to maintain efficiency while improving its reasoning, mathematical, and conversational capabilities.
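The layer arithmetic behind DUS can be sketched in a few lines. This is a minimal illustration, not Upstage's implementation: it assumes the published configuration in which an overlap of m = 8 layers is dropped from the seam between the two copies of a 32-layer base, yielding 2 × (32 − 8) = 48 layers. The function and variable names are hypothetical.

```python
def depth_up_scale(layers, m=8):
    """Sketch of DUS depth-wise scaling.

    Duplicate a base layer stack, drop the last m layers from the
    first copy and the first m layers from the second copy, then
    concatenate the two copies into one deeper stack.
    """
    n = len(layers)
    top = layers[: n - m]      # layers 0 .. n-m-1 from the first copy
    bottom = layers[m:]        # layers m .. n-1 from the second copy
    return top + bottom        # 2 * (n - m) layers in total

# A 32-layer base stack (placeholder names standing in for real weights).
base = [f"layer_{i}" for i in range(32)]
scaled = depth_up_scale(base)
print(len(scaled))  # 2 * (32 - 8) = 48
```

In the real procedure the list elements would be transformer blocks with their weights, and the concatenated 48-layer model would then undergo continued pre-training to heal the discontinuity at the seam.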
Solar Mini is optimized for English, Korean, and Japanese, making it a versatile choice for multilingual applications. It is frequently applied in scenarios requiring fast inference and high accuracy, such as document summarization, translation, and interactive chat. The model supports a context window of 32,768 tokens, enabling the processing of long-form documents and extended dialogue histories.
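For documents that exceed the 32,768-token context window, a common pattern is to split the token stream into chunks that each fit within the window, reserving some budget for the prompt and the model's reply. The sketch below is an illustrative assumption, not part of any Upstage SDK; the helper name and the 1,024-token reserve are hypothetical choices.

```python
def chunk_for_context(tokens, context_window=32768, reserve=1024):
    """Split a token sequence into window-sized chunks.

    `reserve` leaves headroom for the instruction prompt and the
    generated output, so each chunk stays safely inside the window.
    """
    budget = context_window - reserve
    return [tokens[i : i + budget] for i in range(0, len(tokens), budget)]

# Example: a 70,000-token document split for summarization.
doc_tokens = list(range(70000))
chunks = chunk_for_context(doc_tokens)
print(len(chunks), len(chunks[0]))  # 3 chunks, first holds 31744 tokens
```

Each chunk could then be summarized independently and the partial summaries combined in a final pass, a standard map-reduce approach to long-document summarization.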