Reka Flash is a 21-billion-parameter multimodal language model developed by Reka AI. It is the company's "turbo-class" model, designed to balance computational efficiency with advanced reasoning and multimodal capabilities. The model was trained from scratch and can process interleaved inputs that combine text, images, video, and audio.
The model features a context window of 128,000 tokens, enabling it to ingest and reason over extensive documents and long video sequences. At the time of its initial release, it demonstrated performance competitive with significantly larger proprietary models on standardized benchmarks for language understanding, code generation, and multilingual reasoning.
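As a rough illustration of what a 128,000-token window means in practice, the sketch below estimates whether a piece of text is likely to fit. The 4-characters-per-token ratio and the output reservation are assumptions chosen for illustration, not properties of Reka's tokenizer, and the function names are hypothetical; an exact count would require the model's own tokenizer.

```python
import math

# Context window size stated for Reka Flash.
CONTEXT_WINDOW_TOKENS = 128_000

# ASSUMPTION: a rough average for English text, not Reka's actual tokenizer ratio.
ASSUMED_CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = ASSUMED_CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate based on character count."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text: str, reserved_for_output: int = 2_000) -> bool:
    """Check whether the estimated prompt size plus a reserved output budget
    stays within the context window. The reservation is a hypothetical default."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    document = "word " * 50_000  # 250,000 characters of placeholder text
    print(estimate_tokens(document), fits_in_context(document))
```

Because the estimate is heuristic, it is best treated as a pre-flight check that leaves some margin rather than as a hard guarantee.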
The identifier reka-flash-21b-20240226-online refers to an API-based version of the model released in February 2024. It is optimized for low-latency tasks such as multimodal chat, document analysis, and agentic workflows that require real-time information processing.
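The identifier appears to encode the model family, the parameter count, a date stamp, and a variant suffix. The following sketch splits it into those parts. The naming scheme is inferred from this single identifier rather than from a published specification, so the pattern, the `ModelId` type, and the `parse_model_id` function are all hypothetical.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional


class ModelId(NamedTuple):
    family: str              # e.g. "reka-flash"
    params_billions: int     # e.g. 21
    snapshot: date           # date stamp embedded in the identifier
    variant: Optional[str]   # e.g. "online"; None if no suffix is present


# ASSUMED layout: <family>-<N>b-<YYYYMMDD>[-<variant>]
_PATTERN = re.compile(
    r"^(?P<family>[a-z]+(?:-[a-z]+)*)"
    r"-(?P<size>\d+)b"
    r"-(?P<stamp>\d{8})"
    r"(?:-(?P<variant>[a-z]+))?$"
)


def parse_model_id(model_id: str) -> ModelId:
    """Split an identifier into its parts, or raise ValueError if it does not
    follow the assumed layout or carries an invalid date."""
    match = _PATTERN.match(model_id)
    if match is None:
        raise ValueError(f"unrecognized model identifier: {model_id!r}")
    snapshot = datetime.strptime(match["stamp"], "%Y%m%d").date()
    return ModelId(match["family"], int(match["size"]), snapshot, match["variant"])


if __name__ == "__main__":
    print(parse_model_id("reka-flash-21b-20240226-online"))
```

Parsing the identifier this way makes it easy to sort or filter model versions by date stamp, provided other identifiers follow the same layout.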