Major
DBRX: A New Standard for Open LLMs
Databricks released DBRX, a 132B-parameter Mixture of Experts (MoE) model with 36B parameters active for any given input. On standard benchmarks it outperformed GPT-3.5, Mixtral, and Grok-1. DBRX uses a 16-expert architecture with 4 experts active per token, so per-token compute tracks the 36B active parameters rather than the full 132B, which gives it inference speed comparable to much smaller dense models. The weights were released openly under the Databricks Open Model License.
- 132B total / 36B active parameters
- 16 experts, 4 active per token
- Outperforms GPT-3.5, Mixtral, and Grok-1 on standard benchmarks
- Inference speed comparable to smaller dense models
- Open-weights MoE architecture
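
To make the "16 experts, 4 active per token" idea concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not DBRX's actual router: the function names (`softmax`, `route_token`), the random router logits, and the choice to renormalize the selected weights so they sum to 1 are all assumptions made for illustration. Only the counts (16 experts, 4 active, 132B total, 36B active) come from the summary above.

```python
import math
import random

# Figures taken from the summary above.
NUM_EXPERTS = 16
TOP_K = 4
TOTAL_PARAMS_B = 132
ACTIVE_PARAMS_B = 36


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token.

    Returns a list of (expert_index, weight) pairs. Renormalizing the
    weights over the chosen experts is an assumption of this sketch,
    not a confirmed detail of DBRX.
    """
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical router scores for a single token.
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    for expert, weight in route_token(logits):
        print(f"expert {expert:2d} weight {weight:.3f}")

    # Share of parameters that do work for each token.
    print(f"active fraction: {ACTIVE_PARAMS_B / TOTAL_PARAMS_B:.1%}")
    # Number of distinct 4-of-16 expert combinations.
    print(f"expert combinations: {math.comb(NUM_EXPERTS, TOP_K)}")
```

Only the 4 selected experts run for a given token, which is why per-token compute follows the 36B active parameters (roughly 27% of the total) rather than all 132B. The fine-grained split also allows 1,820 possible expert combinations per token, compared with 28 for a 2-of-8 design.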