Alibaba launches more efficient Qwen3-Next AI model
Jinse Finance reported that Alibaba's Tongyi Qianwen team has released Qwen3-Next, its next-generation foundation model architecture, and open-sourced the Qwen3-Next-80B-A3B series of models built on it. Compared with the MoE structure used in Qwen3, the architecture introduces four core improvements: a hybrid attention mechanism, a high-sparsity MoE structure, a set of training-stability optimizations, and a multi-token prediction mechanism that improves inference efficiency.

Based on this structure, Alibaba trained the Qwen3-Next-80B-A3B-Base model, which has 80 billion total parameters but activates only about 3 billion per token. The Base model matches or slightly exceeds the performance of the dense Qwen3-32B model, while its training cost (in GPU hours) is less than one-tenth that of Qwen3-32B and its inference throughput on contexts longer than 32K is more than ten times higher, making it exceptionally cost-effective in both training and inference.
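The headline efficiency claim (80 billion total parameters, roughly 3 billion active) comes from high-sparsity mixture-of-experts routing: a router selects a small subset of experts for each token, so only that subset's parameters participate in the forward pass. Below is a minimal sketch of top-k MoE routing in PyTorch to illustrate the idea; the expert count, layer dimensions, and top-k value are illustrative placeholders, not Qwen3-Next's actual configuration.

```python
# Toy top-k MoE layer: only k of num_experts experts run per token,
# so active parameters are a small fraction of total parameters.
# Sizes here are illustrative, NOT Qwen3-Next's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, num_experts=32, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # scores every expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                                # x: (tokens, d_model)
        scores = self.router(x)                          # (tokens, num_experts)
        weights, idx = torch.topk(scores, self.top_k)    # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)             # normalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in idx[:, slot].unique().tolist():     # run each selected expert on its tokens
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

x = torch.randn(8, 64)
print(ToyMoE()(x).shape)  # torch.Size([8, 64]); only 2 of 32 experts ran for each token
```

In this sketch, compute per token scales with the k selected experts rather than with the total expert count, which is why a sparsely activated 80B model can approach the serving cost of a much smaller dense model.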
