1. Petrov N, Andersson S. Sparse Experts Scale Better in Efficient Mixture Architectures for Trillion Parameter Models. CPL [Internet]. 2026 May 18 [cited 2026 May 19];14(2):16-22. Available from: https://computer-life.org/index.php/ojs/article/view/41