Hierarchical Mixture-of-Experts with Two-Stage Optimization — AI News