DIRECTOR: Accelerating Distributed MoE Serving via Online Proactive Expert Placement
Published in INFOCOM 2026
This paper accelerates distributed Mixture-of-Experts (MoE) serving with a proactive online expert placement strategy. It improves end-to-end latency and throughput under dynamic request patterns by balancing communication and compute overheads across GPU servers.
Recommended citation: Qianli Liu, Kaibin Guo, Zicong Hong, Peng Li, Fahao Chen, Haodong Wang, Jian Lin, and Song Guo. (2026). DIRECTOR: Accelerating Distributed MoE Serving via Online Proactive Expert Placement. INFOCOM 2026.
