arxiv:2512.22036

FUSCO: High-Performance Distributed Data Shuffling via Transformation-Communication Fusion

Published on Dec 26, 2025

Abstract

FUSCO is a communication library designed for mixture-of-experts models that improves training and inference efficiency by optimizing data shuffling through fused transformations and a pipelined communication engine.

AI-generated summary

Large-scale Mixture-of-Experts (MoE) models rely on expert parallelism for efficient training and inference, which splits experts across devices and necessitates distributed data shuffling to route each token to its assigned experts. However, existing communication libraries handle this shuffling poorly; its overhead can account for over half of end-to-end runtime. We present FUSCO, an MoE-friendly communication library that achieves efficient and lightweight data shuffling through fused data transformation and communication, based on the key observation that MoE's expert-major data layout conflicts with the device-major layout expected by communication operations. FUSCO captures the fine-grained data layout, which is then interpreted by a pipelined communication engine that performs the required shuffling efficiently along the communication path. Lightweight planning and load-balancing mechanisms complement the engine by eliminating redundant communication and dispersing traffic. Evaluations on representative benchmarks illustrate that FUSCO achieves up to 3.84× and 2.01× speedups over NCCL and DeepEP (the state-of-the-art MoE communication library), respectively. In end-to-end MoE tasks, compared to NCCL and DeepEP, FUSCO reduces the training latency by 1.17-1.39× and 1.10-1.19×, and lowers the first-token generation latency in inference by 1.09-1.25× and 1.06-1.16×.
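
The layout conflict described in the abstract can be illustrated with a small, single-sender sketch in plain Python. It is not FUSCO's implementation or API: the function names `dispatch_plan` and `regroup_by_expert`, the toy routing table, and the assumption that experts are assigned to devices in contiguous blocks are all hypothetical. The sketch only shows the two separate rearrangement steps a conventional pipeline would run around the communication: packing tokens into device-major send buffers, then regrouping arrivals into expert-major segments on the receiver.

```python
from collections import defaultdict


def dispatch_plan(token_experts, experts_per_device):
    """Pack (token_id, expert_id) pairs into device-major send buffers.

    Assumes (hypothetically) that expert e lives on device
    e // experts_per_device.
    """
    send = defaultdict(list)
    for token_id, experts in enumerate(token_experts):
        for expert_id in experts:
            send[expert_id // experts_per_device].append((token_id, expert_id))
    return dict(send)


def regroup_by_expert(received):
    """Reorder one device's arrivals into expert-major segments."""
    by_expert = defaultdict(list)
    for token_id, expert_id in received:
        by_expert[expert_id].append(token_id)
    return dict(sorted(by_expert.items()))


# Toy routing: 4 tokens, top-2 experts each, 4 experts on 2 devices.
token_experts = [[0, 3], [1, 2], [0, 1], [2, 3]]

plan = dispatch_plan(token_experts, experts_per_device=2)
device1 = regroup_by_expert(plan[1])

print(plan[0])   # device-major buffer for device 0
print(device1)   # expert-major view on device 1
```

In this toy routing, token 2 is sent to device 0 twice, once for each of its two experts there. That is one plausible example of the redundant communication the abstract mentions, though the abstract does not say how FUSCO detects or removes it.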

