JointDiff: Bridging Continuous and Discrete in Multi-Agent Trajectory Generation
Abstract
Generative models often treat continuous data and discrete events as separate processes, creating a gap in modeling complex systems where the two interact synchronously. To bridge this gap, we introduce JointDiff, a novel diffusion framework that unifies the two processes by generating continuous spatio-temporal data and synchronous discrete events simultaneously. We demonstrate its efficacy in the sports domain by jointly modeling multi-agent trajectories and key possession events. This joint modeling is validated on non-controllable generation and on two novel controllable-generation scenarios: weak-possessor-guidance, which offers flexible semantic control over game dynamics through a simple list of intended ball possessors, and text-guidance, which enables fine-grained, language-driven generation. To enable conditioning on these guidance signals, we introduce CrossGuid, an effective conditioning operation for multi-agent domains.
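To make the joint continuous-discrete generation idea concrete, the sketch below shows one possible shape of a reverse-diffusion loop that denoises agent trajectories (continuous) and per-frame event logits (discrete) in lockstep, with a guidance embedding standing in for the possessor list or text signal. This is a minimal illustrative sketch, not the authors' implementation: the network, tensor shapes, update rules, and all names (JointDenoiser, sample_joint, the guidance vector) are assumptions for exposition only.

# Hypothetical sketch of a joint continuous/discrete reverse-diffusion loop.
# Not the paper's code; shapes, schedule, and update rules are toy assumptions.
import torch
import torch.nn as nn

T_STEPS = 50     # number of diffusion steps (assumed)
N_AGENTS = 22    # e.g. players plus ball (assumed)
HORIZON = 100    # trajectory length in frames (assumed)
N_EVENTS = 4     # e.g. pass / reception / drive / none (assumed)

class JointDenoiser(nn.Module):
    """Toy stand-in for a joint denoising network: maps noisy trajectories,
    noisy event logits, a timestep, and a guidance embedding to a continuous
    noise estimate and discrete event logits."""
    def __init__(self, d_guid=16):
        super().__init__()
        self.traj_head = nn.Linear(2 + d_guid + 1, 2)
        self.event_head = nn.Linear(N_EVENTS + d_guid + 1, N_EVENTS)

    def forward(self, x_t, e_t, t, guid):
        # Broadcast the normalized timestep and guidance embedding to every
        # agent/frame, then predict both branches from the concatenation.
        t_traj = torch.full((*x_t.shape[:-1], 1), t / T_STEPS)
        t_evt = torch.full((*e_t.shape[:-1], 1), t / T_STEPS)
        g_traj = guid.expand(*x_t.shape[:-1], guid.numel())
        g_evt = guid.expand(*e_t.shape[:-1], guid.numel())
        eps_hat = self.traj_head(torch.cat([x_t, g_traj, t_traj], dim=-1))
        logits = self.event_head(torch.cat([e_t, g_evt, t_evt], dim=-1))
        return eps_hat, logits

def sample_joint(model, guid, steps=T_STEPS):
    """Toy reverse loop: Gaussian-style denoising for trajectories and a
    logit-refinement step for events; real noise schedules are omitted."""
    x = torch.randn(1, N_AGENTS, HORIZON, 2)  # noisy (x, y) positions
    e = torch.randn(1, HORIZON, N_EVENTS)     # noisy event logits
    with torch.no_grad():
        for t in reversed(range(1, steps + 1)):
            eps_hat, logits = model(x, e, t, guid)
            # Continuous branch: crude noise-removal step.
            x = x - eps_hat / steps
            if t > 1:
                x = x + 0.01 * torch.randn_like(x)
            # Discrete branch: pull current logits toward the prediction.
            e = e + (logits - e) / t
    return x, e.argmax(dim=-1)  # trajectories, discrete event ids per frame

if __name__ == "__main__":
    model = JointDenoiser(d_guid=16)
    guidance = torch.zeros(16)  # stand-in for a possessor/text embedding
    traj, events = sample_joint(model, guidance)
    print(traj.shape, events.shape)  # (1, 22, 100, 2), (1, 100)

In this toy interface, swapping the guidance embedding is all that distinguishes non-controllable generation (a null embedding) from possessor- or text-guided generation; the paper's CrossGuid operation, by contrast, is a dedicated conditioning mechanism for multi-agent inputs and is not reproduced here.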
BibTeX
@inproceedings{capellera2026jointdiff,
  title     = {JointDiff: Bridging Continuous and Discrete in Multi-Agent Trajectory Generation},
  author    = {Capellera, Guillem and Ferraz, Luis and Rubio, Antonio and Alahi, Alexandre and Agudo, Antonio},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2026}
}