SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model

Inhwan Bae, Young-Jae Park and Hae-Gon Jeon*

Computer Vision and Pattern Recognition 2024

Abstract

There are five types of trajectory prediction tasks: deterministic, stochastic, domain adaptation, momentary observation, and few-shot. These associated tasks are defined by various factors, such as the length of input paths, data split, and pre-processing methods. Interestingly, even though they all take sequential coordinates of observations as input and infer future paths in the same coordinates as output, designing specialized architectures for each task is still necessary. When a model designed for one task is applied to another, this lack of generality can lead to sub-optimal performance. In this paper, we propose SingularTrajectory, a diffusion-based universal trajectory prediction framework to reduce the performance gap across the five tasks. The core of SingularTrajectory is to unify a variety of human dynamics representations on the associated tasks. To do this, we first build a Singular space to project all types of motion patterns from each task into one embedding space. We next propose an adaptive anchor working in the Singular space. Unlike traditional fixed anchor methods that sometimes yield unacceptable paths, our adaptive anchor corrects anchors that are placed in wrong locations, based on a traversability map. Finally, we adopt a diffusion-based predictor to further enhance the prototype paths using a cascaded denoising process. Our unified framework ensures generality across various benchmark settings, such as input modality and trajectory length. Extensive experiments on five public benchmarks demonstrate that SingularTrajectory substantially outperforms existing models, highlighting its effectiveness in estimating general dynamics of human movements.


Keyword: A unified framework that captures the general dynamics of human movements across various input modalities and trajectory lengths, applicable to five different trajectory prediction tasks.



Presentation Video

5-minute presentation video for the CVPR 2024 poster session.



Motivation

Image

Specialized Model for Each Task

There are five types of trajectory prediction tasks: deterministic, stochastic, domain adaptation, momentary observation, and few-shot. Various factors define these tasks, such as the length of input paths, data split, and preprocessing methods. Even though they all use sequential coordinates as input and output, a specialized architecture for each task is still necessary to effectively address its unique challenges and requirements.



Image

1️⃣ SingularTrajectory 1️⃣

Unifying the Motion Space

This paper presents a universal trajectory predictor named SingularTrajectory to achieve generality in predictions. The main idea is to unify the modalities of human dynamics representations across five different tasks. We first introduce a Singular space, an embedding space consisting of representative motion patterns for each task. These motion patterns are extracted using Singular Value Decomposition (SVD) and serve as the basis functions of the Singular space for pedestrian movements.
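
As a rough illustration of how SVD can yield motion basis vectors, the sketch below stacks flattened trajectories as columns of a matrix and keeps the leading left singular vectors as a basis. It is a minimal example under stated assumptions, not the authors' implementation: the function names (`build_motion_basis`, `project`, `reconstruct`), the array layout `(N, T, 2)`, and the choice of `k` are all hypothetical.

```python
import numpy as np

def build_motion_basis(trajs, k):
    """Return a (2T, k) orthonormal basis from trajectories of shape (N, T, 2).

    Assumption: each trajectory is flattened to a 2T-vector and stacked as a
    column, so the leading left singular vectors act as motion patterns.
    """
    n, t, d = trajs.shape
    a = trajs.reshape(n, t * d).T              # (2T, N), one trajectory per column
    u, _, _ = np.linalg.svd(a, full_matrices=False)
    return u[:, :k]                            # keep the k dominant patterns

def project(trajs, basis):
    """Map (N, T, 2) trajectories to (N, k) coefficients in the basis."""
    n, t, d = trajs.shape
    return trajs.reshape(n, t * d) @ basis

def reconstruct(coeffs, basis, t):
    """Map (N, k) coefficients back to (N, T, 2) trajectories."""
    return (coeffs @ basis.T).reshape(len(coeffs), t, 2)

# Toy usage: constant-velocity paths that start at the origin.
rng = np.random.default_rng(0)
velocity = rng.normal(size=(50, 1, 2))
time = np.arange(12).reshape(1, 12, 1)
paths = velocity * time                        # (50, 12, 2)
basis = build_motion_basis(paths, k=2)
coeffs = project(paths, basis)                 # compact (50, 2) representation
```

In this toy setting the paths differ only in their x and y velocities, so two basis vectors are enough to reconstruct them; real pedestrian data would need a larger `k`.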



Image

Adaptive Anchor Generation

Next, we propose an environment-adaptive anchor that operates within the Singular space. Unlike traditional fixed anchor methods that sometimes fail to handle different target data distributions, our adaptive anchor can correct prototype paths in the Singular space if they are placed in incorrect locations, using an input traversability map. This ensures that the anchor accurately adapts to varying environments and maintains robust trajectory predictions across diverse scenarios.
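
To make the correction step concrete, here is a simplified sketch that moves any path point lying on a non-traversable cell to the nearest traversable cell of a binary grid. This nearest-cell rule is an assumption chosen for illustration and may differ from the procedure in the paper; the function name `snap_to_traversable` and the `(row, col)` grid convention are hypothetical. In a full pipeline, the corrected path would then be projected back into the Singular space.

```python
import numpy as np

def snap_to_traversable(points, traversable):
    """Move points on blocked cells to the nearest free cell.

    points:      (N, 2) array-like of (row, col) positions in grid units.
    traversable: 2D boolean array, True where walking is allowed.
    Assumption: the map contains at least one traversable cell.
    """
    pts = np.asarray(points, dtype=float)
    free = np.argwhere(traversable)            # (M, 2) coordinates of free cells
    upper = np.array(traversable.shape) - 1
    cells = np.clip(np.rint(pts).astype(int), 0, upper)
    out = pts.copy()
    for i, (r, c) in enumerate(cells):
        if not traversable[r, c]:
            # Squared Euclidean distance from the point to every free cell.
            dist = np.sum((free - pts[i]) ** 2, axis=1)
            out[i] = free[np.argmin(dist)]
    return out

# Toy usage: the two right-most columns of a 5x5 map are blocked.
grid = np.ones((5, 5), dtype=bool)
grid[:, 3:] = False
prototype = np.array([[1.0, 1.0], [2.0, 4.0]])  # second point is on a blocked cell
corrected = snap_to_traversable(prototype, grid)
```

Points that already lie on traversable cells are returned unchanged, so the correction only affects the parts of a prototype path that violate the map.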



Image

Anchor-Based Diffusion Model

Lastly, we generate socially acceptable future trajectories for all agents in scenes using a diffusion-based predictor, which denoises the residuals of perturbed prototype paths. Thanks to the cascaded denoising process of diffusion models, we refine these prototype paths. In this process, historical pathways, agent interactions, and environmental information are provided as conditions to guide the trajectories in the Markov chain of the denoising diffusion process.
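
The refinement loop can be sketched with a standard DDPM-style reverse update, where the diffused variable is the residual that is added to the prototype paths at the end. This is a minimal sketch under stated assumptions rather than the paper's model: the linear noise schedule, the step count, and the names `make_schedule`, `refine_anchors`, and `eps_model` are hypothetical, and `eps_model` stands in for a trained, condition-aware noise predictor.

```python
import numpy as np

def make_schedule(steps, beta_start=1e-4, beta_end=0.05):
    """Linear beta schedule (an assumed choice) with cumulative alpha products."""
    betas = np.linspace(beta_start, beta_end, steps)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)

def refine_anchors(anchors, cond, eps_model, steps=10, seed=0):
    """Denoise a residual step by step and add it to the anchors.

    eps_model(x, t, anchors, cond) must return the predicted noise with the
    same shape as x; cond carries history, interaction, and scene features.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(steps)
    x = rng.standard_normal(anchors.shape)     # residual starts as pure noise
    for t in reversed(range(steps)):
        eps = eps_model(x, t, anchors, cond)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is injected at the final step of the reverse chain.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return anchors + x

# Toy usage with an untrained placeholder that always predicts zero noise.
def zero_eps(x, t, anchors, cond):
    return np.zeros_like(x)

refined = refine_anchors(np.zeros((3, 4)), cond=None, eps_model=zero_eps)
```

Each update depends only on the current residual and the conditions, which is what makes the reverse process a Markov chain.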



Image

A Universal Trajectory Prediction Framework
Image


Qualitative Results

Image
Visualization of prediction results on (a) momentary observation task and (b) few-shot task.



BibTeX

@inproceedings{bae2024singulartrajectory,
title={SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model},
author={Bae, Inhwan and Park, Young-Jae and Jeon, Hae-Gon},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2024}
}