Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion
Paper • 2304.01893 • Published
A DDPM (Denoising Diffusion Probabilistic Model) that generates human-like 2D navigation trajectories for robots.
Given a robot's current state (position + velocity) and a goal, this model generates future waypoints that mimic human walking: smooth curves, natural speed changes, and obstacle-aware paths.
Input: [x, y, vx, vy] + [goal_x, goal_y]
↓ DDPM Reverse Diffusion (100 steps)
↓ 1D Temporal UNet + FiLM conditioning
Output: 16 future waypoints [dx, dy]
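The FiLM conditioning step in the pipeline above can be illustrated with a small NumPy sketch. It is not the repository's implementation: the function `film`, the projection weights `w_scale` and `w_shift`, and the choice to concatenate state and goal into one 6-dimensional conditioning vector are all assumptions made for illustration. The idea shown is only the core of FiLM, a per-channel scale and shift predicted from the conditioning vector and broadcast over the time axis.

```python
import numpy as np

def film(features, cond, w_scale, w_shift):
    """Feature-wise Linear Modulation (illustrative sketch).

    features: (batch, channels, horizon) feature map from a UNet block
    cond:     (batch, cond_dim) conditioning vector
    w_scale, w_shift: (cond_dim, channels) hypothetical projection weights
    """
    gamma = cond @ w_scale  # per-channel scale, shape (batch, channels)
    beta = cond @ w_shift   # per-channel shift, shape (batch, channels)
    # Broadcast the per-channel parameters across the time axis.
    return gamma[:, :, None] * features + beta[:, :, None]

rng = np.random.default_rng(0)
# Assumed sizes: 64 channels (first UNet level), horizon 16,
# conditioning = state (4) + goal (2).
batch, channels, horizon, cond_dim = 3, 64, 16, 6
features = rng.normal(size=(batch, channels, horizon))
cond = rng.normal(size=(batch, cond_dim))
w_scale = rng.normal(size=(cond_dim, channels))
w_shift = rng.normal(size=(cond_dim, channels))

out = film(features, cond, w_scale, w_shift)
print(out.shape)
```

In a real network the two projections would be learned layers, and the diffusion timestep embedding would typically be part of the conditioning as well.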
| Component | Details |
|---|---|
| Backbone | 1D Temporal UNet ([64, 128, 256]) |
| Conditioning | FiLM (Feature-wise Linear Modulation) |
| Noise Schedule | Cosine (Improved DDPM) |
| Diffusion Steps | 100 |
| Parameters | 1,801,538 (1.8M) |
| Prediction | ε-prediction (noise) |
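For readers unfamiliar with the cosine noise schedule listed in the table, the sketch below computes it for 100 steps following the formulation in the Improved DDPM paper. The offset `s = 0.008` and the clipping value `0.999` are the defaults from that paper; whether this model uses the same values is an assumption, and `cosine_beta_schedule` is a name chosen here, not one taken from the repository.

```python
import math

def cosine_beta_schedule(num_steps, s=0.008, max_beta=0.999):
    """Return (betas, alpha_bar) for a cosine noise schedule.

    alpha_bar[t] = f(t) / f(0), with f(t) = cos(((t / T) + s) / (1 + s) * pi / 2) ** 2
    betas[t]     = 1 - alpha_bar[t + 1] / alpha_bar[t], clipped to max_beta
    """
    def f(t):
        return math.cos((t / num_steps + s) / (1 + s) * math.pi / 2) ** 2

    alpha_bar = [f(t) / f(0) for t in range(num_steps + 1)]
    betas = [
        min(1 - alpha_bar[t + 1] / alpha_bar[t], max_beta)
        for t in range(num_steps)
    ]
    return betas, alpha_bar

betas, alpha_bar = cosine_beta_schedule(100)
print(len(betas), betas[0], betas[-1])
```

The cumulative product `alpha_bar` starts at 1 (no noise) and decays smoothly toward 0, which is what distinguishes this schedule from a linear one.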
2,000 synthetic episodes in a 20 m × 20 m environment with 8 obstacles.
import torch, json, numpy as np
# Load
config = json.load(open('config.json'))
stats = json.load(open('normalization_stats.json'))
# Build model (copy architecture classes from this repo)
model = HumanTrajDiffusion(ad=2, sd=4, gd=2, H=16, T=100, dims=tuple(config['down_dims']))
model.load_state_dict(torch.load('model.pt', map_location='cpu'))
model.eval()
# Robot at (5,5) moving NE → goal (15,15)
state = np.array([5.0, 5.0, 0.5, 0.3])
goal = np.array([15.0, 15.0])
state_n = torch.tensor((state - stats['state_mean']) / stats['state_std'], dtype=torch.float32)
goal_n = torch.tensor((goal - stats['goal_mean']) / stats['goal_std'], dtype=torch.float32)
# Generate 5 diverse paths
trajectories = model.generate(state_n, goal_n, n=5)
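# Denormalize the predicted steps to real-world units
traj = trajectories.numpy() * stats['action_std'] + stats['action_mean']
# Accumulate the per-step displacements [dx, dy] starting from the current position
positions = np.cumsum(traj, axis=1) + state[:2]
# positions.shape = (5, 16, 2): 5 paths, 16 waypoints, (x, y)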
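Once several candidate paths have been sampled, one simple way to choose among them is to keep the path whose last waypoint ends nearest the goal. The sketch below shows that selection on stand-in data, so it runs without the model. The helper `pick_closest_to_goal` is hypothetical and not part of this repository, and the constant step arrays are made-up placeholders for real model output.

```python
import numpy as np

def pick_closest_to_goal(positions, goal):
    """Rank sampled paths by the distance from their final waypoint to the goal.

    positions: (n_paths, horizon, 2) absolute (x, y) waypoints
    goal:      (2,) goal position
    Returns the index of the best path and the per-path distances.
    """
    final_points = positions[:, -1, :]
    distances = np.linalg.norm(final_points - goal, axis=1)
    return int(np.argmin(distances)), distances

start = np.array([5.0, 5.0])
goal = np.array([15.0, 15.0])

# Placeholder displacements standing in for denormalized model output:
# three paths of 16 steps heading NE, E, and N respectively.
steps = np.array([
    [[0.1, 0.1]] * 16,
    [[0.1, 0.0]] * 16,
    [[0.0, 0.1]] * 16,
])
positions = np.cumsum(steps, axis=1) + start

best, distances = pick_closest_to_goal(positions, goal)
print(best, distances)
```

Other criteria, such as clearance from obstacles or path smoothness, could be combined with this distance; the endpoint rule is only the simplest option.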
{
"horizon": 16,
"action_dim": 2,
"state_dim": 4,
"goal_dim": 2,
"num_diffusion_steps": 100,
"down_dims": [
64,
128,
256
],
"batch_size": 32,
"total_steps": 8000,
"lr": 0.0002,
"weight_decay": 1e-05,
"warmup_steps": 200,
"grad_clip": 10.0,
"eval_freq": 2000,
"log_freq": 25,
"hub_model_id": "precison9/human-like-robot-nav-diffusion"
}
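The usage snippet hard-codes `ad=2, sd=4, gd=2, H=16, T=100`, which match the `action_dim`, `state_dim`, `goal_dim`, `horizon`, and `num_diffusion_steps` values in this config. A minimal sketch of deriving the constructor arguments from the config instead is shown below. The key-to-argument mapping is inferred from those matching values rather than confirmed by the source, and `model_kwargs_from_config` is a hypothetical helper.

```python
import json

# Assumed mapping from config keys to HumanTrajDiffusion argument names,
# inferred from the matching values in the usage snippet.
CONFIG_TO_ARG = {
    "action_dim": "ad",
    "state_dim": "sd",
    "goal_dim": "gd",
    "horizon": "H",
    "num_diffusion_steps": "T",
}

def model_kwargs_from_config(config):
    """Build constructor keyword arguments from a parsed config.json."""
    kwargs = {arg: config[key] for key, arg in CONFIG_TO_ARG.items()}
    kwargs["dims"] = tuple(config["down_dims"])
    return kwargs

# Inline copy of the relevant config fields so the sketch is self-contained.
config = json.loads("""
{
  "horizon": 16,
  "action_dim": 2,
  "state_dim": 4,
  "goal_dim": 2,
  "num_diffusion_steps": 100,
  "down_dims": [64, 128, 256]
}
""")

kwargs = model_kwargs_from_config(config)
print(kwargs)
```

With the architecture classes available, the model could then be built as `HumanTrajDiffusion(**kwargs)`, provided the constructor really accepts these names as keywords.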
{
"state_mean": [
9.887735366821289,
10.40771484375,
0.02240574173629284,
-0.010746479965746403
],
"state_std": [
4.021646976470947,
3.9589571952819824,
0.7364981174468994,
0.7464056015014648
],
"action_mean": [
0.0022544937673956156,
-0.001080495654605329
],
"action_std": [
0.07394769042730331,
0.07494954019784927
],
"goal_mean": [
10.106578826904297,
10.3273344039917
],
"goal_std": [
4.950056076049805,
5.060120582580566
]
}
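These statistics are applied as a standard z-score: inputs are normalized with `(x - mean) / std` before inference, and outputs are mapped back with `x * std + mean`. The sketch below wraps both directions in two small helpers and checks that they invert each other. The helper names `normalize` and `denormalize` are assumptions for illustration; the values are copied from the state and action entries above.

```python
import numpy as np

# State and action statistics copied from normalization_stats.json.
STATS = {
    "state_mean": [9.887735366821289, 10.40771484375,
                   0.02240574173629284, -0.010746479965746403],
    "state_std": [4.021646976470947, 3.9589571952819824,
                  0.7364981174468994, 0.7464056015014648],
    "action_mean": [0.0022544937673956156, -0.001080495654605329],
    "action_std": [0.07394769042730331, 0.07494954019784927],
}

def normalize(x, key):
    """Z-score x using the '<key>_mean' and '<key>_std' entries."""
    mean = np.asarray(STATS[f"{key}_mean"])
    std = np.asarray(STATS[f"{key}_std"])
    return (np.asarray(x, dtype=float) - mean) / std

def denormalize(x, key):
    """Invert normalize(): map z-scored values back to real units."""
    mean = np.asarray(STATS[f"{key}_mean"])
    std = np.asarray(STATS[f"{key}_std"])
    return np.asarray(x, dtype=float) * std + mean

state = np.array([5.0, 5.0, 0.5, 0.3])
state_n = normalize(state, "state")
restored = denormalize(state_n, "state")
print(state_n, restored)
```

Converting the JSON lists to arrays inside the helpers keeps the arithmetic element-wise regardless of whether the caller passes a list or an array.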