Dynamic 3D Gaussians:
Tracking by Persistent Dynamic View Synthesis

¹Carnegie Mellon University, USA ²RWTH Aachen University, Germany ³Inria & Université Côte d'Azur, France

Abstract

We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements. We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians which are optimized to reconstruct input images via differentiable rendering. To model dynamic scenes, we allow Gaussians to move and rotate over time while enforcing that they have persistent color, opacity, and size. By regularizing Gaussians' motion and rotation with local-rigidity constraints, we show that our Dynamic 3D Gaussians correctly model the same area of physical space over time, including the rotation of that space. Dense 6-DOF tracking and dynamic reconstruction emerge naturally from persistent dynamic view synthesis, without requiring any correspondence or flow as input. We demonstrate a large number of downstream applications enabled by our representation, including first-person view synthesis, dynamic compositional scene synthesis, and 4D video editing.
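
To make the local-rigidity idea concrete, below is a minimal PyTorch sketch of this kind of regularizer: each Gaussian's neighbors should move as if that Gaussian's local frame transformed rigidly between timesteps. The tensor names, neighbor indices, and weighting are illustrative assumptions, not the paper's exact implementation.

import torch

def quat_to_rotmat(q):
    # q: (N, 4) unit quaternions in (w, x, y, z) order -> (N, 3, 3) matrices.
    w, x, y, z = q.unbind(-1)
    return torch.stack([
        1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y),
        2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x),
        2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y),
    ], dim=-1).reshape(-1, 3, 3)

def local_rigidity_loss(mu_t, q_t, mu_prev, q_prev, nn_idx, nn_w):
    # mu_*: (N, 3) Gaussian centers; q_*: (N, 4) rotations as quaternions;
    # nn_idx: (N, K) nearest-neighbor indices; nn_w: (N, K) pair weights.
    R_t, R_prev = quat_to_rotmat(q_t), quat_to_rotmat(q_prev)
    # Neighbor offsets in world space at the current and previous timestep.
    off_t = mu_t[nn_idx] - mu_t[:, None, :]               # (N, K, 3)
    off_prev = mu_prev[nn_idx] - mu_prev[:, None, :]      # (N, K, 3)
    # Each Gaussian's relative rotation between the two timesteps.
    rel_R = R_t @ R_prev.transpose(1, 2)                  # (N, 3, 3)
    # Under locally rigid motion, rotating the old offsets predicts the new ones.
    off_pred = torch.einsum('nij,nkj->nki', rel_R, off_prev)
    return (nn_w * (off_t - off_pred).norm(dim=-1)).mean()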

Video Results

Click a video below to play it. Click the white arrows to scroll across and see more videos.
If too many videos are playing at once your computer may lag; refresh the page and continue browsing.

Novel-View Synthesis + Track Trajectories

Novel-view renderings visualizing the 3D trajectories (with occlusions) of a random 3% of the Gaussians over the last 15 timesteps (0.5 s) of the sequence.
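
As an illustration, extracting such trajectories from the optimized representation could look like the NumPy sketch below; the file name and the (T, N, 3) array layout are assumptions.

import numpy as np

centers = np.load('gaussian_centers.npy')   # assumed shape (T, N, 3)
T, N, _ = centers.shape

rng = np.random.default_rng(0)
subset = rng.choice(N, size=int(0.03 * N), replace=False)  # a random 3%

# Trajectories over the last 15 timesteps (0.5 s at 30 fps).
trails = centers[max(T - 15, 0):, subset, :]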

Depth Synthesis + Track Trajectories

Novel-view depth renderings, also showing the point-track trajectories.

6-DOF (including Rotation) Tracking

Visualization of the relative rotation of tracked points. Coloured vectors are initialized to face left in camera space and rotate in 3D over time along with the rotation of their respective Gaussians.
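
A minimal sketch of this visualization step, assuming scalar-last (x, y, z, w) quaternions and illustrative names: each vector is carried along by its Gaussian's relative rotation since the first timestep.

import numpy as np
from scipy.spatial.transform import Rotation as R

def tracked_vectors(q0, qt, v0):
    # q0, qt: (N, 4) Gaussian quaternions at the first / current timestep.
    # v0:     (N, 3) initial vectors, e.g. facing left in camera space.
    rel = R.from_quat(qt) * R.from_quat(q0).inv()   # relative rotations
    return rel.apply(v0)                            # (N, 3) rotated vectors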

Comparison with Ground-Truth 3D Tracks

Comparison between the predicted 3D tracks (blue) and the ground-truth 3D tracks (red).

Underlying Gaussian Centers

Coloured point cloud showing the Dynamic 3D Gaussian centers over time.

Gaussian-Eye View: First-Person

By placing a camera at a particular Gaussian, we can render the world from the point of view of that Gaussian as it moves over time. Below, we present the actor's first-person view.
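
A sketch of how such a 'Gaussian-eye' pose could be assembled from one Gaussian's trajectory; the conventions (scalar-last quaternions, a hand-picked initial camera orientation R_cam0) are assumptions.

import numpy as np
from scipy.spatial.transform import Rotation as R

def gaussian_eye_pose(mu_t, q_t, q_0, R_cam0):
    # mu_t: (3,) the Gaussian's center at timestep t.
    # q_t, q_0: (4,) its rotation now and at the first timestep.
    # R_cam0: (3, 3) camera orientation chosen at the first timestep.
    rel = (R.from_quat(q_t) * R.from_quat(q_0).inv()).as_matrix()
    R_cam = rel @ R_cam0            # the camera rotates with the Gaussian
    w2c = np.eye(4)                 # 4x4 world-to-camera extrinsics
    w2c[:3, :3] = R_cam.T
    w2c[:3, 3] = -R_cam.T @ mu_t
    return w2c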

Gaussian-Eye View: Object View

We can also render the scene from the perspective of objects within it. Note that the renderer does not perform as well when Gaussians are very close to the camera.

Compositional Dynamic Scenes

Gaussians from different dynamic scenes can easily be composed together and rendered to make new scenes.
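
Because all per-Gaussian parameters live in plain arrays, composition can be as simple as concatenation. A sketch, assuming each scene is a dict of arrays whose second-to-last axis indexes Gaussians (e.g. means (T, N, 3), rotations (T, N, 4), colors (N, 3), opacities (N, 1), scales (N, 3)); this layout is an assumption.

import numpy as np

def compose_scenes(scene_a, scene_b):
    # Concatenate every parameter array along its Gaussian axis.
    return {k: np.concatenate([scene_a[k], scene_b[k]], axis=-2)
            for k in scene_a}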

Scene-Conditional Object Insertion

We can reconstruct new objects from static scenes (e.g., this hat) and attach them to a selected Gaussian so that they move and rotate along with the dynamic scene.
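
A sketch of this attachment logic, under the assumption that the object rides rigidly on a single 'anchor' Gaussian (the names and the scalar-last quaternion convention are illustrative):

import numpy as np
from scipy.spatial.transform import Rotation as R

def attach_object(obj_mu, obj_q, anchor_mu, anchor_q, t0=0):
    # obj_mu: (M, 3), obj_q: (M, 4): object Gaussians from a static scene.
    # anchor_mu: (T, 3), anchor_q: (T, 4): the anchor Gaussian over time.
    R0_inv = R.from_quat(anchor_q[t0]).inv()
    local = R0_inv.apply(obj_mu - anchor_mu[t0])   # offsets in the anchor frame
    means, quats = [], []
    for t in range(len(anchor_mu)):
        rel = R.from_quat(anchor_q[t]) * R0_inv    # anchor's relative rotation
        means.append(rel.apply(local) + anchor_mu[t])
        quats.append((rel * R.from_quat(obj_q)).as_quat())
    return np.stack(means), np.stack(quats)        # (T, M, 3), (T, M, 4)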

Edit Propagation

Edits can be made (e.g., in Photoshop) to any rendered image and then 'baked' back into the Gaussians, propagating them to all timesteps.
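
A deliberately simplified sketch of the baking step: because each Gaussian's color is persistent across time, overwriting it once changes every timestep. It assumes a hypothetical per-pixel ID map of the front-most Gaussian and ignores alpha blending.

import numpy as np

def bake_edit(colors, orig_img, edited_img, pixel_to_gaussian):
    # colors: (N, 3) per-Gaussian RGB, shared across all timesteps.
    # orig_img, edited_img: (H, W, 3) the frame before and after editing.
    # pixel_to_gaussian: (H, W) index of the front-most Gaussian per pixel
    # (a hypothetical ID-map render, not part of the original pipeline).
    changed = np.any(edited_img != orig_img, axis=-1)   # mask of edited pixels
    idx = pixel_to_gaussian[changed]
    colors[idx] = edited_img[changed]    # the edit persists at every timestep
    return colors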

BibTeX

@inproceedings{luiten2023dynamic,
  title={Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis},
  author={Luiten, Jonathon and Kopanas, Georgios and Leibe, Bastian and Ramanan, Deva},
  booktitle={3DV},
  year={2024}
}