Active Illumination for Dynamic 3D Reconstruction
Can physically modelling active illumination directly from raw sensor measurements improve scene estimation and avoid errors from derived depth?
Time-of-flight and structured-light cameras are typically used as depth sensors: their raw measurements are processed into a per-pixel depth map, and downstream reconstruction methods treat that depth as input. But depth processing makes simplifying assumptions about the scene, creating noise in low-reflectance regions, flying pixels under multi-path interference, and motion artifacts in fast-moving scenes, because each depth estimate needs multiple illumination readings. Further, derived depth is difficult to integrate with other sensor modalities, such as colour cameras.
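To make the multi-shot issue concrete, here is a minimal sketch of conventional depth decoding. It assumes a continuous-wave time-of-flight pixel with four phase-shifted raw shots; the modulation frequency, amplitude, ambient level, and the helper names `raw_shots` and `decode_depth` are illustrative assumptions, not details taken from the papers in this thread.

```python
import math

C = 299_792_458.0                  # speed of light, m/s
F_MOD = 30e6                       # assumed modulation frequency, Hz
KAPPA = 4 * math.pi * F_MOD / C    # phase accumulated per metre of depth, rad/m

def raw_shots(depths, amplitude=1.0, ambient=0.5):
    """Four raw correlation values; shot k sees depths[k] with phase offset k*pi/2."""
    return [amplitude * math.cos(KAPPA * d - k * math.pi / 2) + ambient
            for k, d in enumerate(depths)]

def decode_depth(raw):
    """Standard four-phase decoding, which assumes the scene is static across shots."""
    phase = math.atan2(raw[1] - raw[3], raw[0] - raw[2]) % (2 * math.pi)
    return phase / KAPPA

# Static pixel: all four shots see a surface at 2 m.
static = decode_depth(raw_shots([2.0, 2.0, 2.0, 2.0]))

# Moving scene: a foreground object at 2 m leaves the pixel halfway through,
# so the last two shots see the background at 4 m.
moving = decode_depth(raw_shots([2.0, 2.0, 4.0, 4.0]))

print(f"static: {static:.3f} m, moving: {moving:.3f} m")
```

In the static case the decoder recovers 2 m. In the moving case it returns about 3 m, a depth at which no surface exists, because the decoding step mixes shots that observed different geometry.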
Our work rethinks reconstruction for heterogeneous multi-shot imaging processes. Built upon a differentiable forward model of how the active illumination produces the raw sensor output for a given scene, our methods optimise a 4D volumetric scene representation (such as NeRF or 3DGS) so that rendered measurements match what the sensor captured. This lets us integrate sensor measurements over spacetime in a principled way, including across modalities, to reduce noise, resolve ambiguities in multi-shot sensing, and improve robustness to multi-path interference. And since we model motion over time, we can resample fast motion, such as a swinging baseball bat, into slow motion.
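The idea of fitting a scene to raw measurements can be sketched on a toy problem. The example below is not the method from the papers: it swaps the 4D volumetric representation for a single pixel whose depth changes at constant velocity, assumes a continuous-wave time-of-flight sensor with known amplitude and ambient level, and uses Gauss-Newton with hand-written derivatives in place of automatic differentiation. All names (`render`, `fit`, `decode_depth`) and constants are hypothetical.

```python
import math

C = 299_792_458.0                  # speed of light, m/s
F_MOD = 30e6                       # assumed modulation frequency, Hz
KAPPA = 4 * math.pi * F_MOD / C    # phase accumulated per metre of depth, rad/m
AMPLITUDE, AMBIENT = 1.0, 0.5      # assumed known, to keep the sketch short

def shot_phase(d0, v, k):
    """Phase of shot k: depth d0 + v*k, minus the cycling offset (k mod 4)*pi/2."""
    return KAPPA * (d0 + v * k) - (k % 4) * math.pi / 2

def render(d0, v, n_shots):
    """Forward model: raw sensor value for each shot, one shot per time step."""
    return [AMPLITUDE * math.cos(shot_phase(d0, v, k)) + AMBIENT
            for k in range(n_shots)]

def decode_depth(raw):
    """Conventional four-phase decoding (assumes a static scene)."""
    phase = math.atan2(raw[1] - raw[3], raw[0] - raw[2]) % (2 * math.pi)
    return phase / KAPPA

def fit(observed, d0, v, iters=20, damping=1e-9):
    """Gauss-Newton on the squared error between rendered and observed shots."""
    for _ in range(iters):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for k, obs in enumerate(observed):
            phase = shot_phase(d0, v, k)
            r = AMPLITUDE * math.cos(phase) + AMBIENT - obs   # residual
            jd = -AMPLITUDE * KAPPA * math.sin(phase)         # d(render)/d(d0)
            jv = jd * k                                       # d(render)/d(v)
            a11 += jd * jd; a12 += jd * jv; a22 += jv * jv
            g1 += jd * r; g2 += jv * r
        a11 += damping; a22 += damping
        det = a11 * a22 - a12 * a12
        d0 -= (a22 * g1 - a12 * g2) / det   # solve the 2x2 normal equations
        v -= (a11 * g2 - a12 * g1) / det
    return d0, v

# Simulate 8 noise-free raw shots of a surface starting at 2 m, receding 5 cm per shot.
observed = render(2.0, 0.05, 8)

# Conventional decoding of the first four shots is biased by the motion ...
init = decode_depth(observed[:4])
# ... but it is a usable starting point for fitting depth and velocity to the raw shots.
d0_hat, v_hat = fit(observed, init, 0.0)

print(f"decoded: {init:.4f} m, fitted: d0={d0_hat:.4f} m, v={v_hat:.4f} m/shot")
```

Because the fitted model is a function of time, depth can be queried at any instant, including between shots (`d0_hat + v_hat * t`), which is the toy analogue of resampling fast motion into slow motion. The fit is only local: the cosine makes the objective non-convex, so the starting point has to be reasonably close.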
Authors
Benjamin Attal · Anh Duong · Aaron Gokaslan · Zixuan Guo · Changil Kim · Hakyeong Kim · Min H. Kim · Eliot Laidlaw · Runfeng Li · Marc Mapeke · Andreas Meuleman · Matthew O'Toole · Mikhail Okunev · Christian Richardt · Aarrushi Shandilya
Papers in this thread
Related papers