
Active Illumination for Dynamic 3D Reconstruction

Can physically modelling active illumination directly from raw sensor measurements improve scene estimation and avoid errors from derived depth?

Time-of-flight and structured-light cameras are typically used as depth sensors: their raw measurements are processed into a per-pixel depth map, and downstream reconstruction methods treat that depth as input. But depth processing makes simplifying assumptions about the scene. These assumptions create noise in low-reflectance regions, flying pixels under multi-path interference, and motion artifacts in fast-moving scenes, because each depth estimate needs multiple illumination readings. Further, derived depth is difficult to integrate with other sensor modalities, like colour cameras.
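To make "each depth estimate needs multiple illumination readings" concrete, here is a minimal sketch of the textbook four-phase depth computation for a continuous-wave ToF pixel. It is an illustration under stated assumptions, not the processing of any particular sensor: the 30 MHz modulation frequency, the cosine measurement model, the phase-offset convention, and the names `simulate_raw` and `depth_from_raw` are all hypothetical.

```python
import numpy as np

C = 299_792_458.0                         # speed of light in m/s
F_MOD = 30e6                              # assumed modulation frequency in Hz
PHASE_OFFSETS = np.arange(4) * np.pi / 2  # assumed offsets: 0, 90, 180, 270 degrees


def simulate_raw(depth, amplitude=1.0, offset=2.0):
    """Idealised raw readings of one pixel: offset + amplitude * cos(phase - offset_k)."""
    phase = 4.0 * np.pi * F_MOD * depth / C  # round-trip phase delay
    return offset + amplitude * np.cos(phase - PHASE_OFFSETS)


def depth_from_raw(raw):
    """Combine four raw readings into one depth value (four-phase formula)."""
    i0, i1, i2, i3 = raw
    # The differences cancel the constant offset and isolate cos and sin of the phase.
    phase = np.arctan2(i1 - i3, i0 - i2) % (2.0 * np.pi)
    return C * phase / (4.0 * np.pi * F_MOD)


raw = simulate_raw(2.0)          # a static surface 2 m away
recovered = depth_from_raw(raw)  # matches 2.0 only because the scene did not move
```

The formula assumes all four readings observe the same surface point at the same distance. When the readings are taken one after another and the scene moves in between, that assumption no longer holds, which is one source of the motion artifacts described above.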

[Figure: thread overview diagram]

Our work rethinks reconstruction for heterogeneous multi-shot imaging processes. Built upon a differentiable forward model of how the active illumination produces the raw sensor output for a given scene, our methods optimise a 4D volumetric scene representation (like NeRF or 3DGS) so that rendered measurements match what the sensor captured. This lets us integrate sensor measurements over spacetime in a principled way, including across modalities, to reduce noise, resolve ambiguities in multi-shot sensing, and improve robustness to multi-path interference. Because we model motion over time, we can also resample fast motion, such as a swinging baseball bat, into slow motion.
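The idea of fitting a scene so that rendered measurements match captured ones can be sketched for a single pixel. This toy example is not the method of the papers below: it replaces the volumetric scene representation with one unknown depth, replaces gradient-based optimisation with a brute-force search over candidate depths, and assumes the amplitude and offset are known. The measurement model, the 30 MHz modulation frequency, and the names `render_raw` and `fit_depth` are hypothetical.

```python
import numpy as np

C = 299_792_458.0                         # speed of light in m/s
F_MOD = 30e6                              # assumed modulation frequency in Hz
PHASE_OFFSETS = np.arange(4) * np.pi / 2  # assumed offsets: 0, 90, 180, 270 degrees


def render_raw(depth, amplitude, offset):
    """Toy forward model: raw readings a pixel would record for a given depth."""
    phase = 4.0 * np.pi * F_MOD * depth / C
    return offset + amplitude * np.cos(phase - PHASE_OFFSETS)


def fit_depth(captured, amplitude, offset, n_candidates=2001):
    """Pick the depth whose rendered readings best match all captured readings.

    captured has shape (n_captures, 4): several noisy captures of a static pixel.
    """
    max_range = C / (2.0 * F_MOD)  # unambiguous range of a single frequency
    candidates = np.linspace(0.0, max_range, n_candidates, endpoint=False)
    rendered = np.stack([render_raw(d, amplitude, offset) for d in candidates])
    # Squared error between every candidate rendering and every capture,
    # summed over captures and phase offsets.
    errors = ((captured[None, :, :] - rendered[:, None, :]) ** 2).sum(axis=(1, 2))
    return candidates[np.argmin(errors)]


rng = np.random.default_rng(0)
true_depth = 2.0
clean = render_raw(true_depth, amplitude=1.0, offset=2.0)
captured = clean + rng.normal(scale=0.05, size=(8, 4))  # eight noisy captures
estimate = fit_depth(captured, amplitude=1.0, offset=2.0)
```

Because the loss is defined on raw readings, every capture contributes to a single estimate without first being converted to depth. That is the sense in which measurements can be integrated over time in this sketch.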

Authors

Benjamin Attal · Anh Duong · Aaron Gokaslan · Zixuan Guo · Changil Kim · Hakyeong Kim · Min H. Kim · Eliot Laidlaw · Runfeng Li · Marc Mapeke · Andreas Meuleman · Matthew O'Toole · Mikhail Okunev · Christian Richardt · Aarrushi Shandilya

Papers in this thread

Neural Information Processing Systems (NeurIPS), 2021
Establishes that a 4D scene can be supervised directly by continuous-wave ToF phasor measurements rather than processed depth, with added colour cameras, showing lower noise, super-resolution, and better multi-path handling.
European Conference on Computer Vision (ECCV), 2024
Adds motion vectors that are jointly estimated with geometry. Uses four raw frames (not phasors) captured over time from a continuous-wave ToF sensor to create a coherent dynamic reconstruction. Achieves 20× lower depth error on dynamic objects than the C-ToF baseline.
Computer Vision and Pattern Recognition (CVPR), 2025
Applies raw ToF supervision to a Gaussian splatting backbone, with two heuristics that stabilise the otherwise-brittle 3DGS optimisation when depth is not directly measured. Comparable quality to neural volumetric baselines while training ~100× faster.

Related papers

International Conference on Computer Vision (ICCV), 2023
Carries the supervision-by-raw-measurement approach from ToF over to structured light, and lets us separate direct and ambient illumination. Recovers higher-fidelity depth on objects than commodity structured light sensors, including for partially-transparent surfaces.
European Conference on Computer Vision (ECCV), 2022
Fuses ToF depth with stereo from a smartphone's optically-stabilised main RGB camera, where the floating lens has unknown pose. Self-calibrates the multi-sensor geometry from a single snapshot, then fuses via a correlation volume.