Practical Scene & Object Reconstruction

How do we reconstruct scenes and objects from photographs, especially for unstructured capture, complex appearance, and large places?

In my lab and with our great collaborators, we reconstruct scenes and objects from images. Each project fits a representation of 3D shape and appearance so that the model's rendered images match the captured ones. With many images captured under controlled conditions, this problem is solvable; it becomes interesting when we work with only a few casual captures, with challenging appearance, or in large, uncontrolled environments. These cases often contain visual ambiguities and rarely have a single solution.

We've tried to tackle this space of problems: with large scenes both indoors and out; with handheld, 360, or airborne cameras; with sparse and wide baselines; with surfaces that interreflect, refract, and are lit by many sources. Across the projects, the contribution is often in identifying what specific information can reduce the ambiguity: a physically based rendering model that better matches real light and cameras, or a scene representation matched to the scene's own structure. This idea applies from a single object up to a building or street scene, from a phone to a drone, and from synthesising new views of a scene to measuring its properties.

Authors

Hujun Bao · Bach-Thuan Bui · Dongyoung Choi · Jaemin Cho · Loudon Cohen · Zheng Dong · Michael Fairley · Yaoan Gao · Purvi Goel · James Guesman · Hyunho Ha · Qixing Huang · Hyeonjoong Jang · Woohyun Kang · Hakyeong Kim · Min H. Kim · Andreas Meuleman · Minh-Hieu Nguyen · Yifan Peng · Daniel Ritchie · Belal Shaheen · Yujun Shen · Shubham · Vikas Thamizharasan · Chi Wang · Huamin Wang · Qi Wang · Michael Wu · Tim Wu · Xiuchao Wu · Jiamin Xu · Weiwei Xu · Matthew David Zane · Xin Zhang · Zihan Zhu · Changqing Zou

Papers in this thread

Shape from Tracing: Towards Reconstructing 3D Object Geometry and SVBRDF Material from Images via Differentiable Path Tracing

International Conference on 3D Vision (3DV), 2020

Uses differentiable path tracing, with global illumination effects like interreflection in the forward model, to refine a coarse mesh and its per-facet SVBRDF, so shading, shadow, and material are jointly disambiguated from images captured with a phone and a consumer 360 camera.

Scalable Neural Indoor Scene Rendering

Transactions on Graphics (SIGGRAPH), 2022

Tiles a large indoor scene and assigns a small MLP per tile, with a separate view-dependent branch for reflections, so training distributes across GPUs and rendering stays interactive—on scenes over 100 square metres.

ScaNeRF: Scalable Bundle-Adjusting Neural Radiance Fields for Large-Scale Scene Rendering

Transactions on Graphics (SIGGRAPH Asia), 2023

Pushes the tiled-NeRF idea into the bundle-adjusting regime: each tile carries a hash grid plus diffuse and specular MLPs, and ADMM reaches camera-pose consensus across tiles, with a specular-aware warping loss giving the poses a second optimisation path.
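To make the consensus step concrete, here is a minimal sketch of consensus ADMM on a toy problem. It is not the ScaNeRF implementation: it assumes each tile's cost is a simple quadratic pulling a shared parameter vector towards that tile's own preferred value, whereas the real per-tile losses are photometric and the poses live on a rotation-translation manifold. The function name `consensus_admm` and the arguments `targets`, `weights`, and `rho` are hypothetical.

```python
# Toy consensus ADMM: every tile keeps a local copy of a shared parameter
# vector, and the copies are driven to agree on one consensus value.
# Assumption: tile i has the quadratic cost 0.5 * a_i * ||x - b_i||^2,
# a stand-in for that tile's real rendering loss.

def consensus_admm(targets, weights, rho=1.0, iters=200):
    n, dim = len(targets), len(targets[0])
    x = [list(b) for b in targets]        # per-tile local estimates
    u = [[0.0] * dim for _ in range(n)]   # scaled dual variables
    z = [0.0] * dim                       # shared consensus estimate
    for _ in range(iters):
        # Local step: closed-form minimiser of the tile cost plus the
        # quadratic penalty that ties x_i to the consensus z.
        for i in range(n):
            a, b = weights[i], targets[i]
            x[i] = [(a * b[d] + rho * (z[d] - u[i][d])) / (a + rho)
                    for d in range(dim)]
        # Consensus step: average the local estimates plus their duals.
        z = [sum(x[i][d] + u[i][d] for i in range(n)) / n
             for d in range(dim)]
        # Dual step: accumulate each tile's disagreement with z.
        for i in range(n):
            u[i] = [u[i][d] + x[i][d] - z[d] for d in range(dim)]
    return z, x


if __name__ == "__main__":
    z, x = consensus_admm([[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]],
                          [1.0, 2.0, 1.0])
    print(z)
```

For quadratic tile costs the consensus value is the weighted mean of the tiles' preferred values, which gives an easy check that the three update steps are wired correctly.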

Local Gaussian Density Mixtures for Unstructured Lumigraph Rendering

SIGGRAPH Asia Conference Papers, 2024

Targets curved-surface reflections and refractions—exactly where view-consistent global density models break—with a per-view Gaussian-mixture density along each ray, then warps and fuses these local volumes with learned blending weights for unstructured lumigraph rendering.
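As a rough illustration of a Gaussian-mixture density along a ray, the sketch below treats density as a one-dimensional mixture over ray depth and computes volume-rendering weights from it. This is one reading of the idea under stated assumptions, not the paper's formulation: the `(weight, mean, std)` tuple layout, the function names, and the uniform depth bins are all hypothetical. One convenient property is that the optical depth has a closed form through the Gaussian CDF.

```python
import math

def gaussian_cdf(t, mean, std):
    return 0.5 * (1.0 + math.erf((t - mean) / (std * math.sqrt(2.0))))

def optical_depth(t, near, mixture):
    """Integral of the mixture density from `near` to `t`, in closed form.

    `mixture` is a list of (weight, mean, std) tuples over ray depth.
    """
    return sum(w * (gaussian_cdf(t, m, s) - gaussian_cdf(near, m, s))
               for w, m, s in mixture)

def ray_weights(mixture, near, far, n_bins):
    """Per-bin compositing weights T(t_i) - T(t_{i+1}) along one ray."""
    edges = [near + (far - near) * i / n_bins for i in range(n_bins + 1)]
    # Transmittance: fraction of light surviving from `near` to each edge.
    trans = [math.exp(-optical_depth(t, near, mixture)) for t in edges]
    return edges, [trans[i] - trans[i + 1] for i in range(n_bins)]

def expected_depth(mixture, near, far, n_bins=256):
    """Weight-averaged depth of the bin midpoints."""
    edges, weights = ray_weights(mixture, near, far, n_bins)
    mids = [0.5 * (edges[i] + edges[i + 1]) for i in range(n_bins)]
    return sum(w * m for w, m in zip(weights, mids)) / sum(weights)


if __name__ == "__main__":
    # A strong, narrow component (a surface) in front of a weak, broad one.
    mixture = [(8.0, 2.0, 0.05), (4.0, 3.5, 0.10)]
    print(expected_depth(mixture, near=0.5, far=5.0))
```

Because each view owns its mixture in this sketch, two views can place the same reflection at different depths, which a single global density cannot do.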

Efficient Object Reconstruction with Differentiable Area Light Shading

SIGGRAPH Asia Conference Papers, 2025

Replaces point lights with active area lighting during capture, then differentiates through linearly transformed cosines plus shadow visibility weighting for shading, recovering material with a +3 dB gain in relighting PSNR, or matching point-light quality from a fifth of the photos.

Related papers

OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees

Computer Vision and Pattern Recognition (CVPR), 2024

Reconstructs scenes captured by a small-baseline circular sweep of a 360 camera by placing an SDF inside an adaptively subdivided spherical binoctree, whose geometry matches the capture setting and keeps memory in line with detail.

Splat-based Gradient-domain Fusion for Seamless View Transition

International Conference on 3D Vision (3DV), 2026

Tackles the sparse-view, wide-baseline regime where Gaussian splatting drops geometry—anchors it with two-view stereo, fills intermediate viewpoints via reprojection, and fuses in the gradient domain so colour transitions stay smooth across views.
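A one-dimensional toy shows why fusing in the gradient domain keeps transitions smooth. It is a sketch only, not the paper's pipeline: two colour profiles along a scanline show the same content with a constant brightness offset, and the names `fuse_gradient_domain`, `left`, `right`, and `seam` are hypothetical. Instead of copying values on either side of the seam, it copies gradients, pins both endpoints, and spreads the leftover mismatch evenly, which is the least-squares solution of the 1D problem with fixed endpoints.

```python
def fuse_gradient_domain(left, right, seam):
    """Fuse two 1D profiles by integrating their gradients.

    Gradients come from `left` before the seam and from `right` after it.
    The endpoints are pinned to left[0] and right[-1].
    """
    n = len(left)
    grads = [(left[i + 1] - left[i]) if i < seam
             else (right[i + 1] - right[i])
             for i in range(n - 1)]
    # Mismatch between the pinned endpoints and the integrated gradients.
    residual = (right[-1] - left[0]) - sum(grads)
    # Least-squares fix with both ends pinned: equal share for every step.
    correction = residual / (n - 1)
    fused = [left[0]]
    for g in grads:
        fused.append(fused[-1] + g + correction)
    return fused


if __name__ == "__main__":
    left = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
    right = [v + 5.0 for v in left]       # same content, brighter view
    naive = left[:3] + right[3:]          # value-domain cut and paste
    fused = fuse_gradient_domain(left, right, seam=3)
    print(naive)
    print(fused)
```

The naive cut-and-paste puts the whole brightness offset into a single step at the seam, while the fused profile distributes it across the scanline.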

TreeDGS: Aerial Gaussian Splatting for Distant DBH Measurement

MDPI Remote Sensing, 2026

Treats aerial Gaussian splatting as a measurement instrument: trunks span only a few pixels from altitude, so the method extracts a dense opacity-weighted point set, isolates trunk samples, and fits solid circles to estimate diameter-at-breast-height—4.79 cm RMSE, below a LiDAR baseline.
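The sketch below shows one simple way to fit a filled disc, rather than a ring, to a weighted 2D point set. It is an assumption-laden stand-in, not TreeDGS's estimator: it supposes the trunk samples have already been isolated and projected onto a horizontal slice, and that they cover the cross-section roughly uniformly. Under that assumption the mean squared distance to the centre equals R^2 / 2, so the radius follows from second moments. The names `fit_solid_disc` and `sunflower_disc` are hypothetical, and the latter only generates synthetic test points.

```python
import math

def fit_solid_disc(points, weights=None):
    """Weighted moment fit of a filled disc; returns (centre, diameter).

    Assumes the points cover the disc roughly uniformly, so that the
    mean squared distance to the centre is R^2 / 2.
    """
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    cx = sum(w * p[0] for w, p in zip(weights, points)) / total
    cy = sum(w * p[1] for w, p in zip(weights, points)) / total
    mean_sq = sum(w * ((p[0] - cx) ** 2 + (p[1] - cy) ** 2)
                  for w, p in zip(weights, points)) / total
    return (cx, cy), 2.0 * math.sqrt(2.0 * mean_sq)

def sunflower_disc(center, radius, n):
    """Synthetic, near-uniform samples inside a disc (test data only)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        r = radius * math.sqrt((i + 0.5) / n)
        a = i * golden_angle
        pts.append((center[0] + r * math.cos(a),
                    center[1] + r * math.sin(a)))
    return pts


if __name__ == "__main__":
    pts = sunflower_disc((1.0, -2.0), 0.15, 2000)  # 30 cm trunk, in metres
    centre, diameter = fit_solid_disc(pts)
    print(centre, diameter)
```

Passing per-point opacities as `weights` lets low-confidence samples count for less, in the spirit of the opacity-weighted point set described above.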
