Practical Scene & Object Reconstruction
How do we reconstruct scenes and objects from photographs, especially for unstructured capture, complex appearance, and large places?
In my lab, and with our great collaborators, we reconstruct scenes and objects from images. Each project fits a representation of 3D shape and appearance so that the model's rendered images match the captured ones. Given many images captured under controlled conditions, this problem is solvable; it becomes interesting when we work with only a few casual captures, with challenging appearance, or in large, uncontrolled environments. These cases often contain visual ambiguities and rarely have a single solution.
We've tried to tackle this space of problems: large scenes, both indoors and out; handheld, 360, or airborne cameras; sparse and wide baselines; and surfaces that interreflect, refract, and are lit by many sources. Across the projects, the contribution often lies in identifying what specific information can reduce the ambiguity: a physically based rendering model that better matches real light and cameras, or a scene representation matched to the scene's own structure. This idea applies from a single object up to a building or street scene, from a phone to a drone, and from synthesising new views of a scene to measuring its properties.
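One common way to write this kind of fitting problem, given here as a generic sketch rather than the formulation of any particular paper below, is as a regularised rendering loss. All symbols are assumptions introduced for illustration: $\theta$ for the shape and appearance parameters, $c_i$ for the camera of the $i$-th photograph, $I_i$ for that photograph, $R$ for the rendering model, and $P$ for an optional prior with weight $\lambda$.

```latex
\hat{\theta} \;=\; \arg\min_{\theta} \; \sum_{i=1}^{N} \bigl\| R(\theta;\, c_i) - I_i \bigr\|_2^2 \;+\; \lambda \, P(\theta)
```

In these terms, ambiguity means that many different $\theta$ reach a similar loss; a more physically faithful $R$, or a parameterisation of $\theta$ matched to the scene's structure, narrows that set.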
Authors
Hujun Bao · Bach-Thuan Bui · Dongyoung Choi · Jaemin Cho · Loudon Cohen · Zheng Dong · Michael Fairley · Yaoan Gao · Purvi Goel · James Guesman · Hyunho Ha · Qixing Huang · Hyeonjoong Jang · Woohyun Kang · Hakyeong Kim · Min H. Kim · Andreas Meuleman · Minh-Hieu Nguyen · Yifan Peng · Daniel Ritchie · Belal Shaheen · Yujun Shen · Shubham · Vikas Thamizharasan · Chi Wang · Huamin Wang · Qi Wang · Michael Wu · Tim Wu · Xiuchao Wu · Jiamin Xu · Weiwei Xu · Matthew David Zane · Xin Zhang · Zihan Zhu · Changqing Zou
Papers in this thread
International Conference on 3D Vision (3DV), 2020
Uses differentiable path tracing—with global illumination effects like interreflection in the forward model—to refine a coarse mesh and its per-facet SVBRDF, so shading, shadow, and material are jointly disambiguated from images captured by phone and consumer 360 camera.
Transactions on Graphics (SIGGRAPH), 2022
Tiles a large indoor scene and assigns a small MLP per tile, with a separate view-dependent branch for reflections, so training distributes across GPUs and rendering stays interactive, even on scenes over 100 square metres.
Transactions on Graphics (SIGGRAPH Asia), 2023
Pushes the tiled-NeRF idea into the bundle-adjusting regime: each tile carries a hash grid plus diffuse and specular MLPs, and ADMM reaches camera-pose consensus across tiles, with a specular-aware warping loss giving the poses a second optimisation path.
SIGGRAPH Asia Conference Papers, 2024
Targets curved-surface reflections and refractions—exactly where view-consistent global density models break—with a per-view Gaussian-mixture density along each ray, then warps and fuses these local volumes with learned blending weights for unstructured lumigraph rendering.
SIGGRAPH Asia Conference Papers, 2025
Replaces point lights with active area lighting during capture, then differentiates through linearly transformed cosines plus shadow visibility weighting for shading—recovering material at +3 dB relighting PSNR or matching point-light quality from a fifth of the photos.
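The 2023 entry above mentions ADMM reaching camera-pose consensus across tiles. As a rough illustration of consensus ADMM in general, and not of that paper's formulation, the sketch below lets each hypothetical tile hold its own noisy scalar estimate of one shared quantity and drives them to agreement. Real camera poses are rigid transforms and the per-tile costs are rendering losses; the scalar quadratic cost, the penalty `rho`, and every name here are assumptions made for the example.

```python
def consensus_admm(local_estimates, rho=1.0, iters=200):
    """Scaled-form consensus ADMM for per-tile costs f_i(x) = 0.5 * (x - a_i)**2.

    local_estimates: the a_i values, one per (hypothetical) tile.
    Returns the per-tile variables x_i and the shared consensus value z.
    """
    n = len(local_estimates)
    x = list(local_estimates)  # per-tile copies of the shared variable
    u = [0.0] * n              # scaled dual variables, one per tile
    z = 0.0                    # consensus variable, deliberately started away from the answer

    for _ in range(iters):
        # Local step: each tile minimises its own cost plus a quadratic pull toward z.
        x = [(a + rho * (z - ui)) / (1.0 + rho)
             for a, ui in zip(local_estimates, u)]
        # Consensus step: average the tiles' proposals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each tile's remaining disagreement with z.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return x, z


tile_estimates = [1.0, 2.0, 6.0]
x, z = consensus_admm(tile_estimates)
print(z, x)
```

For quadratic costs like these, the consensus value converges to the mean of the local estimates (3.0 here, up to floating-point error), and each tile's copy converges to the same value. The appeal of the scheme is that the local step needs only that tile's data, so tiles can be updated in parallel.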
Related papers
Computer Vision and Pattern Recognition (CVPR), 2024
Reconstructs scenes captured by a small-baseline circular sweep of a 360 camera by placing an SDF inside an adaptively subdivided spherical binoctree, whose geometry matches the capture setting and keeps memory in line with detail.
International Conference on 3D Vision (3DV), 2026
Tackles the sparse-view, wide-baseline regime where Gaussian splatting drops geometry: the method anchors geometry with two-view stereo, fills intermediate viewpoints via reprojection, and fuses in the gradient domain so colour transitions stay smooth across views.
MDPI Remote Sensing, 2026
Treats aerial Gaussian splatting as a measurement instrument: because trunks span only a few pixels from altitude, the method extracts a dense opacity-weighted point set, isolates trunk samples, and fits solid circles to estimate diameter at breast height, reaching 4.79 cm RMSE, a lower error than a LiDAR baseline.
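The last entry above estimates trunk diameter by fitting solid circles to trunk samples. One simple way to fit a filled disc, offered as an illustrative stand-in rather than that paper's procedure, is a moment estimate: if points fill a disc of radius R uniformly, their mean squared distance from the centre is R²/2, so R can be recovered from the centroid and the second moment. The uniform-fill assumption, the synthetic data, and all function names below are hypothetical.

```python
import math
import random


def fit_solid_disc(points):
    """Estimate centre and diameter of a uniformly filled disc from 2D samples.

    Uses E[r^2] = R^2 / 2 for points uniform in a disc of radius R.
    """
    n = len(points)
    cx = sum(px for px, _ in points) / n
    cy = sum(py for _, py in points) / n
    mean_sq = sum((px - cx) ** 2 + (py - cy) ** 2 for px, py in points) / n
    radius = math.sqrt(2.0 * mean_sq)
    return (cx, cy), 2.0 * radius


def sample_disc(centre, radius, count, seed=0):
    """Draw synthetic points uniformly inside a disc (stand-in for trunk samples)."""
    rng = random.Random(seed)
    pts = []
    for _ in range(count):
        r = radius * math.sqrt(rng.random())  # sqrt gives uniform area density
        a = 2.0 * math.pi * rng.random()
        pts.append((centre[0] + r * math.cos(a), centre[1] + r * math.sin(a)))
    return pts


# Synthetic horizontal slice through a trunk of 0.30 m diameter.
points = sample_disc(centre=(1.0, 2.0), radius=0.15, count=20000)
centre, diameter = fit_solid_disc(points)
print(centre, diameter)
```

A moment estimate like this uses every interior sample rather than only boundary points, which matters when the samples fill the cross-section instead of tracing its outline. It is sensitive to outliers, so real data would first need the trunk-isolation step the entry describes.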