James Tompkin

Associate Professor

Visual Computing

 BlueSky @brownvc.bsky.social
 Github @brownvc

Brown student researcher?
Group Onboarding Process

Contact


 BlueSky @jamestompkin.bsky.social
Google Scholar

Office hours: Weds 1300 EST
Book appointment

Brown folks: Save an email,
use GCal 'Find a Time'
and include an agenda. Instructions

Center for Information Technology
Room 547
115 Waterman Street
Providence, RI, 02912


Acknowledgements

My intrepid collaborators and co-authors.

Funding:

  • US NSF, DARPA, NASA
  • UK EPSRC, BBC
  • Industry Activision, Adobe, Amazon, Cognex, Google, Intel, Meta, Snap, AI Foundation

The open source Web community: HTML5 Boilerplate, Ryan Johnston, Joshua N. Hibbert, PracticalTypography.com, EB Garamond.

Hosted on GitHub Pages using Jekyll — basic theme by orderedlist.


I am a visual computing researcher—computer vision, computer graphics, and human-computer interaction. My lab develops techniques for image and video creation, editing, analysis, and interaction. This requires image and scene reconstruction techniques, especially from multi-camera systems and for complex dynamic scenes, and with applications on 2D, multi-view, and VR/AR displays.



Practical Scene & Object Reconstruction
How do we push neural reconstruction toward production-quality at the scales and fidelities real graphics applications need — large scenes, unstructured camera arrays, physically-faithful shading?
2020–now

Neural scene representations (NeRFs, Gaussian splats) capture stunning visual fidelity, but typically work on small bounded scenes captured under controlled conditions. Pushing the same machinery toward production graphics requires solving practical bottlenecks: scaling to large indoor and outdoor scenes, handling unstructured (non-rig) capture, integrating physically-correct light transport, and stabilising the optimisation when geometry is otherwise ambiguous.

A long collaboration with Weiwei Xu and Hujun Bao at Zhejiang University has addressed one neural reconstruction bottleneck per year: distributed tile-MLPs for large indoor scenes (SNISR, SIGGRAPH 2022), bundle-adjusting NeRFs with ADMM consensus over tiles at large scale (ScaNeRF, SIGGRAPH Asia 2023), local Gaussian density mixtures for unstructured capture with curved-surface reflections (LGDM, SIGGRAPH Asia 2024), and differentiable area-light shading for material recovery (EOR, SIGGRAPH Asia 2025). Shape from Tracing (3DV 2020) sits at the head of the line: an early step that used differentiable path tracing, with full global illumination rather than shading alone, as the forward model for joint geometry and SVBRDF recovery.

Authors

Hujun Bao · Bach-Thuan Bui · Dongyoung Choi · Jaemin Cho · Loudon Cohen · Zheng Dong · Michael Fairley · Yaoan Gao · Purvi Goel · James Guesman · Hyunho Ha · Qixing Huang · Hyeonjoong Jang · Woohyun Kang · Hakyeong Kim · Min H. Kim · Andreas Meuleman · Minh-Hieu Nguyen · Yifan Peng · Daniel Ritchie · Belal Shaheen · Yujun Shen · Shubham · Vikas Thamizharasan · Chi Wang · Huamin Wang · Qi Wang · Michael Wu · Tim Wu · Xiuchao Wu · Jiamin Xu · Weiwei Xu · Matthew David Zane · Xin Zhang · Zihan Zhu · Changqing Zou

Papers in this thread

International Conference on 3D Vision (3DV), 2020
Uses differentiable path tracing — with global illumination effects like interreflection in the forward model — to refine a coarse mesh and its per-facet SVBRDF, so shading, shadow, and material are jointly disambiguated from images captured by phone and consumer 360 camera.
ACM Transactions on Graphics (SIGGRAPH), 2022
Tiles a large indoor scene and assigns a small MLP per tile, with a separate view-dependent branch for reflections, so training distributes across GPUs and rendering stays interactive — at scenes over 100 square meters.
ACM Transactions on Graphics (SIGGRAPH Asia), 2023
Pushes the tiled-NeRF idea into the bundle-adjusting regime: each tile carries a hash grid plus diffuse and specular MLPs, and ADMM reaches camera-pose consensus across tiles, with a specular-aware warping loss giving the poses a second optimization path.
SIGGRAPH Asia, 2024
Targets curved-surface reflections and refractions — exactly where view-consistent global density models break — with a per-view Gaussian-mixture density along each ray, then warps and fuses these local volumes with learned blending weights for unstructured lumigraph rendering.
SIGGRAPH Asia, 2025
Replaces point lights with active area lighting during capture, then differentiates through linearly transformed cosines plus shadow visibility weighting for shading — recovering material at +3 dB relighting PSNR or matching point-light quality from a fifth of the photos.

Related papers

Computer Vision and Pattern Recognition (CVPR), 2024
Reconstructs scenes captured by a small-baseline circular sweep of a 360 camera by placing an SDF inside an adaptively subdivided spherical binoctree, whose geometry matches the capture setting and keeps memory in line with detail.
3D Vision, 2026
Tackles the sparse-view, wide-baseline regime where Gaussian splatting drops geometry — anchors it with two-view stereo, fills intermediate viewpoints via reprojection, and fuses in the gradient domain so color transitions stay smooth across views.
TreeDGS: Aerial Gaussian Splatting for Distant DBH Measurement
MDPI Remote Sensing, 2026
Treats aerial Gaussian splatting as a measurement instrument: trunks span only a few pixels from altitude, so the method extracts a dense opacity-weighted point set, isolates trunk samples, and fits solid circles to estimate diameter-at-breast-height — beating a LiDAR baseline at 4.79 cm RMSE.
Monocular Dynamic 3D Reconstruction
When the input is only ordinary RGB video — no depth sensor, no rig — can we recover dynamic 3D scene geometry well enough to compete with depth-sensor-supervised methods?
2023–now

Monocular dynamic 3D reconstruction takes a single moving camera observing a deforming scene and tries to recover a complete 4D representation — geometry, appearance, motion — over the captured time window. The problem is fundamentally under-constrained at any one instant, and progress depends on how well the chosen scene representation and the supervision signals work together.

Yiqing Liang's PhD has driven this arc. Starting from semantic attention flow fields built atop a dynamic NeRF at ICCV 2023, the work moved to a forward-warping Gaussian deformation formulation (GauFRe, with Meta colleagues) for real-time rendering, then to a TMLR benchmark (MonoDyGauBench) that puts the recent flood of monocular dynamic Gaussian methods on a like-for-like footing. The latest paper (Zero-MSF, with NVIDIA) abandons per-scene optimization entirely and trains a feed-forward predictor for scene flow that generalizes zero-shot to in-the-wild video.

Authors

Abhishek Badki · Orazio Gallo · Leonidas J. Guibas · Adam Harley · Numair Khan · Eliot Laidlaw · Douglas Lanman · Yiqing Liang · Runfeng Li · Zhengqin Li · Alexander Meyerowitz · Thu Nguyen-Phuoc · Mikhail Okunev · Srinath Sridhar · Hang Su · Mikaela Angelina Uy · Lei Xiao

Papers in this thread

International Conference on Computer Vision (ICCV), 2023
Reconstructs a 4D neural volume carrying not just color and density but also scene flow, semantics, and attention, then uses the latter two to decompose foreground objects from background across spacetime without supervision.
arXiv (Dec. 2023) + WACV, 2025
Casts monocular dynamic reconstruction as a canonical Gaussian template plus a forward-warping deformation field, with a separate static component initialized to absorb non-moving regions so the deformation focuses on what actually moves. Trains in roughly twenty minutes and renders in real time.
Transactions on Machine Learning Research, 2025
An apples-to-apples benchmark of monocular dynamic Gaussian splatting methods, categorized by motion representation. Method differences are resolvable on synthetic data but get swamped by real-world scene complexity, and the optimization is uniformly brittle.
Computer Vision and Pattern Recognition (CVPR), 2025
A feed-forward model that jointly predicts geometry and scene flow, trained on a one-million-sample synthetic recipe. Generalizes zero-shot to casual DAVIS video and RoboTAP manipulation scenes — no per-scene optimization required.
Active Illumination for Dynamic 3D Reconstruction
Physical modelling of active illumination from raw sensor measurements can improve scene estimation and avoid errors from derived depth.
2021–now

Time-of-flight and structured-light cameras are typically used as depth sensors: their raw measurements are processed into a per-pixel depth map, and downstream reconstruction methods treat that depth as input. However, depth processing often relies on simplified scene assumptions, which produces noise in low-reflectance regions, flying pixels under multi-path interference, and motion artifacts in fast-moving scenes because each depth estimate requires multiple illumination readings. Further, it is difficult to integrate these raw measurements with other sensor modalities, like colour cameras.

This line of work rethinks reconstruction for heterogeneous multi-shot imaging processes. Built upon a differentiable forward model of how the active illumination produces the raw sensor output for a given scene, these methods optimise a 4D volumetric scene representation (like NeRF or 3DGS) so that rendered measurements match what the sensor captured. This lets us integrate sensor measurements over spacetime in a principled way, including across modalities, to reduce noise, resolve ambiguities in multi-shot sensing, and improve robustness to multi-path interference. Because we model motion over time, we can also estimate fast motion, such as a swinging baseball bat, and resample it into slow motion.
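As a rough illustration of supervising on raw measurements rather than derived depth, the sketch below models a single pixel of an idealised continuous-wave ToF sensor. It is a toy under stated assumptions, not the method of any paper in this thread: it assumes a single-bounce return, no noise, and one common four-sample convention (sensors differ in sign and sample order). The function names, the 30 MHz modulation frequency, and the grid search that stands in for gradient-based optimisation of a 4D scene representation are all hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def render_raw_frames(depth_m, mod_freq_hz, amplitude=1.0, offset=2.0):
    """Toy forward model: four raw correlation samples for one pixel.

    Assumes a single-bounce return and the convention
    c_k = A * cos(phase + k * pi / 2) + B (an assumption; sensors vary).
    """
    phase = 4.0 * math.pi * mod_freq_hz * depth_m / C  # round-trip phase delay
    return [amplitude * math.cos(phase + k * math.pi / 2.0) + offset
            for k in range(4)]


def depth_from_raw_frames(frames, mod_freq_hz):
    """Conventional four-sample decoding: the 'derived depth' step."""
    c0, c1, c2, c3 = frames
    phase = math.atan2(c3 - c1, c0 - c2) % (2.0 * math.pi)
    return C * phase / (4.0 * math.pi * mod_freq_hz)


def raw_loss(depth_guess_m, captured_frames, mod_freq_hz):
    """Squared error between rendered and captured raw samples."""
    rendered = render_raw_frames(depth_guess_m, mod_freq_hz)
    return sum((r - c) ** 2 for r, c in zip(rendered, captured_frames))


if __name__ == "__main__":
    freq = 30e6  # hypothetical modulation frequency (about 5 m unambiguous range)
    captured = render_raw_frames(2.0, freq)

    # Route 1: decode depth first, then hand it to reconstruction.
    print("decoded depth:", depth_from_raw_frames(captured, freq))

    # Route 2: analysis by synthesis. Search for the depth whose rendered
    # raw samples best match the capture (grid search as a stand-in for
    # gradient descent through a differentiable forward model).
    candidates = [i / 100 for i in range(500)]
    best = min(candidates, key=lambda d: raw_loss(d, captured, freq))
    print("best-fit depth:", best)
```

In this noise-free single-pixel case both routes agree. The raw-measurement route becomes interesting once the loss is summed over many pixels, frames, and sensor types that share one scene representation.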

Authors

Benjamin Attal · Anh Duong · Aaron Gokaslan · Zixuan Guo · Changil Kim · Hakyeong Kim · Min H. Kim · Eliot Laidlaw · Runfeng Li · Marc Mapeke · Andreas Meuleman · Matthew O'Toole · Mikhail Okunev · Christian Richardt · Aarrushi Shandilya

Papers in this thread

Advances in Neural Information Processing Systems (NeurIPS), 2021
Established that a 4D scene can be supervised directly by continuous-wave ToF phasor measurements rather than processed depth, with added color cameras, showing low noise, superresolution, and better multi-path handling.
European Conference on Computer Vision (ECCV), 2024
Adds motion vectors that are jointly estimated with geometry. Uses four raw frames (not phasors) captured over time from a continuous-wave ToF sensor to create a coherent dynamic reconstruction. 20× less depth error on dynamic objects than the C-ToF baseline.
Computer Vision and Pattern Recognition (CVPR), 2025
Applies raw ToF supervision to a Gaussian splatting backbone, with two heuristics that stabilise the otherwise-brittle 3DGS optimisation when depth is not directly measured. Comparable quality to neural volumetric baselines while training ~100× faster.

Related papers

International Conference on Computer Vision (ICCV), 2023
Carries the supervision-by-raw-measurement approach from ToF over to structured light, and lets us separate direct and ambient illumination. Recovers higher-fidelity depth on objects than commodity structured light sensors, including for partially-transparent surfaces.
European Conference on Computer Vision (ECCV), 2022
Fuses ToF depth with stereo from a smartphone's optically-stabilised main RGB camera, where the floating lens has unknown pose. Self-calibrates the multi-sensor geometry from a single snapshot, then fuses via a correlation volume.
Controllable Generative Models
How do we efficiently control generative models to produce what we want — preserving identity, 3D structure, style — without sacrificing quality?
2018–2024

A generative model that can sample new content is impressive; one that produces exactly what a user has in mind is useful. Controlling generation requires aligning the model's latent structure with axes a person can articulate — identity, pose, style, lighting, geometry — without sacrificing the photorealism that brought the model to relevance in the first place. There is usually a quality-versus-control tradeoff to manage.

The thread runs from Youssef Mejjati's PhD work on unsupervised attention for image-to-image translation, through compositional controls (object stamps, GaussiGAN's 3D Gaussian primitives from silhouettes alone), into 3DMM-conditioned face generation where Yiwen Huang's PhD now sits. Two recent moves matter: TaxFreeGAN closes the FID gap to unconditional StyleGAN under 3DMM conditioning, and the disentangling-3D work shows that the noise in CLIP's embedding space — not the disentanglement strategy — is what kills quality. R3GAN sits alongside this arc as the architectural reset: a principled relativistic loss that lets the modern GAN drop its bag of tricks.

Authors

Akin Caliskan · Darren Cosker · Aaron Gokaslan · Yiwen Huang · Berkay Kicanaoglu · Hyeongwoo Kim · Kwang In Kim · Atsunobu Kotani · Volodymyr Kuleshov · Youssef A. Mejjati · Isa Milefchik · Christian Richardt · Zejiang Shen · Michael Snower · Stefanie Tellex · Vikas Thamizharasan · Oliver Wang · Yue Wang · Xinjie Yi · Zhiqiu Yu · Qian Zhang

Papers in this thread

Unsupervised Attention-guided Image to Image Translation
Neural Information Processing Systems (NeurIPS), 2018
Jointly trains attention with generators and discriminators so unsupervised image-to-image translation can localize edits to objects without disturbing background or inter-object structure.
European Conference on Computer Vision (ECCV), 2020
Factors handwriting style into separate character-level and writer-level descriptors, letting the model generate new characters in a held-out writer's hand from only a few samples.
CVPR Workshop on AI for Content Creation, 2020
Splits conditional object insertion into a mask generator (shape, given a class and bounding box) and a texture generator (appearance, conditioned on the background), so the inserted object is both diverse in shape and consistent with its surroundings.
BMVC 2021 and CVPR Workshop on AI for Content Creation, 2021
Learns a coarse 3D object representation as a set of self-supervised anisotropic 3D Gaussians from unposed 2D masks alone, then uses it to drive controllable mask and texture synthesis with interactive posing.
Learning Physically-based Face Material and Lighting Decomposition
International Conference on Computational Visual Media, 2022
Estimates per-portrait surface normals, albedo, roughness, and a high-frequency lighting map, and decomposes diffuse and specular reflectance — so a downstream editor can relight a face from a single photograph.
Winter Conference on Applications of Computer Vision (WACV) and AI for Content Creation (AI4CC) @ CVPR 2023, 2024
Formalizes 3DMM-conditioned face generation as a math problem, then applies targeted fixes that close the FID gap to unconditional StyleGAN — so controllability no longer costs visible image quality.
Disentangling 3D from Large Vision-Language Models for Controlled Portrait Generation
2024
Disentangles 3D portrait generation from a frozen CLIP plus a FLAME morphable model, then identifies CLIP's noisy embedding directions as the residual source of entanglement and damps them with a stochastic Jacobian regularizer.
The GAN is Dead; Long Live the GAN! A Modern GAN Baseline
Neural Information Processing Systems (NeurIPS), 2024
A regularized relativistic GAN loss with proven local convergence lets a minimalist StyleGAN2-derived architecture — stripped of the usual stabilization tricks — beat StyleGAN2 on FFHQ, ImageNet, CIFAR, and Stacked MNIST, and compete with diffusion models.
Light Fields — from Display to 4D Algorithms
The light field is a 4D record of a scene's rays — how do we present it to humans, interact with it, and process it computationally?
2012–2021

A light field captures the radiance at every point in space, in every direction — a 4D function that fully describes how light fills a scene. Captured light fields enable refocusing, depth recovery, and parallax view synthesis; displayed light fields offer glasses-free 3D. The challenge is data density: 4D content stresses capture devices, display hardware, and processing pipelines.
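To make the refocusing claim concrete, here is a minimal shift-and-add sketch on a reduced light field with one angular axis and one spatial axis (an epipolar-plane image). It is a generic textbook-style illustration, not code from any paper in this thread; the function names, the integer-pixel shifts, and the synthetic single-point scene are assumptions chosen to keep the example short.

```python
def make_point_epi(n_views, width, x_centre, disparity):
    """Synthetic EPI: one bright point that moves by `disparity` pixels per view."""
    centre_view = (n_views - 1) / 2
    epi = [[0.0] * width for _ in range(n_views)]
    for u in range(n_views):
        x = x_centre + round(disparity * (u - centre_view))
        if 0 <= x < width:
            epi[u][x] = 1.0
    return epi


def refocus(epi, slope):
    """Shift-and-add refocus: shear each view by `slope`, then average views.

    Scene points whose disparity matches `slope` align across views and stay
    sharp; points at other disparities are spread out (blurred).
    """
    n_views, width = len(epi), len(epi[0])
    centre_view = (n_views - 1) / 2
    out = [0.0] * width
    for u, row in enumerate(epi):
        shift = round(slope * (u - centre_view))  # nearest-pixel shift only
        for x in range(width):
            src = x + shift
            if 0 <= src < width:
                out[x] += row[src] / n_views
    return out


if __name__ == "__main__":
    epi = make_point_epi(n_views=5, width=21, x_centre=10, disparity=2)
    print("focused on the point:", max(refocus(epi, slope=2.0)))
    print("focused elsewhere:   ", max(refocus(epi, slope=0.0)))
```

Sweeping `slope` and looking for the sharpest response at each pixel is also the simplest route from a light field to depth, which is why refocusing and depth recovery are usually mentioned together.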

Two sub-arcs sit in this thread. The first (2012–2015) targets light field displays — an Emerging Technologies demo of painting directly into a glasses-free 3D display, content-adaptive lenticular prints that reshape the lenslet array to the captured light field, and a UIST paper that turns that lenslet array into a joint display-and-pen-input surface. The second (2019–2021), led by Numair Khan with Min H. Kim at KAIST, develops dense algorithms over captured 4D content: view-consistent superpixels via epipolar-plane image segmentation, edge-aware bidirectional diffusion for depth, and a differentiable diffusion routine for sparse-to-dense depth from multi-view images.

Authors

Marc Alexa · Simon Heinzle · Stanislav Jakuschevskij · Lucas Kasser · Jan Kautz · Numair Khan · Min H. Kim · Wojciech Matusik · James McCann · Jim McCann · Samuel Muff · Hanspeter Pfister · Henry Stone · Qian Zhang

Papers in this thread

SIGGRAPH Emerging Technologies, 2012
An early SIGGRAPH Emerging Technologies demo of a dual-purpose lenslet array that both displays a light field and senses a 3D light-pen position — the live precursor to the UIST 2015 write-up.
ACM Transactions on Graphics (SIGGRAPH), 2013
Treats the lenslet array as something to optimise rather than fix in advance — given an input light field, solve for lenslet size, shape, and arrangement that trade spatial against angular resolution where it matters. Validated by 3D printing the resulting arrays.
User Interface Software and Technology (UIST), 2015
One lenslet array does double duty — light field output and 5D pen input (3D position plus 2D orientation) at 150 Hz, with millimetre-scale accuracy. The display surface and the input surface are the same surface.
International Conference on Computer Vision (ICCV), 2019
Segments horizontal and vertical EPIs first, then clusters and propagates across all sub-aperture views — so superpixels stay consistent and respect occlusion as the viewpoint shifts, rather than being propagated outward from a single central view.
Computer Vision and Pattern Recognition (CVPR), 2021
Dense depth is obtained by diffusing a sparse set of points whose positions, depths, and weights are differentiably optimised through Gaussian splatting against a multi-view RGB reprojection loss. Scales to the 50k+ points needed for non-trivial scenes.
BMVC, 2021
A pair of BMVC papers that estimate 4D depth from sparse EPI-derived edges and diffuse them outward — the 2021 paper separates depth from texture edges via bidirectional diffusion, and the 2020 paper propagates the central-view depth to every other sub-aperture view in an occlusion-aware way.
Editing Video by Recovering Scene Structure
How do we let users edit captured video meaningfully — by first recovering the scene structure (moving objects, lighting vs. reflectance, cross-frame consistency) that makes plausible modifications possible?
2011–2017

Editing video is harder than editing a photograph: changes to one frame must propagate consistently to every other, and many edits (removing a person, separating lighting from material, stabilising flicker) require understanding the underlying scene rather than just manipulating pixels. The papers in this thread approach editing as inverse reconstruction: decompose video into scene structure first, then edit.

A postdoc-era thread spanning UCL, MPI-Inf, Harvard, and LIRIS-CNRS. The earliest piece (2011, UCL) is the cinemagraphs authoring tool — a moment image isolated from a stabilised clip. Miguel Granados led the video-inpainting work at MPI-Inf (2012) — removing dynamic objects from crowded scenes, and the harder case of background recovery under a free-moving camera. Nicolas Bonneel led the consistency and decomposition line (2014–2017) — interactive intrinsic decomposition, blind temporal consistency stabilising any per-frame filter, and the spatio-temporal extension to camera arrays. The 2016 multicut paper takes a different angle on the same theme: cut the video into the right regions before editing.

Authors

Bjoern Andres · Nicolas Bonneel · Miguel Granados · Oliver Grau · Jan Kautz · Kwang In Kim · Steffen Kirchhoff · Evgeny Levinkov · Sylvain Paris · Fabrizio Pece · Hanspeter Pfister · Kartic Subr · Kalyan Sunkavalli · Deqing Sun · Christian Theobalt · Oliver Wang

Papers in this thread

European Conference on Visual Media Production (CVMP), 2011
An authoring tool that pipelines stabilisation, segmentation, motion selection, and loop detection to produce cinemagraphs — short looping clips where only a chosen region moves.
Computer Graphics Forum (Eurographics), 2012
Object removal from crowded scenes by filling the spatio-temporal hole from other regions of the video where the occluded background was visible, posed as a graph-cut optimisation. Pitched at occlusions harder than previous work had attempted.
European Conference on Computer Vision (ECCV), 2012
Inpaints background revealed by removing dynamic objects from a free-moving-camera video by aligning candidate frames with piecewise planar homographies — sidestepping the full per-frame depth and pose recovery that earlier free-camera methods required.
ACM Transactions on Graphics (SIGGRAPH Asia), 2014
Decomposes video into reflectance and illumination via a hybrid L2-Lp gradient split, fast enough (two orders of magnitude over prior tools) to support interactive refinement and lighting-aware compositing.
ACM Transactions on Graphics (SIGGRAPH Asia), 2015
A gradient-domain post-process that stabilises any per-frame filter against flicker by borrowing temporal regularity from the unprocessed video — agnostic to what the filter actually is. Demonstrated across stylisation, intrinsic decomposition, and depth.
Pacific Graphics 2016 (Short Paper), 2016
Interactive multi-label video segmentation from multi-coloured scribbles, posed as a multicut on a supervoxel graph and solved fast enough to feel responsive. Multiple objects cut at once with consistent spatio-temporal boundaries, rather than chained binary segmentations.
Computer Graphics Forum (Eurographics), 2017
Extends the blind-consistency idea from time to time-and-space across stereo, light field, and wide-baseline rigs, and adds a filter-transfer scheme that runs the expensive filter on a small subset of frames and propagates the effect — an order-of-magnitude saving for camera-array data.
Dongyoung Choi, Jaemin Cho, Woohyun Kang, Hyunho Ha, James Tompkin, Min H. Kim
3D Vision, 2026
Project webpage Supplemental video
Belal Shaheen, Minh-Hieu Nguyen, Bach-Thuan Bui, Shubham, Tim Wu, Michael Fairley, Matthew David Zane, Michael Wu, James Tompkin
MDPI Remote Sensing, 2026
PDF arXiv
Yaoan Gao, Jiamin Xu, James Tompkin, Qi Wang, Zheng Dong, Hujun Bao, Yujun Shen, Huamin Wang, Changqing Zou, Weiwei Xu
SIGGRAPH Asia, 2025
Project webpage PDF Supplemental video
Ji Won Chung, Tongyu Zhou, Ivy Chen, Kevin Hsu, Ryan A. Rossi, Alexa Siu, Shunan Guo, Franck Dernoncourt, James Tompkin, Jeff Huang
2025
PDF arXiv
Yiqing Liang, Abhishek Badki, Hang Su, James Tompkin, Orazio Gallo
Computer Vision and Pattern Recognition (CVPR), 2025
Computer Vision and Pattern Recognition (CVPR), 2025
Xiuchao Wu, Jiamin Xu, Chi Wang, Yifan Peng, Qixing Huang, James Tompkin, Weiwei Xu
SIGGRAPH Asia, 2024
Yiwen Huang, Aaron Gokaslan, Volodymyr Kuleshov, James Tompkin
Neural Information Processing Systems (NeurIPS), 2024
Yiqing Liang, Mikhail Okunev, Mikaela Angelina Uy, Runfeng Li, Leonidas J. Guibas, James Tompkin, Adam Harley
Transactions on Machine Learning Research, 2025
European Conference on Computer Vision (ECCV), 2024
Hojung Kwon, Yuanbo Li, Xiaohan Ye, Praccho Muna-McQuay, Liuren Yin, James Tompkin
Transactions on Visualization and Computer Graphics (IEEE Visualization short paper), 2024
Project webpage PDF Slides Presentation video
Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang, James Tompkin, Min H. Kim
Computer Vision and Pattern Recognition (CVPR), 2024
Yiwen Huang, Akin Caliskan, Berkay Kicanaoglu, James Tompkin, Hyeongwoo Kim
2024
PDF arXiv
Gives insight into why disentangling with CLIP is difficult—it's the prompt noise!
Yiwen Huang, Zhiqiu Yu, Xinjie Yi, Yue Wang, James Tompkin
Winter Conference on Applications of Computer Vision (WACV) and AI for Content Creation (AI4CC) @ CVPR 2023, 2024
Project webpage PDF Slides arXiv
Yiqing Liang, Numair Khan, Zhengqin Li, Thu Nguyen-Phuoc, Douglas Lanman, James Tompkin, Lei Xiao
arXiv (Dec. 2023) + WACV, 2025
Xiuchao Wu, Jiamin Xu, Xin Zhang, Hujun Bao, Qixing Huang, Yujun Shen, James Tompkin, Weiwei Xu
ACM Transactions on Graphics (SIGGRAPH Asia), 2023
International Conference on Computer Vision (ICCV), 2023
International Conference on Computer Vision (ICCV), 2023
Project webpage PDF
International Journal of Computer Vision (IJCV), 2024
Project webpage PDF Slides
On Human-like Biases in CNNs for the Perception of Slant from Texture
Yuanhao Wang, Qian Zhang, Celine Aubuchon, Jovan Kemp, Fulvio Domini, James Tompkin
ACM Transactions on Applied Perception, 2023
Fumeng Yang, Yuxin Ma, Lane Harrison, James Tompkin, David H. Laidlaw
SIGCHI, 2023
Project webpage PDF
Learning Vector Quantized Shape Codes for Amodal Blastomere Instance Segmentation
Won-Dong Jang, Donglai Wei, Xingxuan Zhang, Brian Leahy, Helen Yang, James Tompkin, Dalit Ben-Yosef, Daniel Needleman, Hanspeter Pfister
IEEE International Symposium on Biomedical Imaging (ISBI), 2023
Xiuchao Wu, Jiamin Xu, Zihan Zhu, Hujun Bao, Qixing Huang, James Tompkin, Weiwei Xu
ACM Transactions on Graphics (SIGGRAPH), 2022
Yiheng Xie, Towaki Takikawa, Shunsuke Saito, Or Litany, Shiqin Yan, Numair Khan, Federico Tombari, James Tompkin, Vincent Sitzmann, Srinath Sridhar
Eurographics State of the Art Report + CVPR Tutorial + SIGGRAPH Course, 2022
Project webpage PDF
Andreas Meuleman, Hakyeong Kim, James Tompkin, Min H. Kim
European Conference on Computer Vision (ECCV), 2022
Project webpage PDF Presentation video
Hyun Jin Ku, Hyunho Ha, Joo Ho Lee, Dahyun Kang, James Tompkin, Min H. Kim
International Conference on Computational Photography (ICCP), 2022
Project webpage PDF Supplemental video Presentation video
Jing Qian, Qi Sun, Curtis Wigington, Han L. Han, Tong Sun, Jennifer Healey, James Tompkin, Jeff Huang
SIGCHI, 2022
Project webpage PDF
Donglai Wei, Siddhant Kharbanda, Sarthak Arora, Roshan Roy, Nishant Jain, Akash Palrecha, Tanav Shah, Shray Mathur, Ritik Mathur, Abhijay Kemka, Anirudh Chakravarthy, Zudi Lin, Won-Dong Jang, Yansong Tang, Song Bai, James Tompkin, Philip H.S. Torr, Hanspeter Pfister
Computer Vision and Pattern Recognition (CVPR), 2022
Beatrix-Emőke Fülöp-Balogh, Eleanor Tursman, James Tompkin, Nicolas Bonneel, Julie Digne
Computers and Graphics, 2022
PDF Supplemental video arXiv
Learning Physically-based Face Material and Lighting Decomposition
International Conference on Computational Visual Media, 2022
Also appeared at CVPR 2021 Workshop on AI for Content Creation
Fumeng Yang, James Tompkin, Lane Harrison, David H Laidlaw
Transactions on Visualization and Computer Graphics, 2022
Project webpage PDF
Hosted at the Open Science Framework.
Advances in Neural Information Processing Systems (NeurIPS), 2021
Kwang In Kim, James Tompkin
International Conference on Computer Vision (ICCV), 2021
Project webpage Slides
Computer Vision and Pattern Recognition (CVPR), 2021
BMVC 2021 and CVPR Workshop on AI for Content Creation, 2021
Austin Sumigray, Eliot Laidlaw, James Tompkin, Stefanie Tellex
Human-Robot Interaction (Late Breaking Report), 2021
PDF
Michail Schwab, David Saffo, Nicholas Bond, Shash Sinha, Cody Dunne, Jeff Huang, James Tompkin, Michelle A. Borkin
Transactions on Visualization and Computer Graphics (TVCG), 2021
Project webpage PDF
European Conference on Computer Vision (ECCV), 2020
Atsunobu Kotani, Stefanie Tellex, James Tompkin
European Conference on Computer Vision (ECCV), 2020
BMVC, 2021
Fast 4D depth with accurate occlusion edges across two papers:
Edge-aware Bi-directional Diffusion for Dense Depth Estimation from Light Fields
and
View-consistent 4D Light Field Depth Estimation
International Conference on 3D Vision (3DV), 2020
Christian Richardt, James Tompkin, Gordon Wetzstein
Real VR—Immersive Digital Reality, 2020
PDF
Chapter in the Real VR — Immersive Digital Reality Springer book; DOI.
CVPR Workshop on AI for Content Creation, 2020
Linked PDF is the full 8-page paper; the CVPRW version is 4 pages.
Eleanor Tursman, Marilyn George, Seny Kamara, James Tompkin
CVPR Workshop on Media Forensics, 2020
Michail Schwab, David Saffo, Yixuan Zhang, Shash Sinha, Cristina Nita-Rotaru, James Tompkin, Cody Dunne, Michelle A. Borkin
Transactions on Visualization and Computer Graphics (IEEE Visualization), 2020
Salma A. Magid, Won-Dong Jang, Denis Schapiro, Donglai Wei, James Tompkin, Peter Sorger, Hanspeter Pfister
MICCAI, 2020
International Conference on Computer Vision (ICCV), 2019
This work also produces an occlusion-aware piecewise planar scene reconstruction as a byproduct!
Jing Qian, Jiaju Ma, Xiangyu Li, Benjamin Attal, Haoming Lai, James Tompkin, John Hughes, Jeff Huang
User Interface Software and Technology (UIST), 2019
VRCAI, 2019
PDF Supplemental video
Michail Schwab, Sicheng Hao, Olga Vitek, James Tompkin, Jeff Huang, Michelle A. Borkin
SIGCHI, 2019
Project webpage PDF Supplemental video Presentation video
Michail Schwab, James Tompkin, Jeff Huang, Michelle A. Borkin
Transactions on Visualization and Computer Graphics (IEEE Visualization short paper), 2019
One-line SVG pan/zoom, plus a pan/zoom injecting bookmark for any SVG! The project page hosts docs, jsFiddle, and bl.ocks.org examples.
Eric Rosen, David Whitney, Elizabeth Phillips, Gary Chien, James Tompkin, George Konidaris, Stefanie Tellex
International Journal of Robotics Research, 2019
Youssef A. Mejjati, Christian Richardt, James Tompkin, Darren Cosker, Kwang In Kim
Neural Information Processing Systems (NeurIPS), 2018
European Conference on Computer Vision (ECCV), 2018
Daniel Haehn, James Tompkin, Hanspeter Pfister
Transactions on Visualization and Computer Graphics (IEEE Visualization), 2018
Project webpage PDF
Project page bundles the paper, code, and data.
Daniel Haehn, Verena Kaynig, James Tompkin, Jeff W. Lichtman, Hanspeter Pfister
Computer Vision and Pattern Recognition (CVPR), 2018
Kwang In Kim, Juhyun Park, James Tompkin
Computer Vision and Pattern Recognition (CVPR), 2018
PDF
Alexandra Papoutsaki, Aaron Gokaslan, James Tompkin, Yuze He, Jeff Huang
ACM Symposium on Eye Tracking Research and Applications (ETRA), 2018
Project page bundles paper and code; dataset is hosted separately.
James Tompkin, Kwang In Kim, Hanspeter Pfister, Christian Theobalt
British Machine Vision Conference, 2017
Project webpage PDF
Nicolas Bonneel, James Tompkin, Deqing Sun, Oliver Wang, Kalyan Sunkavalli, Sylvain Paris, Hanspeter Pfister
Computer Graphics Forum (Eurographics), 2017
Project webpage PDF Supplemental video
We could have called it Blind Video Spatio-Temporal Consistency as it follows up Blind Video Temporal Consistency.
International Conference on Computer Vision (ICCV), 2017
PDF
Serena Booth, James Tompkin, Krzysztof Gajos, Jim Waldo, Hanspeter Pfister, Radhika Nagpal
Conference on Human-Robot Interaction (HRI), 2017
Project webpage PDF Supplemental video
Eric Rosen, David Whitney, Elizabeth Phillips, Gary Chien, James Tompkin, George Konidaris, Stefanie Tellex
International Symposium on Robotics Research, 2017
Lezhi Li, James Tompkin, Panagiotis Michalatos, Hanspeter Pfister
IEEE Visualization Workshop on Visual Analytics for Deep Learning, 2017
Project webpage PDF Slides Supplemental video
Daniel Haehn, John Hoffer, Brian Matejek, Adi Suissa-Peleg, Ali K. Al-Awami, Lee Kamentsky, Felix Gonda, Eagon Meng, William Zhang, Richard Schalek, Alyssa Wilson, Toufiq Parag, Johanna Beyer, Verena Kaynig, Thouis R. Jones, James Tompkin, Markus Hadwiger, Jeff W. Lichtman, Hanspeter Pfister
MDPI Informatics—Special Issue on Scalable Interactive Visualization, 2017
Project webpage PDF
Michail Schwab, Hendrik Strobelt, James Tompkin, Colin Fredericks, Connor Huff, Dana Higgins, Anton Strezhnev, Maya Komisarchik, Gary King, Hanspeter Pfister
Transactions on Visualization and Computer Graphics (IEEE Visualization), 2016
Project webpage PDF
Evgeny Levinkov, James Tompkin, Nicolas Bonneel, Steffen Kirchhoff, Bjoern Andres, Hanspeter Pfister
Pacific Graphics 2016 (Short Paper), 2016
PDF
James Tompkin, Samuel Muff, James McCann, Hanspeter Pfister, Jan Kautz, Marc Alexa, Wojciech Matusik
User Interface Software and Technology (UIST), 2015
Project webpage PDF Slides
Also at SIGGRAPH Emerging Technologies 2012: Interactive Light Field Painting
Helge Rhodin, James Tompkin, Kwang In Kim, Edilson de Aguiar, Hanspeter Pfister, Hans-Peter Seidel, Christian Theobalt
ACM Transactions on Graphics (SIGGRAPH Asia), 2015
Project webpage PDF Slides Supplemental video
Builds upon project: Direct Motion Mapping
Nicolas Bonneel, James Tompkin, Kalyan Sunkavalli, Deqing Sun, Sylvain Paris, Hanspeter Pfister
ACM Transactions on Graphics (SIGGRAPH Asia), 2015
Gaurav Bharaj, David I.W. Levin, James Tompkin, Yun Fei, Hanspeter Pfister, Wojciech Matusik, Changxi Zheng
ACM Transactions on Graphics (SIGGRAPH Asia), 2015
Project webpage PDF Supplemental video
Gaurav Bharaj, Stelian Coros, Bernhard Thomaszewski, James Tompkin, Bernd Bickel, Hanspeter Pfister
ACM Symposium on Computer Animation (SCA), 2015
Project webpage PDF Supplemental video
Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt
Computer Vision and Pattern Recognition (CVPR), 2015
Project webpage PDF
Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt
International Conference on Computer Vision (ICCV), 2015
Project webpage PDF
Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt
Computer Vision and Pattern Recognition (CVPR), 2015
Project webpage PDF
Nicolas Bonneel, Kalyan Sunkavalli, James Tompkin, Deqing Sun, Sylvain Paris, Hanspeter Pfister
ACM Transactions on Graphics (SIGGRAPH Asia), 2014
Project webpage PDF Slides Supplemental video
Younghee Kwon, Kwang In Kim, James Tompkin, Jin Hyung Kim, Christian Theobalt
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2014
Project webpage PDF
Fabrizio Pece, James Tompkin, Hanspeter Pfister, Jan Kautz, Christian Theobalt
European Conference on Visual Media Production (CVMP), 2014
PDF Supplemental video
Related project: Vidicontexts
Helge Rhodin, James Tompkin, Kwang In Kim, Kiran Varanasi, Hans-Peter Seidel, Christian Theobalt
Computer Graphics Forum (Eurographics), 2014
Project webpage PDF Slides Supplemental video
Related project: Generalized Wave Gestures
Miguel Granados, Kwang In Kim, James Tompkin, Christian Theobalt
ACM Transactions on Graphics (SIGGRAPH Asia), 2013
Project webpage PDF Supplemental video
Kwang In Kim, James Tompkin, Christian Theobalt
International Conference on Computer Vision (ICCV), 2013
James Tompkin, Fabrizio Pece, Rajvi Shah, Shahram Izadi, Jan Kautz, Christian Theobalt
User Interface Software and Technology (UIST), 2013
Project webpage PDF Slides Supplemental video
Related study into display device effect: Device Effect on Panoramic Video+Context Tasks
James Tompkin, Min H. Kim, Kwang In Kim, Jan Kautz, Christian Theobalt
ACM Transactions on Applied Perception (TAP), 2013
Project webpage PDF Slides Supplemental video
James Tompkin, Simon Heinzle, Jan Kautz, Wojciech Matusik
ACM Transactions on Graphics (SIGGRAPH), 2013
Project webpage PDF Slides Supplemental video
Printing light field displays with varying spatio-angular resolution.
EngD Thesis @ University College London, 2013
PDF
Philippe Levieux, James Tompkin, Jan Kautz
European Conference on Visual Media Production (CVMP), 2012
Project webpage PDF Supplemental video
Alt title: Light Field Video Textures
Miguel Granados, Kwang In Kim, James Tompkin, Jan Kautz, Christian Theobalt
European Conference on Computer Vision (ECCV), 2012
Project webpage PDF Supplemental video
Project page includes the dataset.
Kwang In Kim, James Tompkin, Martin Theobald, Jan Kautz, Christian Theobalt
European Conference on Computer Vision (ECCV), 2012
Useful for building correspondence graphs for image matching, e.g., in search or large-scale reconstruction. Supplemental material.
James Tompkin, Kwang In Kim, Jan Kautz, Christian Theobalt
ACM Transactions on Graphics (SIGGRAPH), 2012
Project webpage PDF Slides Supplemental video
James Tompkin, Samuel Muff, Stanislav Jakuschevskij, Jim McCann, Jan Kautz, Marc Alexa, Wojciech Matusik
SIGGRAPH Emerging Technologies, 2012
Project webpage PDF Slides Supplemental video
Early demo of our later UIST 2015 publication Joint 5D Pen Input for Light Field Displays. Demo project page also at MIT CDFG.
Miguel Granados, James Tompkin, Kwang In Kim, Oliver Grau, Jan Kautz, Christian Theobalt
Computer Graphics Forum (Eurographics), 2012
Project webpage PDF Supplemental video
Project page includes the dataset.
Henrik Lieng, James Tompkin, Jan Kautz
Computer Graphics Forum (Eurographics), 2012
Project page includes code and data.
Feng Xu, Yebin Liu, Carsten Stoll, James Tompkin, Gaurav Bharaj, Qionghai Dai, Hans-Peter Seidel, Jan Kautz, Christian Theobalt
ACM Transactions on Graphics (SIGGRAPH), 2011
Project webpage PDF
James Tompkin, Fabrizio Pece, Kartic Subr, Jan Kautz
European Conference on Visual Media Production (CVMP), 2011
Project webpage PDF Supplemental video
Beste F. Yuksel, Michael Donnerer, James Tompkin, Anthony Steed
International Brain-Computer Interface Conference (BCI), 2011
PDF
Beste F. Yuksel, Michael Donnerer, James Tompkin, Anthony Steed
ACM Transactions on Computer-Human Interaction (SIGCHI), 2010
Project webpage PDF Supplemental video
Jennifer G. Sheridan, James Tompkin, Abel Maciel, George Roussos
British HCI Group Annual Conference on People and Computers (BCS-HCI), 2009
Project webpage PDF Supplemental video
Webpage contains many projects and events! Schematics and WebGL model viewer!
MSci Dissertation @ King's College, London, 2006

Workshops and Courses

AI for Content Creation
CVPR 2019–2025 Workshop

Physics-inspired 3D Vision and Imaging
CVPR 2025 Workshop

Neural Fields Beyond Conventional Cameras
ECCV 2024 Workshop

Neural Fields in Visual Computing
CVPR 2022 Tutorial + SIGGRAPH 2023 Course

New England Computer Vision Symposium
Brown 2019

Video for Virtual Reality
SIGGRAPH 2017 Course

User-centric Computational Videography
SIGGRAPH 2015 Course

University Courses

CSCI 1430—Introduction to Computer Vision
Brown University
2016–now.

CSCI 1290—Computational Photography
Brown University
2018–now.

CSCI 2951-I—Computer Vision for Graphics and Interaction
Brown University
2016–now.

CSCI 2000—Computer Science Research Methods or How to be a CS PhD Student
Brown University
2021 Fall.

CSCI 1950-N—2D Game Engines
Brown University
2017–now. Mentoring student-led course.

GISP 0002—NFTs, Blockchain, and Art, led by Ally Zhu and Nikolas Lazar
Brown University
2022 Spring.

CS171—Visualization
Harvard University
2016 Spring, 2015 Spring.

Computer Vision for Computer Graphics
Max-Planck-Institute for Informatics
2013 Summer.

Doctoral Students

2025–
Yiwen (Nick) Huang
2024–
2022–
2021–
2021–2025
Onto: Luma AI Research Scientist
2018–
2016–2021
Onto: Meta Reality Labs Research Scientist

Masters Students

2023–2025
Onto: Rice PhD
2021–2024
Onto: Meta Reality Labs
Yiwen (Nick) Huang
2021–2023
Onto: Brown PhD
2020–2022
Onto: UMass Amherst PhD
2019–2021
Onto: UToronto PhD
2016–2020
Onto: US Congressional Innovation Scholar
2018–2020
Onto: Stanford PhD
2018–2020
Onto: Facebook AI on VR/AR
2018–2020
Onto: Google
2019–2020
Onto: Allen Institute for AI Residency; MIT PhD
2017–2019
Onto: CMU PhD
2017–2019
Onto: Facebook AI Residency, Cornell PhD

Undergraduate Students

Anika Bahl
2023–2024
Onto:
Troy Conklin
2022–2024
Onto: General Dynamics
2021–2023
Onto: CMU Research Masters | UW PhD
2022–2023
Onto: Harvard Data Science Masters
2021–2023
Onto: Harvard Computational Science and Engineering Masters
2020–2022
Onto: Common Sense Machines
2017–2020
Onto: UC Berkeley PhD
2019–2021
Onto: Common Sense Machines
Henry Stone
2018–2020
Lucas Kasser
2018–2019
2017–2018
Onto: Allen Institute for AI Residency; UWashington PhD

Extended Family PhDs

2022–2025
Onto:
2021–2025
Onto: Alibaba
2018–2021
Onto: Synthesia
2014–2020
Onto: Google
2014–2019
Onto: UMass Boston Faculty
2014–2017
Onto: RealityDefender

Biography (2024)

James Tompkin is an Associate Professor of Computer Science at Brown University. His research at the intersection of computer vision, computer graphics, and human-computer interaction helps develop new visual computing tools and experiences from cameras. For this, his lab creates techniques for 3D scene reconstruction from multi-camera systems and for dynamic scenes. His doctoral work at University College London on large-scale video processing and exploration techniques led to creative exhibition work in the Museum of the Moving Image in New York City. Postdoctoral work at Max-Planck-Institute for Informatics and Harvard University helped create new methods to edit content within images and videos. Recent research has developed new techniques for low-level reconstruction of dynamic scenes, view synthesis for VR, and AI content editing and generation.

Academic lineage

Please find my research summary video from 2015—our newer lab work is on the 'Research' tab.



SIGGRAPH 50th—2023

I supported SIGGRAPH's 50th conference in 2023 as the chair of the Posters program, which was coincidentally running its 20th iteration too. Here's a meta-poster about the program's history and its outstanding contributors (low-res PNG).


Faculty and Tenure Application Materials

To share my experience, here is the material I sent to Brown to apply for a tenure-track assistant professor position in Dec. 2015.
CV (Sept. 2016)
Research Statement
Teaching Statement

Here is the material I used for my tenure case at Brown in Dec. 2023.
CV
Research Statement
Teaching Statement

Exhibitions

I supported the Discover program and club at Brown/RISD to pair arts and science students and put on an exhibition (2017–2021). I have also tried to contribute myself.

Bad Art @ Brown, 2018
with Aaron Gokaslan and Vivek Ramanujan

Rear Window Augmented
with Jeff Desom

Museum of the Moving Image
New York City
7th Nov. 2015 to 10th April 2016

ISCP
New York City
7–9th November 2014

Festival Imaginales
Epinal, France
26–29th May 2014

Luxembourg Film Festival
28th February to 9th March 2014