We present Bayesian Diffusion Models (BDM), a prediction algorithm that performs effective Bayesian inference by tightly coupling top-down (prior) information with the bottom-up (data-driven) procedure via joint diffusion processes. We show the effectiveness of BDM on the 3D shape reconstruction task. Compared to prototypical deep learning approaches trained on paired (supervised) data–label datasets (e.g., image and point cloud), BDM brings in rich prior information from standalone labels (e.g., point clouds) to improve bottom-up 3D reconstruction. As opposed to standard Bayesian frameworks, where an explicit prior and likelihood are required for inference, BDM performs seamless information fusion via coupled diffusion processes with learned gradient computation networks. The strength of BDM lies in its ability to engage in active and effective information exchange and fusion between the top-down and bottom-up processes, each of which is itself a diffusion process. We demonstrate state-of-the-art results on both synthetic and real-world benchmarks for 3D shape reconstruction.
Inference: The two diffusion models take turns running their denoising steps, and their intermediate results are fused.
Blue Step: Reconstructive Denoising Step P(y_{t-1} | y_t, x)
The standard diffusion-based 3D reconstruction model.
Given the point cloud y_t at timestep t and the image x, it predicts the previous-step result y_{t-1}.
Red Step: Generative Denoising Step P(y_{t-1} | y_t)
The diffusion-based 3D generative (prior) model.
Given the point cloud y_t at timestep t, it predicts the previous-step result y_{t-1}.
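As a concrete illustration of this alternation, below is a minimal Python sketch of the reverse process. The `denoise_step` methods, the `fuse` callable, and the `fuse_every` schedule are hypothetical placeholders for illustration, not the released implementation's API; the exact scheduling of blue and red steps follows the paper and code.

```python
def bdm_inference(recon_model, gen_model, fuse, x, y_T, num_steps, fuse_every=8):
    """Sketch of the alternating reverse process, assuming two pretrained
    denoisers with hypothetical interfaces:

      recon_model.denoise_step(y_t, t, x) -> y_{t-1}   # blue: P(y_{t-1} | y_t, x)
      gen_model.denoise_step(y_t, t)      -> y_{t-1}   # red:  P(y_{t-1} | y_t)
      fuse(y_recon, y_prior)              -> fused point cloud (merge or blend)

    `fuse_every` is an illustrative schedule for when the prior step and the
    fusion are applied, not the paper's exact setting.
    """
    y_t = y_T  # start from Gaussian noise, shape (B, N, 3)
    for t in reversed(range(1, num_steps + 1)):
        y_recon = recon_model.denoise_step(y_t, t, x)  # blue step (reconstruction)
        if t % fuse_every == 0:
            y_prior = gen_model.denoise_step(y_t, t)   # red step (generative prior)
            y_t = fuse(y_recon, y_prior)               # exchange / fuse information
        else:
            y_t = y_recon
    return y_t
```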
We implement our fusion module in two ways: merging and blending. In BDM-Merging, we fine-tune the decoder of the reconstruction model on the same data used to train the reconstruction model. In BDM-Blending, we randomly select points from the two intermediate point clouds and combine them into a new point cloud.
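The blending variant can be sketched in a few lines of PyTorch; the function name and the mixing ratio `p_recon` are illustrative assumptions, not values from the paper.

```python
import torch

def blend_point_clouds(y_recon, y_prior, p_recon=0.5):
    """BDM-Blending style fusion (sketch): for each of the N output points,
    randomly keep the reconstruction model's point or the prior model's point.
    `p_recon` is an assumed mixing ratio, not a value taken from the paper.

    y_recon, y_prior: tensors of shape (B, N, 3) at the same timestep.
    """
    mask = torch.rand(y_recon.shape[:-1], device=y_recon.device) < p_recon  # (B, N)
    return torch.where(mask.unsqueeze(-1), y_recon, y_prior)                # (B, N, 3)
```

Unlike merging, which requires fine-tuning the reconstruction decoder as described above, blending needs no additional training.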
Experiments
I. Experiments on a Synthetic Dataset: ShapeNet-R2N2
We train both our generation model and reconstruction model using ShapeNet-R2N2.
II. Experiments on a Real-world Dataset: Pix3D
To demonstrate the generalizability of the prior model, we run experiments in a different setting: the generation model is trained on ShapeNet-R2N2, while the reconstruction model is trained on Pix3D, a real-world dataset. We evaluate on three categories: chair, table, and sofa.
Citation
@misc{xu2024bayesian,
  title={Bayesian Diffusion Models for 3D Shape Reconstruction},
  author={Haiyang Xu and Yu Lei and Zeyuan Chen and Xiang Zhang and Yue Zhao and Yilin Wang and Zhuowen Tu},
  year={2024},
  eprint={2403.06973},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
Paper
Bayesian Diffusion Models for 3D Shape Reconstruction