We present Bayesian Diffusion Models (BDM), a prediction algorithm that performs effective Bayesian inference by tightly coupling top-down (prior) information with the bottom-up (data-driven) procedure via joint diffusion processes. We show the effectiveness of BDM on the 3D shape reconstruction task. Compared to prototypical deep learning approaches trained on paired (supervised) data–label datasets (e.g., image and point cloud), BDM brings in rich prior information from standalone labels (e.g., point clouds) to improve bottom-up 3D reconstruction. As opposed to standard Bayesian frameworks, where an explicit prior and likelihood are required for inference, BDM performs seamless information fusion via coupled diffusion processes with learned gradient computation networks. The strength of BDM lies in its ability to engage in active and effective information exchange and fusion between the top-down and bottom-up processes, each of which is itself a diffusion process. We demonstrate state-of-the-art results on both synthetic and real-world benchmarks for 3D shape reconstruction.
Inference: The two diffusion models take turns running their denoising steps, and their intermediate results are fused.
Blue Step: Reconstructive Denoising Step P(y_{t-1} | y_t, x)
The standard diffusion-based 3D reconstruction model.
Given the point cloud y_t at timestep t and the image x, it predicts the previous-step result y_{t-1}.
Red Step: Generative Denoising Step P(y_{t-1} | y_t)
The diffusion-based 3D generative (prior) model.
Given the point cloud y_t at timestep t, it predicts the previous-step result y_{t-1}.
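As a concrete illustration of this alternation, below is a minimal Python sketch of the reverse process. The `denoise_step` methods, the `fuse` callable, and the `fuse_every` schedule are hypothetical placeholders for illustration, not the released implementation's API; the exact scheduling of blue and red steps follows the paper and code.

```python
def bdm_inference(recon_model, gen_model, fuse, x, y_T, num_steps, fuse_every=8):
    """Sketch of the alternating reverse process, assuming two pretrained
    denoisers with hypothetical interfaces:

      recon_model.denoise_step(y_t, t, x) -> y_{t-1}   # blue: P(y_{t-1} | y_t, x)
      gen_model.denoise_step(y_t, t)      -> y_{t-1}   # red:  P(y_{t-1} | y_t)
      fuse(y_recon, y_prior)              -> fused point cloud (merge or blend)

    `fuse_every` is an illustrative schedule for when the prior step and the
    fusion are applied, not the paper's exact setting.
    """
    y_t = y_T  # start from Gaussian noise, shape (B, N, 3)
    for t in reversed(range(1, num_steps + 1)):
        y_recon = recon_model.denoise_step(y_t, t, x)  # blue step (reconstruction)
        if t % fuse_every == 0:
            y_prior = gen_model.denoise_step(y_t, t)   # red step (generative prior)
            y_t = fuse(y_recon, y_prior)               # exchange / fuse information
        else:
            y_t = y_recon
    return y_t
```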
We implement our fusion module in two ways: merging and blending. In BDM-Merging, we fine-tune the decoder of the reconstruction model on the same data used to train the reconstruction model. In BDM-Blending, we randomly select points from the two intermediate point clouds and combine them into a new point cloud.
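The blending variant can be sketched in a few lines of PyTorch; the function name and the mixing ratio `p_recon` are illustrative assumptions, not values from the paper.

```python
import torch

def blend_point_clouds(y_recon, y_prior, p_recon=0.5):
    """BDM-Blending style fusion (sketch): for each of the N output points,
    randomly keep the reconstruction model's point or the prior model's point.
    `p_recon` is an assumed mixing ratio, not a value taken from the paper.

    y_recon, y_prior: tensors of shape (B, N, 3) at the same timestep.
    """
    mask = torch.rand(y_recon.shape[:-1], device=y_recon.device) < p_recon  # (B, N)
    return torch.where(mask.unsqueeze(-1), y_recon, y_prior)                # (B, N, 3)
```

Unlike merging, which requires fine-tuning the reconstruction decoder as described above, blending needs no additional training.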
Experiments
I. Experiments on a Synthetic Dataset: ShapeNet-R2N2
We train both our generation model and reconstruction model using ShapeNet-R2N2.
II. Experiments on a Real-world Dataset: Pix3D
To demonstrate the generalizability of the prior model, we run experiments in a different setting: the generation model is trained on ShapeNet-R2N2, while the reconstruction model is trained on Pix3D, a real-world dataset. We evaluate on three categories: chair, table, and sofa.
Citation
@misc{xu2024bayesian,
  title={Bayesian Diffusion Models for 3D Shape Reconstruction},
  author={Haiyang Xu and Yu Lei and Zeyuan Chen and Xiang Zhang and Yue Zhao and Yilin Wang and Zhuowen Tu},
  year={2024},
  eprint={2403.06973},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
Paper
Bayesian Diffusion Models for 3D Shape Reconstruction