Real-Time Position-Aware View Synthesis from Single-View Input

Manu Gond, Emin Zerman, Sebastian Knorr, Mårten Sjöström
Mid Sweden University, Sweden · Technical University of Berlin, Germany · HTW Berlin - University of Applied Sciences, Germany

PVSNet Generates Novel Views from a Single Image

Abstract

We present a lightweight, position-aware network for real-time novel view synthesis from a single input image. Unlike existing MPI- or image-based rendering methods, which rely on an explicit warping operator, we query the final views directly from the network itself. Inference runs in real time, making the network suitable for fine-tuning on domain-specific applications with live feeds.
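
To make the mechanism concrete, here is a minimal PyTorch sketch of a network that conditions on the source image and an embedding of the target camera position and decodes the requested view directly, with no warping operator. Every module name, layer size, and the 6-dimensional pose input are illustrative assumptions, not the actual PVSNet architecture.

import torch
import torch.nn as nn

class PositionAwareViewNet(nn.Module):
    # Hypothetical stand-in for a position-aware view-synthesis network.
    def __init__(self, pose_dim: int = 6, feat_ch: int = 64):
        super().__init__()
        # Image encoder: source RGB -> feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        # Pose branch: target camera position -> per-channel modulation.
        self.pose_mlp = nn.Sequential(
            nn.Linear(pose_dim, feat_ch), nn.ReLU(),
            nn.Linear(feat_ch, feat_ch),
        )
        # Decoder: modulated features -> novel-view RGB, read out directly.
        self.decoder = nn.Sequential(
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, image: torch.Tensor, pose: torch.Tensor) -> torch.Tensor:
        feats = self.encoder(image)                  # (B, C, H, W)
        mod = self.pose_mlp(pose)[:, :, None, None]  # (B, C, 1, 1)
        return self.decoder(feats * mod)             # queried novel view

# Usage: one source image plus one target position yields one novel view.
net = PositionAwareViewNet()
view = net(torch.rand(1, 3, 128, 128), torch.rand(1, 6))
print(view.shape)  # torch.Size([1, 3, 128, 128])

In this sketch the pose enters as a per-channel feature modulation rather than a geometric warp, so every requested view costs exactly one forward pass.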

Performance on in-the-wild images without fine-tuning

Light Field Reconstruction Example 1 (Flowers Dataset): Ours vs. Ground Truth

Light Field Reconstruction Example 2 (Stanford Dataset): Ours vs. Ground Truth

Results

Blender Dataset: Different View Synthesis Methods Against PVSNet

Visual comparison against TMPI, AdaMPI, and SinMPI on the Blender dataset.

Ours vs. TMPI
Ours vs. AdaMPI
Ours vs. SinMPI

COCO Dataset: Different View Synthesis Methods Against PVSNet

Visual comparison against SinMPI, TMPI, and AdaMPI on the COCO dataset.

Ours vs. TMPI
Ours vs. AdaMPI
Ours vs. SinMPI

Ablation Study: Different Positional Embeddings

The example below shows novel view synthesis (NVS) results obtained with different positional embedding methods.
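
One common choice in such an ablation is a sinusoidal (Fourier-feature) embedding of the target camera position. The sketch below shows how such an embedding can be computed; the 6-DoF pose input and the number of frequency bands are assumptions for illustration, and the exact variants compared in the ablation may differ.

import torch

def fourier_embed(pose: torch.Tensor, num_bands: int = 6) -> torch.Tensor:
    # Map each coordinate p to [sin(2^k * pi * p), cos(2^k * pi * p)] for
    # k = 0..num_bands-1, the standard Fourier-feature positional embedding.
    freqs = 2.0 ** torch.arange(num_bands) * torch.pi  # (num_bands,)
    angles = pose[..., None] * freqs                   # (..., D, num_bands)
    emb = torch.cat([angles.sin(), angles.cos()], dim=-1)
    return emb.flatten(start_dim=-2)                   # (..., D * 2 * num_bands)

pose = torch.rand(1, 6)            # hypothetical 6-DoF camera position
print(fourier_embed(pose).shape)   # torch.Size([1, 72])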


Supplementary Video



Related Links

There is a lot of excellent work that appeared around the same time as ours.

NViST: In the Wild New View Synthesis from a Single Image with Transformers introduces an embedding scheme similar to ours.

Frames per Second (FPS) on RTX 2070 Super

We compare frame rates at different resolutions against other methods when rendering end-to-end.
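
As a rough sketch of how such end-to-end frame rates can be measured, the snippet below times repeated forward passes at several resolutions. The stand-in model and the chosen resolutions are placeholders, and the numbers it prints depend entirely on the hardware; they are not the paper's reported results.

import time
import torch
import torch.nn as nn

def measure_fps(net: nn.Module, height: int, width: int, iters: int = 100) -> float:
    # Time end-to-end forward passes and return frames per second.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    net = net.to(device).eval()
    image = torch.rand(1, 3, height, width, device=device)
    with torch.no_grad():
        for _ in range(10):              # warm-up before timing
            net(image)
        if device == "cuda":
            torch.cuda.synchronize()     # flush queued GPU work
        start = time.perf_counter()
        for _ in range(iters):
            net(image)
        if device == "cuda":
            torch.cuda.synchronize()
    return iters / (time.perf_counter() - start)

# Stand-in model: swap in the actual view-synthesis network to benchmark it.
toy = nn.Conv2d(3, 3, kernel_size=3, padding=1)
for h, w in [(256, 256), (512, 512), (1024, 1024)]:
    print(f"{h}x{w}: {measure_fps(toy, h, w):.1f} FPS")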

BibTeX

@article{gond2024real,
  title={Real-Time Position-Aware View Synthesis from Single-View Input},
  author={Gond, Manu and Zerman, Emin and Knorr, Sebastian and Sj{\"o}str{\"o}m, M{\aa}rten},
  journal={arXiv preprint arXiv:2412.14005},
  year={2024}
}