DecentNeRFs

Abstract

Neural radiance fields (NeRFs) show potential for transforming images captured worldwide into immersive 3D visual experiences. However, most of this captured visual data remains siloed in our camera rolls because the images contain personal details. Even if the images were made public, learning 3D representations of the billions of scenes captured daily in a centralized manner would be computationally intractable. Our approach, DecentNeRF, is the first attempt at decentralized, crowd-sourced NeRFs, and it requires ∼10⁴× less server compute per scene than a centralized approach. Instead of sending raw data, users send a 3D representation, which distributes the high computation cost of training centralized NeRFs across the users. DecentNeRF learns photorealistic scene representations by decomposing users' 3D views into personal and global NeRFs and applying a novel, optimally weighted aggregation to only the latter. We validate that our approach learns NeRFs with photorealism and minimal server computation cost on structured synthetic and real-world photo tourism datasets. We further analyze how secure aggregation of global NeRFs in DecentNeRF minimizes the undesired reconstruction of personal content by the server.

Approach

Personal and global MLPs are trained on user devices to separate personal and global content in local images. After each training round, the server federates (aggregates) the users' global MLPs. The contribution weight of each user is learned to improve the visual fidelity of the global content. Additionally, a secure multi-party computation (MPC) protocol helps prevent the server from accessing individual users' global MLPs.
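The weighted federation step can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `aggregate_global_mlps`, the parameter-dictionary layout, and the use of a softmax over per-user scores are assumptions made for illustration, and the sketch omits both the training that would learn those scores and the secure MPC protocol.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1D array of scores."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()


def aggregate_global_mlps(user_params, logits):
    """Weighted average of users' global-MLP parameters.

    user_params: list of dicts mapping a parameter name to an ndarray
                 (hypothetical layout, one dict per user).
    logits:      one score per user; in a real system these would be
                 learned, here they are simply given (assumption).
    """
    weights = softmax(np.asarray(logits, dtype=np.float64))
    names = user_params[0].keys()
    return {
        name: sum(w * p[name] for w, p in zip(weights, user_params))
        for name in names
    }


# Toy example: three users, each holding a tiny "global MLP".
rng = np.random.default_rng(0)
users = [
    {"W1": rng.normal(size=(4, 3)), "b1": rng.normal(size=3)}
    for _ in range(3)
]

# Equal scores reduce to plain federated averaging.
merged_equal = aggregate_global_mlps(users, logits=[0.0, 0.0, 0.0])

# A dominant score makes the aggregate follow that user's parameters.
merged_skewed = aggregate_global_mlps(users, logits=[50.0, 0.0, 0.0])
```

With equal scores the sketch reduces to a uniform average of the global MLPs, and unequal scores shift the aggregate toward the users judged more useful for the global content.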

Comparison with existing methods

DecentNeRF can learn photorealistic renderings of scenes with ∼10⁴× less server compute than a centralized approach. The reconstruction quality of the global content is on par with a centralized baseline such as NeRF-W, and significantly better than a naive decentralized approach such as Federated NeRF. Notably, compared with other approaches, our reconstructions do not exhibit blobs or artifacts in places where personal content was originally present in the source images.

Separation of Personal and Global Content

DecentNeRF can separate global and personal content effectively. Initially, we observe that personal content is present in the users' global MLPs. However, over multiple rounds of federation, our two-MLP approach gradually shifts the personal content onto the users' personal MLPs. This separation process results in a significant improvement in reconstruction quality at the server.
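To make the two-MLP idea concrete, the sketch below composites a personal and a global field along a single ray. The page does not give the rendering equations, so this borrows the static/transient compositing used by NeRF-W as an assumption; the function name `render_ray` and all inputs are hypothetical, with the densities and colors standing in for the outputs of the two MLPs.

```python
import numpy as np


def render_ray(sigma_g, rgb_g, sigma_p, rgb_p, deltas, include_personal=True):
    """Composite global and personal samples along one ray.

    sigma_g, sigma_p: (N,) densities from the global / personal field.
    rgb_g, rgb_p:     (N, 3) colors from the global / personal field.
    deltas:           (N,) distances between consecutive samples.
    include_personal: set to False to render the global content only.
    """
    if not include_personal:
        sigma_p = np.zeros_like(sigma_p)

    # Both fields attenuate light, so transmittance uses the summed density.
    optical_depth = (sigma_g + sigma_p) * deltas
    accumulated = np.concatenate([[0.0], np.cumsum(optical_depth)[:-1]])
    transmittance = np.exp(-accumulated)

    # Each field contributes its own color with its own opacity.
    alpha_g = 1.0 - np.exp(-sigma_g * deltas)
    alpha_p = 1.0 - np.exp(-sigma_p * deltas)
    contrib = alpha_g[:, None] * rgb_g + alpha_p[:, None] * rgb_p
    return (transmittance[:, None] * contrib).sum(axis=0)


# Toy ray: a red personal occluder in front of a blue global surface.
sigma_g = np.array([0.0, 1e3])
sigma_p = np.array([1e3, 0.0])
rgb_g = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
rgb_p = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
deltas = np.array([1.0, 1.0])

full_view = render_ray(sigma_g, rgb_g, sigma_p, rgb_p, deltas)
global_view = render_ray(sigma_g, rgb_g, sigma_p, rgb_p, deltas,
                         include_personal=False)
```

In this toy setup the full render shows the occluder, while the global-only render shows the surface behind it. That is the behavior the separation aims for once personal content has moved out of the global MLP.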

Cite

          @article{tasneem2024decentnerfs,
            title={DecentNeRFs: Decentralized Neural Radiance Fields from Crowdsourced Images},
            author={Tasneem, Zaid and Dave, Akshat and Singh, Abhishek and Tiwary, Kushagra and Vepakomma, Praneeth and Veeraraghavan, Ashok and Raskar, Ramesh},
            journal={arXiv preprint arXiv:2403.13199},
            year={2024}
          }

ECCV 2024

DecentNeRFs: Decentralized Neural Radiance Fields from Crowdsourced Images

