Recent progress in single-image 3D generation highlights the importance of multi-view coherence, leveraging 3D priors from large-scale diffusion models pretrained on Internet-scale images. However, novel-view diversity remains underexplored, owing to the inherent ambiguity of lifting a 2D image into 3D content, where many plausible shapes can explain the same input. Here, we aim to close this gap by addressing consistency and diversity simultaneously.
HarmonyView generates realistic 3D content using just a single image. It excels at maintaining visual and geometric consistency across generated views while enhancing the diversity of novel views, even in complex scenes.
HarmonyView synergistically guides the synchronization of noisy multi-views, facilitating geometric coherence among the clean multi-views. As a result, HarmonyView generates diverse instances from different random seeds.
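The idea of balancing a consistency-oriented guidance term against a diversity-oriented one during diffusion sampling can be sketched as below. This is a minimal, hypothetical illustration: the function names, the toy "denoiser" outputs, and the weights `w1`/`w2` are assumptions for exposition, not the paper's exact decomposition.

```python
import numpy as np

def harmonized_guidance(eps_uncond, eps_cond_img, eps_cond_view, w1=2.0, w2=1.0):
    """Hypothetical combination of two guidance signals: one pulling the
    sample toward the input image (consistency), one allowing variation
    across views (diversity). Weights are illustrative only."""
    return (eps_uncond
            + w1 * (eps_cond_img - eps_uncond)      # consistency-oriented term
            + w2 * (eps_cond_view - eps_cond_img))  # diversity-oriented term

def sample_views(seed, num_views=4, dim=8, steps=5):
    """Toy sampler: different seeds start from different noisy multi-view
    latents, so each run yields a distinct instance."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((num_views, dim))  # noisy multi-view latents
    for _ in range(steps):
        # Stand-in predictions; a real model would output noise estimates here.
        eps_u = 0.1 * x
        eps_i = 0.1 * x - 0.05
        eps_v = 0.1 * x + 0.02 * rng.standard_normal(x.shape)
        x = x - harmonized_guidance(eps_u, eps_i, eps_v)
    return x

a = sample_views(seed=0)
b = sample_views(seed=1)
print(np.allclose(a, b))  # different seeds give different instances
```

The key design point mirrored here is that the two guidance terms are applied jointly at every denoising step, rather than trading one objective off against the other after sampling.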
HarmonyView outperforms state-of-the-art methods across all metrics in both the novel-view synthesis and 3D reconstruction tasks. Notably, it achieves the best results in 3D reconstruction by a significant margin.
@misc{woo2023harmonyview,
title={HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D},
author={Sangmin Woo and Byeongjun Park and Hyojun Go and Jin-Young Kim and Changick Kim},
year={2023},
eprint={2312.15980},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
This website is adapted from Nerfies and LLaVA, licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.