FlexiDreamer: Single Image-to-3D Generation with FlexiCubes

02/04/2024 3 min


Episode Synopsis

3D content generation from text prompts or single images has recently made remarkable progress in both quality and speed. One of the dominant paradigms generates consistent multi-view images and then performs a sparse-view reconstruction. However, because it is difficult to directly deform a mesh representation toward the target topology, most methods learn an implicit representation (such as NeRF) during the sparse-view reconstruction and obtain the target mesh through a post-processing extraction step. Although an implicit representation can effectively model rich 3D information, training it typically takes a long time to converge. In addition, extracting the mesh from the implicit field after training leads to undesirable visual artifacts. In this paper, we propose FlexiDreamer, a novel single image-to-3D generation framework that reconstructs the target mesh in an end-to-end manner. By leveraging a flexible gradient-based extraction known as FlexiCubes, our method avoids the defects introduced by post-processing and obtains the target mesh directly. Furthermore, we incorporate a multi-resolution hash grid encoding scheme that progressively activates the encoding levels into the implicit field in FlexiCubes, which helps capture geometric details during per-step optimization. Notably, FlexiDreamer recovers a dense 3D structure from a single-view image in approximately 1 minute on a single NVIDIA A100 GPU, outperforming previous methodologies by a large margin.
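
The synopsis mentions a multi-resolution hash grid encoding whose levels are activated progressively. As a rough illustration of that idea only, the sketch below computes a per-level on/off mask from the optimization step and applies it to per-level features. The function names, the linear schedule, and every default value (16 levels, 4 initially active, one new level every 100 steps) are assumptions made for this example; the synopsis does not give the schedule FlexiDreamer actually uses.

```python
def active_level_mask(step, num_levels=16, initial_levels=4, steps_per_level=100):
    """Return a list with one entry per encoding level: 1.0 if active, else 0.0.

    Hypothetical coarse-to-fine schedule (not taken from the paper):
    start with `initial_levels` coarse levels and enable one additional
    finer level every `steps_per_level` optimization steps.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    active = min(num_levels, initial_levels + step // steps_per_level)
    return [1.0 if level < active else 0.0 for level in range(num_levels)]


def mask_features(features_per_level, mask):
    """Zero out the feature vectors of levels that are not yet active.

    `features_per_level` is a list of per-level feature lists, ordered
    from the coarsest level to the finest.
    """
    return [[m * f for f in feats] for feats, m in zip(features_per_level, mask)]


if __name__ == "__main__":
    for step in (0, 250, 5000):
        mask = active_level_mask(step)
        print(step, int(sum(mask)), "levels active")
```

Under these assumed defaults, early steps see only the coarse levels, so the optimization first settles the overall shape before the finer levels begin contributing detail.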

More episodes of the podcast Tech Frontier

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching 05/04/2024

AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent 05/04/2024

Training LLMs over Neurally Compressed Text 05/04/2024

ReFT: Representation Finetuning for Language Models 05/04/2024

PointInfinity: Resolution-Invariant Point Diffusion Models 05/04/2024

Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? 05/04/2024

LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models 05/04/2024

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models 04/04/2024

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models 04/04/2024

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction 04/04/2024


ZARZA We are Zarza, the prestigious firm behind major projects in information technology.


More episodes of the podcast Tech Frontier

Internet as human right and its scope

Prevent Attacks From Your Local Area Network

Bandwidth: Broadband or Narrowband?

Personnel recruitment via Web

Deep web or Invisible Internet

Subdomains, a glance with the experts!

Free Internet, a prediction in Nostradamus style

Educational Technology: From traditional to digital

Localhost, there’s no place like 127.0.0.1

Googling with breathtaking tricks you ignore

Gray Hat Hacking, those with ambiguous ethics…

Internet Predators on the prowl

Dot COM: The Internet’s dominant TLD