Learning Efficient 3D Generation - Robotics Institute Carnegie Mellon University

PhD Speaking Qualifier

Hanzhe Hu
PhD Student, Robotics Institute, Carnegie Mellon University
Friday, April 4
10:00 am to 11:00 am
GHC 6501
Learning Efficient 3D Generation
Abstract:
Recent advances in 3D generation have enabled the synthesis of multi-view images using large-scale pre-trained 2D diffusion models. However, these methods typically require dozens of forward passes, resulting in significant computational overhead. In this talk, we introduce Turbo3D, an ultra-fast text-to-3D system that generates high-quality Gaussian Splatting assets in under one second. Turbo3D features a streamlined architecture comprising a 4-step, 4-view diffusion generator and a lightweight feed-forward Gaussian reconstructor, both operating entirely in latent space. The 4-step, 4-view generator is a student model distilled through a novel Dual-Teacher approach, which encourages the student to learn view consistency from a multi-view teacher and photo-realism from a single-view teacher. By shifting the reconstructor’s input from pixel space to latent space, we eliminate decoding overhead and reduce transformer sequence length by half, significantly boosting efficiency. Turbo3D achieves superior 3D generation quality compared to prior methods, while operating at a fraction of their runtime.

Committee:
Shubham Tulsiani (Chair)
Deva Ramanan
Jun-Yan Zhu
Sheng-Yu Wang