I'm surprised at the lukewarm reception. Admittedly I don't follow the image-to-3D space closely, but the last time I checked in, the gloopy, fuzzy outputs did not impress me.
I want to highlight what I believe is the coolest innovation: their novel O-Voxel data structure. I'm still trying to wrap my head around how they figured out the conversion from voxel-space to mesh-space. Those two worlds don't work well together.
A 2D analogy is that they figured out an efficient, bidirectional, one-shot method of converting PNGs into SVGs, without iteration. Crazy.
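To make concrete why voxel-to-mesh conversion is nontrivial: the naive approach just emits a quad for every voxel face that borders empty space, which produces blocky, heavily over-tessellated geometry. This sketch is NOT the paper's O-Voxel method, just a minimal baseline showing what the straightforward conversion looks like (all names here are my own, for illustration):

```python
# Naive voxel-to-mesh conversion: emit a quad for every occupied-voxel
# face that borders empty space (or the grid boundary). Illustrative
# baseline only -- not the O-Voxel approach from the paper.
import numpy as np

def voxels_to_quads(grid):
    """grid: 3D boolean occupancy array.
    Returns a list of quads, each quad = 4 (x, y, z) corner coords."""
    # Six axis-aligned face directions, each with its 4 corner offsets.
    faces = {
        (-1, 0, 0): [(0,0,0), (0,1,0), (0,1,1), (0,0,1)],
        ( 1, 0, 0): [(1,0,0), (1,0,1), (1,1,1), (1,1,0)],
        ( 0,-1, 0): [(0,0,0), (0,0,1), (1,0,1), (1,0,0)],
        ( 0, 1, 0): [(0,1,0), (1,1,0), (1,1,1), (0,1,1)],
        ( 0, 0,-1): [(0,0,0), (1,0,0), (1,1,0), (0,1,0)],
        ( 0, 0, 1): [(0,0,1), (0,1,1), (1,1,1), (1,0,1)],
    }
    nx, ny, nz = grid.shape
    quads = []
    for x, y, z in zip(*np.nonzero(grid)):
        for (dx, dy, dz), corners in faces.items():
            n = (x + dx, y + dy, z + dz)
            inside = (0 <= n[0] < nx) and (0 <= n[1] < ny) and (0 <= n[2] < nz)
            # The face is on the surface if the neighbour is empty or out of bounds.
            if not inside or not grid[n]:
                quads.append([(x + cx, y + cy, z + cz) for cx, cy, cz in corners])
    return quads

# A single isolated voxel exposes all 6 faces.
grid = np.zeros((4, 4, 4), dtype=bool)
grid[1, 1, 1] = True
print(len(voxels_to_quads(grid)))  # 6
```

Going the other way (mesh back into voxels, or producing a clean adaptive mesh directly from voxels in one shot, without iterative remeshing) is the hard part that makes their result interesting.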
TRELLIS 1 had a massive impact on research in this area, not least because it's actually open (full dataset, training, and inference). Research like SynCity or PhysX-3D (not the NVIDIA one) wouldn't have been possible without it.
Excited for the follow-ups to this new generation.
State of the art for open source. This is a nice improvement, but far out, I cannot wait for a Sparc3D equivalent for local use. It's a step change in quality. I really hope Hunyuan3D-3 is the one to level up to that quality now.
The results from arbitrary pictures are not nearly as good as what's shown in the post. So either the demo is running a gimped version of the model, or the examples are _very_ handpicked.
If it takes 60 seconds on a GPU, I can leave it running overnight on a CPU. (And going off previous experience, it won't even be that slow; I'm just being conservative.)