Magic3D: High-Resolution Text-to-3D Content Creation (deepimagination.cc)
160 points by lab on Nov 21, 2022 | 18 comments



Magic3D looks like an improvement on DreamFusion [1], so it's sad to see that the code and models are not being made public.

What is public right now is StableDreamFusion [2]. It produces surprisingly good results on radially symmetrical organic objects like flowers and pineapples. You can run it on your own GPU or in a Colab.
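If you run it locally, the invocation (per the stable-dreamfusion README at the time; check the repo for current flags) looks roughly like:

    # -O bundles the repo's recommended options (fp16, CUDA ray marching)
    python main.py --text "a pineapple" --workspace trial -O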

Or, if you just want to type a prompt into a website and see something in 3D, try our demo at https://holovolo.tv [3]

[1] https://dreamfusion3d.github.io/

[2] https://github.com/ashawkey/stable-dreamfusion

[3] https://holovolo.tv


Looks like Magic3D doesn't depend on any additional training, which means that open-source methods like StableDreamFusion can be adapted to this new method quite easily.

They use https://deepimagination.cc/eDiffi/ as the text-to-image diffusion model, which can be replaced with Stable Diffusion or something else.
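For illustration, here's a minimal sketch of what that swap looks like with the Hugging Face diffusers StableDiffusionPipeline standing in as the score-distillation (SDS) prior. It assumes some differentiable renderer elsewhere produces `rendered_rgb` (a 1x3x512x512 tensor in [0, 1]); the names are illustrative, not taken from the Magic3D paper, and classifier-free guidance is omitted to keep it short:

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5"
    ).to("cuda")
    # The diffusion model is a frozen prior; only the 3D representation is trained.
    pipe.unet.requires_grad_(False)
    pipe.vae.requires_grad_(False)
    pipe.text_encoder.requires_grad_(False)

    def encode_prompt(prompt):
        tokens = pipe.tokenizer(
            prompt, padding="max_length",
            max_length=pipe.tokenizer.model_max_length, return_tensors="pt",
        ).input_ids.to("cuda")
        return pipe.text_encoder(tokens)[0]

    def sds_loss(rendered_rgb, text_emb):
        # Encode the rendering into SD latent space (0.18215 is SD's latent scale).
        latents = pipe.vae.encode(rendered_rgb * 2 - 1).latent_dist.sample() * 0.18215
        t = torch.randint(20, 980, (latents.shape[0],), device=latents.device)
        noise = torch.randn_like(latents)
        noisy = pipe.scheduler.add_noise(latents, noise, t)
        with torch.no_grad():
            eps_hat = pipe.unet(noisy, t, encoder_hidden_states=text_emb).sample
        # The SDS gradient is (eps_hat - noise); this surrogate backprops exactly
        # that through `latents` into the NeRF/mesh parameters.
        return ((eps_hat - noise).detach() * latents).sum()

Backpropagating sds_loss(render(camera), encode_prompt("a pineapple")) and stepping an optimizer over the 3D parameters is essentially the whole training loop, modulo guidance scale, view-dependent prompting and the other tricks these papers use.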


Note that [3] https://holovolo.tv does not generate a mesh, but pipes the generated image through a depth-map estimator to create a parallax effect.
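For anyone curious, a rough sketch of that kind of pipeline (assuming MiDaS via torch.hub for the monocular depth estimate; the pixel-shift parallax at the end is a toy stand-in, not necessarily what holovolo actually does):

    import numpy as np
    import torch
    from PIL import Image

    # Monocular depth with the small MiDaS model, loaded through torch.hub
    midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").eval()
    transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

    img = np.array(Image.open("generated.png").convert("RGB"))
    with torch.no_grad():
        pred = midas(transform(img))  # relative inverse depth, low resolution
        depth = torch.nn.functional.interpolate(
            pred.unsqueeze(1), size=img.shape[:2],
            mode="bicubic", align_corners=False,
        ).squeeze().numpy()

    # Toy parallax: shift each pixel horizontally in proportion to normalized depth
    depth = (depth - depth.min()) / (depth.max() - depth.min())
    shift = (depth * 8).astype(int)  # up to 8 px of shift for this frame
    h, w = depth.shape
    cols = np.clip(np.arange(w)[None, :] + shift, 0, w - 1)
    Image.fromarray(img[np.arange(h)[:, None], cols]).save("parallax_frame.png")

Rendering a handful of frames with different shift amounts and playing them back gives the wiggle/parallax effect without ever touching a mesh.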


DreamFields, the precursor to all this work, is available: https://colab.research.google.com/drive/1u5-zA330gbNGKVfXMW5... The meshes are so-so, resolution-wise. I made a bunch of architecture studies with DreamFields: http://delta.center/20102020-ar-platform#/architecture-ai3d/ . I can't wait until the new versions are public; they will be so useful. I did not get good results with Stable DreamFusion; I thought DreamFields was better.


Many moons ago my job was banging out greebles and low-poly textured models for arch-viz (architectural visualisation) libraries. Think variations of "technological" shapes to provide detailing for a spaceship, chair models to be used in the background of a render of a fancy new office, or door-variation-number-26 type models. Yes, it was as boring as it sounds. A lot of those models have ended up being used in games more recently too, as GPUs have got better.

I can certainly see these types of systems being good enough to replace that kind of drudge work in modelling pipelines within the next two years. What I really want to see (now that my livelihood doesn't depend on it) is a system that can produce models from prompts with decent mesh topology suitable for rigging (or even auto-rigged), and that can be separated into component parts and even have physics applied to them. My dream would be for the bunny to be able to hop off, the stack of pancakes to react realistically, and the maple syrup to ooze down!


It certainly seems that where there is sufficient incentive-pressure, evolutions like this will be made... and tool-chains will pop up... probably more quickly than one would think...

Such a fascinating and upside-down moment.


Can't wait for shovelware games that use nothing but these.

But also something like Scribblenauts, where arbitrary 3D objects can be created by wizards.


IDK about you but I kinda miss the "1000 games on one disc" things we used to get from the backs of magazines.


Isn't that itch.io nowadays? Sort by popularity and filter for free, and you should be good to go. :)


Proud future founder of such a shovelware studio, I cannot wait!!


Great results!

Side note: I'd recommend avoiding 'magic' branding in AI technology because it's going to be outdated in a week


Amazing stuff, I can't wait to use this in video. I've been having so much fun training custom Stable Diffusion models of people and testing them doing various things. The problem with these is finding the right text in latent space, so I'm still not sure text is the right medium to generate everything, but rather a tool to start prototyping.


I'm curious about how tools like this could intersect with skilled workflows in something like Rhino/Grasshopper.


As a Grasshopper-style node graph is effectively an abstract syntax tree (AST, i.e. code), I imagine that something like GitHub Copilot and similar tools would actually be more appropriate.

The current state of the art tends not to differentiate between gibberish and output of actual value, so that may be a bit of a downer.

But I definitely see it as plausible; I just don't know where you would get the training data, though...


Here's an example of someone already using multiple NeRF objects in Blender - https://twitter.com/jperldev/status/1594627208676605952


Is this trained on unlicensed 3D meshes (like Stable Diffusion is trained on unlicensed images)?


Neither, read the paper


Besides the name, looks pretty good.



