The key element missing from the generated images is an understanding of form. In recognizing objects we generally rely on shape first (strong outlines and silhouettes, blobs of color, and so forth); only afterwards does our brain start to see forms in perspective.
When learning to draw, I gradually got a sense that what is really going on is that I'm gaining a more conscious command of different shapes, just like when I learned to write letters; but instead of abstract marks, I'm learning the shapes of hands, arms, etc., and from various perspectives. So if I study a lot of the same shapes in a topic like anatomy or wildlife, I can replicate them from memory with fairly accurate proportions.
The difference between me and the AI, in its current form, is that the AI continues along the path of being an extremely smart shape recognizer and reproducer (as it should, given that some of the first applications of the tech were text recognition). So it can output a lot of details I can't (without lots of reference) and blend in stylistic ideas I'm unaware of. But I, while having a much more limited visual library, can mix in more knowledge of perspective, of how anatomy and clothing work, and other kinds of logic. I can push the shapes to convey specific action and expression, design lighting situations, and so on.
The AI's ability to do it all in one step gives its results a very "savant" quality: it doesn't know what is and isn't a coherent image, but it has total mastery of making the shapes and applying rendering. Some of the things I've seen it do with prompts are wildly creative interpretations as a result. It's a good tool.
Those art ML models do operate on the wrong premise that input and output images are entirely raster fields. Most images should really be treated as curve fields, with the curves internally extrapolated into complete, color- or texture-filled 3D shapes via what are known as gestalt principles, volume estimation from shading, etc. Only the fill textures should be raster.
The current approach creates a huge limitation: input/output images are small (on the order of 512x512), and it produces a whole load of texture-turning-into-shape artifacts and vice versa.
It could possibly be overcome with a paradigm shift, though.
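As a rough sketch of what that alternative representation might mean (in Python; every name here is a hypothetical illustration, not an actual model architecture): instead of one dense pixel grid, an image becomes a list of parametric shape primitives, and rasterization happens only at the very end, at whatever resolution is requested.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    import numpy as np

    # Current premise: the whole image is one dense raster field,
    # so the resolution (e.g. 512x512) is baked into the representation.
    raster_image = np.zeros((512, 512, 3), dtype=np.uint8)

    @dataclass
    class BezierPath:
        # A closed outline as cubic Bezier control points: resolution-independent.
        control_points: List[Tuple[float, float]]

    @dataclass
    class ShapePrimitive:
        # One gestalt-completed shape: a parametric outline, a small raster
        # patch used only as fill texture, and a crude depth/volume cue.
        outline: BezierPath
        fill_texture: np.ndarray
        estimated_depth: float

    @dataclass
    class CurveFieldImage:
        # The alternative premise: an image as an ordered list of shapes.
        shapes: List[ShapePrimitive] = field(default_factory=list)

        def render(self, width: int, height: int) -> np.ndarray:
            # Rasterize only at the end, at whatever size is asked for.
            canvas = np.zeros((height, width, 3), dtype=np.uint8)
            # ...rasterize each shape's outline and warp its fill texture here...
            return canvas

The point is just that only the fill textures stay raster; outlines and depth cues live in a form that doesn't care about output resolution.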