Yeah and what I'm saying is that this rapidly rising "skill" is just nonsense. This is not a reasonable way to interact with the embedding space. Hopefully we won't be doing prompt engineering at all in the "near" future.

This is like copying and pasting by highlighting, clicking 'Edit', and scrolling down to Copy/Paste, all with the mouse.

No, it is not. Making these models generate interesting images requires you to learn something akin to a new kind of language.

As soon as you start to automate this process, for example by adding some default attributes like "4k, 8k, hd" to every prompt, you introduce a huge amount of bias into the output and lose the freedom to get anything outside those specifiers.
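Just to make that concrete, here's a toy sketch of that kind of automation (the token list and function are made up for illustration, not anything a real tool does):

    # Naively bolt the same style tokens onto every prompt.
    DEFAULT_TOKENS = ["4k", "8k", "hd", "highly detailed"]

    def augment(prompt):
        return ", ".join([prompt] + DEFAULT_TOKENS)

    print(augment("a quiet watercolor of a fishing village"))
    # a quiet watercolor of a fishing village, 4k, 8k, hd, highly detailed

Every output now drifts toward the same glossy aesthetic, whether or not it suits the subject.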

Sure, future iterations will have a better understanding of language input. But knowing exactly how to phrase your prompts will always be a skill: eloquent writing is what gets you to the more interesting and appropriate results.

In part that's because using more esoteric language automatically connects you to the specific subset of the source material that was described with those less common words when the model was trained. Having an extensive vocabulary and knowing how to wield it is actually a huge boon in this particular field.

"Unreal Engine 5" is just a quick shortcut to output that is detailed, clean, often futuristic and usually looks impressive. But you can go a lot further, for example by manually subtracting weights. Teasing MidJourney with this prompt was entertaining:

clear view of a dense forest::5 plants::-.5 tree::-.5 trees::-.5 foliage::-.5 leaves::-.5 shrubs::-.5 bushes::-.5 blur::-.5 mist::-.5 winter::-.5
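If it helps, here's a rough sketch of how you could compose a weighted prompt string like that one programmatically (the helper is purely illustrative, not part of any MidJourney API):

    # Illustrative helper for building a MidJourney-style weighted prompt.
    def weighted_prompt(terms):
        # terms: list of (text, weight) pairs; None means no explicit weight
        return " ".join(
            text if weight is None else f"{text}::{weight}"
            for text, weight in terms
        )

    negatives = ["plants", "tree", "trees", "foliage", "leaves",
                 "shrubs", "bushes", "blur", "mist", "winter"]
    print(weighted_prompt(
        [("clear view of a dense forest", 5)] + [(t, -0.5) for t in negatives]
    ))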

Btw, is anybody working on a "language florifier" model yet? I imagine writers would be interested. "Rewrite this story with more emotion and in the style of Kurt Vonnegut, cyberpunk".


Yes, it is stupid. Adding 4k to every prompt introduces bias, yes. That doesn't mean learning the ins and outs of each phrase's bias is a reasonable idea, and it's not guaranteed to be a constant effect either. It's great that you can become more skilled at prompts; that doesn't make it a good interaction model. The interface is a tool, and tools are important. That there are people who are great at typewriters doesn't mean typewriters are all that reasonable in the age of computers and word processors.

> But you can go a lot further, for example by manually subtracting weights. Teasing MidJourney with this prompt was entertaining:

This is an example of an improvement over basic prompts. It's still far from a good model. "Guess and check" is basically the worst UX one can create for a design process.

One should be able to specify content separately from style, and layer in stylistic choices in a clear hierarchy. Text is a good way to specify content. It's a pretty shitty way to specify style. Style is something we could likely convey visually and with palette reference points.
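Roughly something like this, as a strawman (all the field names are invented):

    # Strawman: content specified separately from an ordered stack of style layers.
    from dataclasses import dataclass, field

    @dataclass
    class StyleLayer:
        name: str                                        # e.g. "base palette", "render look"
        palette: list = field(default_factory=list)      # reference colours
        references: list = field(default_factory=list)   # example images / moodboard

    @dataclass
    class ImageSpec:
        content: str                                     # what is in the image, as text
        style: list = field(default_factory=list)        # ordered layers, later overrides earlier

    spec = ImageSpec(
        content="a dense forest seen from a ridge at dawn",
        style=[
            StyleLayer("base palette", palette=["#2f4f4f", "#a9bfa8"]),
            StyleLayer("render look", references=["moodboard/forest_01.jpg"]),
        ],
    )

A renderer could then resolve the layers in order, so swapping the palette never touches the content description.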
