Getting an exact image out of pure text->image is impossible, given there are, say, 10,000 possible images of a dog. Even if the text prompt eliminates 99.9% of those possibilities, there are still 10 possible images left.
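To make the arithmetic concrete, here's a tiny sketch. The numbers are just the illustrative ones from above, not measurements, and `remaining_candidates` is a made-up helper name.

```python
def remaining_candidates(total: int, fraction_eliminated: float) -> int:
    """Count the candidate images a prompt still leaves open.

    Both inputs are illustrative assumptions, not measured quantities.
    """
    return round(total * (1 - fraction_eliminated))


# 10,000 hypothetical dog images, with the prompt ruling out 99.9% of them
print(remaining_candidates(10_000, 0.999))
```

So however good the prompt is, as long as it leaves more than one candidate, text alone can't pin down the exact image.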
However, with tools like ControlNet it's already mostly possible, and I expect it to be solved within a year. Yes, you can specify every exact detail, but you need to feed it extra input beyond the text: a sketch, a skeletal pose, or a reference image of the dog...
Also, if you want to consistently regenerate the same subject from text alone, you can train a LoRA on that subject beforehand.