Just go to the link and click "Start Creating". No sign-in required.
I built Shortbread to help anyone create comic / manga series. The onboarding flow kick-starts a page at about 60%, then you can use your creativity to bring it to 1000% in a fully controllable editor.
Tech stack:
GPT-3.5 Turbo - comic script generation. It handles everything from layout, characters, and scenes to SD prompts and dialogue.
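For a rough idea of what that step could look like, here is a minimal Python sketch assuming the openai SDK (pre-1.0 ChatCompletion API). The system prompt and JSON schema are made up for illustration, not Shortbread's actual ones.

```python
# Hypothetical sketch of the script-generation step (schema and prompt are
# illustrative, not Shortbread's real ones).
import json
import openai

openai.api_key = "sk-..."  # your OpenAI key

SYSTEM = (
    "You are a comic scriptwriter. Given a premise, reply with JSON only, "
    "containing: panel layout, characters, scene descriptions, one Stable "
    "Diffusion prompt per panel, and dialogue lines with speakers."
)

def generate_script(premise: str) -> dict:
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": premise},
        ],
        temperature=0.8,
    )
    # Will raise if the model wraps the JSON in prose; real code needs retries.
    return json.loads(resp["choices"][0]["message"]["content"])

script = generate_script(
    "a japanese couple sits at a dinner table; the husband tells the wife a secret"
)
```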
SD 1.5 - We run SD servers on GCP. For every comic we generate one large image and crop it into panels; per the experiments of u/Deathmarkedadc on Reddit, this massively helps with consistency. The models are trained on anime scenes though, and might not be so great with animals.
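A minimal sketch of that generate-one-page-then-crop idea, assuming diffusers with SD 1.5; the prompt and panel boxes are placeholders, not Shortbread's actual layout logic.

```python
# Sketch of the one-large-image-then-crop approach (prompt and panel
# boxes are placeholders).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# One image for the whole page keeps characters and style consistent
# across panels; each panel is then cropped out of it.
page = pipe(
    "manga page, a japanese couple at a dinner table, husband whispering a secret",
    height=768, width=512, num_inference_steps=30,
).images[0]

panel_boxes = [           # (left, top, right, bottom) in pixels
    (0, 0, 512, 256),
    (0, 256, 256, 768),
    (256, 256, 512, 768),
]
panels = [page.crop(box) for box in panel_boxes]
```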
Frontend: Next.js 13 on Vercel, React + TypeScript. We built the entire editor from scratch to compose the comic (images, panels, speech bubbles, text) like a webpage. This lets you edit and republish your comics the way you would a website.
You can dynamically generate panels as well. Try resizing a panel into a long, narrow box and generating (rough sketch below).
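One way per-panel generation could work, continuing the diffusers sketch above (it reuses `pipe`): snap the panel box from the editor to SD-valid dimensions (diffusers wants height/width divisible by 8), generate at roughly that aspect ratio, then scale the result back into the panel. This is an assumption about the mechanism, not the actual implementation.

```python
# Hypothetical per-panel generation: map an editor panel box to SD-friendly
# dimensions, generate, then resize back into the panel.
# Reuses `pipe` from the sketch above.
def sd_size(panel_w: int, panel_h: int, target: int = 512):
    scale = target / max(panel_w, panel_h)
    w = max(64, round(panel_w * scale / 8) * 8)
    h = max(64, round(panel_h * scale / 8) * 8)
    return w, h

panel_w, panel_h = 300, 900               # a long, narrow panel
w, h = sd_size(panel_w, panel_h)
image = pipe("character standing in the rain, manga style",
             width=w, height=h).images[0]
image = image.resize((panel_w, panel_h))  # fit back into the panel box
```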
Backend: Firebase.
Sample comics:
A Japanese couple sits at the dinner table; the husband tells the wife a secret (https://create.shortbread.ai/viewer/debdf25c-3f95-492a-952a-...)
An army of male soldiers fighting against an army of female soldiers in ancient China (https://create.shortbread.ai/viewer/4566613c-7146-4ed7-9b8d-...)
A team of girls plays volleyball against a team of boys (https://create.shortbread.ai/viewer/aafc2f61-d008-4f3f-aa8f-...)
Next steps:
- More pages
- Finer panel-level control: poses, ControlNet, etc.
- Multi-character scenes.
- Different styles.
- Control over character design.
I’m Fengjiao Peng, founder and chief engineer at Shortbread. I was previously a webtoon artist. We want to build this into something you can create entire comic series / manga / webtoons with. Criticism and suggestions welcome!
Some random suggestions:
- I dunno what diffusion framework you are using, but the AITemplate (for GPUs) or diffusers JAX (for TPUs) backend can massively increase your diffusion throughput (rough Flax sketch after this list).
- Alternatively, I believe HuggingFace already has a JAX backend for Stable Diffusion XL, so you could run a model with much better support for large resolutions/inpainting massive images at a similar (?) speed.
- There are schemes for area prompting and subject "subset" prompting in Stable Diffusion, as well as using images as input. As an example of how y'all might use this, you could generate an image for Character A and an image for Character B, encode them, and specify that the Character A prompt latents go on the left side of the image and the Character B prompt latents go on the right side. And of course you can add to these area prompts, like "jumping" on the left side and "ducking" on the right side of the image (rough sketch after this list). There's also a way to specify which prompts/encoded images belong to which subjects instead of manually cutting out areas, see: https://github.com/BlenderNeko/ComfyUI_Cutoff
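To make the area-prompting idea concrete, here is a rough, unofficial "latent couple" style sketch with diffusers and SD 1.5: two prompts are denoised together and their noise predictions are blended with left/right masks, so one prompt controls the left half and the other the right half. Prompts, mask split, and hyperparameters are all illustrative.

```python
# Rough sketch of regional / "latent couple" prompting with diffusers + SD 1.5.
# Prompts, masks, and hyperparameters are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
device, dtype = pipe.device, pipe.unet.dtype

height, width = 512, 768
h, w = height // 8, width // 8            # latent resolution
prompts = ["character A, jumping", "character B, ducking"]

# Each prompt owns half the image: prompt 0 the left, prompt 1 the right.
masks = torch.zeros(len(prompts), 1, h, w, device=device, dtype=dtype)
masks[0, :, :, : w // 2] = 1.0
masks[1, :, :, w // 2 :] = 1.0

def embed(texts):
    tokens = pipe.tokenizer(
        texts, padding="max_length", truncation=True,
        max_length=pipe.tokenizer.model_max_length, return_tensors="pt",
    ).input_ids.to(device)
    return pipe.text_encoder(tokens)[0]

with torch.no_grad():
    cond, uncond = embed(prompts), embed([""] * len(prompts))
    sched = pipe.scheduler
    sched.set_timesteps(30, device=device)
    latents = torch.randn(1, 4, h, w, device=device, dtype=dtype) * sched.init_noise_sigma
    guidance = 7.5

    for t in sched.timesteps:
        latent_in = sched.scale_model_input(latents, t)
        noise = torch.zeros_like(latents)
        for i in range(len(prompts)):
            # Classifier-free guidance: uncond + cond pass for this region's prompt.
            eps_u, eps_c = pipe.unet(
                torch.cat([latent_in, latent_in]), t,
                encoder_hidden_states=torch.cat([uncond[i : i + 1], cond[i : i + 1]]),
            ).sample.chunk(2)
            # Blend the guided noise prediction into this prompt's region only.
            noise += masks[i] * (eps_u + guidance * (eps_c - eps_u))
        latents = sched.step(noise, t, latents).prev_sample

    image = pipe.vae.decode(latents / pipe.vae.config.scaling_factor).sample
    image = (image / 2 + 0.5).clamp(0, 1).permute(0, 2, 3, 1).float().cpu().numpy()
    pipe.numpy_to_pil(image)[0].save("two_characters.png")
```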
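And on the throughput point from the first suggestion: this follows the documented FlaxStableDiffusionPipeline usage in diffusers, which pmaps the sampling loop across all available TPU/GPU devices when jit=True. The model id and prompt are just placeholder example values.

```python
# Sketch of the diffusers Flax/JAX path, following the documented
# FlaxStableDiffusionPipeline example; model id and prompt are placeholders.
import jax
import jax.numpy as jnp
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline

pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="bf16", dtype=jnp.bfloat16
)

# One prompt per device; jit=True pmaps the sampling loop across all devices.
prompts = ["manga panel, two characters at a dinner table"] * jax.device_count()
prompt_ids = shard(pipeline.prepare_inputs(prompts))
params = replicate(params)
rng = jax.random.split(jax.random.PRNGKey(0), jax.device_count())

images = pipeline(prompt_ids, params, rng, jit=True).images
# Flatten (devices, batch, H, W, 3) into a list of PIL images.
images = images.reshape((images.shape[0] * images.shape[1],) + images.shape[-3:])
pils = pipeline.numpy_to_pil(np.asarray(images))
```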