Creating superior AI-generated art through better prompting
An introduction to image prompt engineering in Stable Diffusion
Prompt engineering is the process of structuring words so that a text-to-image model, such as Stable Diffusion, can interpret them as intended. Think of it as the language you need to speak in order to tell an AI model what to draw. By improving these instructions, we can achieve more specific stylistic outcomes.
While image prompting is still a developing art, today’s post will explore how to craft better text prompts to create spectacular images in Stable Diffusion.
Tokenization in Stable Diffusion
Before we get into writing prompts, it’s worthwhile to build a high-level understanding of how a text prompt is broken down into a form that our machines can understand and use.
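Stable Diffusion is efficient because it is a latent diffusion model: instead of denoising in pixel space, it runs the diffusion process in a compressed latent space. An autoencoder maps images into and out of that space, while a text encoder turns the prompt into embeddings that guide the denoising. This is significant because latent space allows us to represent abstract ideas in compressed mathematical dimensions, which greatly reduces the memory and compute requirements compared to pixel-space diffusion models (such as DALL-E 2).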
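To get a rough feel for how much smaller latent space is, here is a small back-of-the-envelope sketch. The numbers are assumptions based on the commonly cited Stable Diffusion v1 defaults (a 512×512 RGB image, an autoencoder that downsamples by 8 in each dimension, and 4 latent channels); other versions and resolutions will differ.

```python
# Back-of-the-envelope comparison of pixel space vs. latent space.
# Assumed shapes (Stable Diffusion v1 defaults, not taken from this post):
#   image:  512 x 512 pixels, 3 colour channels
#   latent: 64 x 64 grid (8x smaller per side), 4 channels

def num_values(height, width, channels):
    """Count how many numbers are needed to store one tensor of this shape."""
    return height * width * channels

pixel_shape = (512, 512, 3)
latent_shape = (512 // 8, 512 // 8, 4)

pixel_values = num_values(*pixel_shape)
latent_values = num_values(*latent_shape)
compression_ratio = pixel_values / latent_values

print(f"Pixel space:  {pixel_values:,} values")
print(f"Latent space: {latent_values:,} values")
print(f"The latent is about {compression_ratio:.0f}x smaller")
```

Under these assumptions, each denoising step works on roughly 48 times fewer values than it would in pixel space, which is where most of the memory and compute savings come from.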