✅ Verified up-to-date [Last edited: 3/20/2026]
<aside> <img src="/icons/magic-wand_gray.svg" alt="/icons/magic-wand_gray.svg" width="40px" />
If you’re prompting for more than one object or character in a scene, and their details are blending together, try this formula:
[set up a generic scene using keywords] [add details by calling back to those keywords] [describe the rest of the image] [describe the vibe or aesthetics]
Example: Three different best friends sitting close together on a park bench. The friend in the middle is a cheerful blonde Caucasian woman wearing jeans and a green tank-top. The friend on the right is a serious African American man dressed in a tuxedo. The friend on the left is a laughing Indian woman wearing orange Hindi traditional robes. Stylish digital art by Krenz Cushart and Tom Bagshaw.
</aside>
Here’s a general rule of thumb for how many subjects you can prompt for successfully, with individual details fairly intact (assuming you follow the suggestions in this FAQ).
<aside> <img src="/icons/light-bulb_green.svg" alt="/icons/light-bulb_green.svg" width="40px" />
…otherwise, they blend together.
</aside>
Try this prompt template, explained in the 1️⃣ - 2️⃣ - 3️⃣ - 4️⃣ steps below...
**[set up a generic scene using keywords] [add details by calling back to those keywords] [describe the rest of the image] [describe the vibe or aesthetics]**
<aside> <img src="/icons/light-bulb_green.svg" alt="/icons/light-bulb_green.svg" width="40px" />
First, let me give you some vocabulary for this…
[set up a generic scene using keywords] - This is called setting up the compositional archetype.
[add details by calling back to those keywords] - This is called lexical anchoring.
You’ll need both for this to work.
</aside>
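If you assemble prompts in a script, the template can be sketched like this. This is only an illustration, not a Midjourney feature: `build_prompt` and `unanchored` are hypothetical helper names, and the keyword check is a simple substring match that assumes your anchor keyword (here, "friend") appears literally in each detail sentence.

```python
def build_prompt(scene, details, rest="", style=""):
    """Join the four template parts into one prompt string.

    scene   -> [set up a generic scene using keywords]
    details -> [add details by calling back to those keywords]
    rest    -> [describe the rest of the image]
    style   -> [describe the vibe or aesthetics]
    """
    parts = [scene, *details, rest, style]
    # Skip empty parts and make sure each remaining part ends with one period.
    sentences = [p.strip().rstrip(".") + "." for p in parts if p.strip()]
    return " ".join(sentences)


def unanchored(details, keyword):
    """Return the detail sentences that never call back to the anchor keyword."""
    return [d for d in details if keyword.lower() not in d.lower()]


# Example values taken from the prompt at the top of this page.
scene = "Three different best friends sitting close together on a park bench"
details = [
    "The friend in the middle is a cheerful blonde Caucasian woman "
    "wearing jeans and a green tank-top",
    "The friend on the right is a serious African American man dressed in a tuxedo",
    "The friend on the left is a laughing Indian woman wearing orange "
    "Hindi traditional robes",
]
style = "Stylish digital art by Krenz Cushart and Tom Bagshaw"

prompt = build_prompt(scene, details, style=style)
missing = unanchored(details, "friend")  # details that forgot the callback

print(prompt)
print("Details missing the anchor keyword:", missing)
```

If `unanchored` returns any sentences, those are the details most likely to drift onto the wrong subject, because nothing ties them back to a keyword from the scene statement.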
For prompts where it makes sense to do so, set up the scene in generic terms using archetypes in the first statement. There’s a sweet spot for specificity here. It doesn’t have to be very long. You’ll add details in a moment. [Note: This isn’t a rule. You don’t have to do this. But if what you’re doing isn’t working, try this. It might help.]
| ✅ Good: | Three friends sitting on a park bench. | |
|---|---|---|
| ✅ Better: | Three different friends sitting on a park bench. | (Without “different” Midjourney gets to decide their general appearance and they may appear similar.) |
| ✅ Best, get specific: | Three different best friends sitting close together on a park bench. | (Without “best friends” and “sitting close together” we get a more generic vibe.) |