✅ Verified up-to-date [Last edited: 3/20/2026]


<aside> <img src="/icons/magic-wand_gray.svg" alt="/icons/magic-wand_gray.svg" width="40px" />

Overview (TL;DR)

If you’re prompting for more than one object or character in a scene, and their details are blending together, try this formula:

[set up a generic scene using keywords] [add details by calling back to those keywords] [describe the rest of the image] [describe the vibe or aesthetics]

Example: Three different best friends sitting close together on a park bench. The friend in the middle is a cheerful blonde Caucasian woman wearing jeans and a green tank-top. The friend on the right is a serious African American man dressed in a tuxedo. The friend on the left is a laughing Indian woman wearing orange Hindi traditional robes. Stylish digital art by Krenz Cushart and Tom Bagshaw.

</aside>

<aside> <img src="/icons/help-alternate_green.svg" alt="/icons/help-alternate_green.svg" width="40px" />

How many subjects (objects or characters) can I get in the same image?

</aside>

Here’s a general rule of thumb for how many subjects you can prompt for successfully, with individual details fairly intact (assuming you follow the suggestions in this FAQ).


<aside> <img src="/icons/light-bulb_green.svg" alt="/icons/light-bulb_green.svg" width="40px" />

There's a special way to prompt multiple subjects…

…otherwise, they blend together.

</aside>

Try this prompt template, explained in the 1️⃣ - 2️⃣ - 3️⃣ - 4️⃣ steps below...

**[set up a generic scene using keywords] [add details by calling back to those keywords] [describe the rest of the image] [describe the vibe or aesthetics]**
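If you assemble prompts in a script or spreadsheet before pasting them into Midjourney, the template can be sketched as plain string assembly. This is only an illustration: `build_prompt` and its parameter names are hypothetical helpers made up for this example, not part of Midjourney or any library, and the function does nothing except join the four parts in order.

```python
def build_prompt(scene, subject_details, rest="", style=""):
    """Join the four template parts into one prompt string.

    scene           -- part 1: the generic scene, using keywords
    subject_details -- part 2: one sentence per subject, calling back to those keywords
    rest            -- part 3: the rest of the image (optional)
    style           -- part 4: the vibe or aesthetics (optional)
    All names here are hypothetical; this is only string assembly.
    """
    parts = [scene, *subject_details, rest, style]
    sentences = []
    for part in parts:
        part = part.strip()
        if not part:
            continue  # skip template parts that were left empty
        if not part.endswith((".", "!", "?")):
            part += "."  # keep each part as its own sentence
        sentences.append(part)
    return " ".join(sentences)


# Rebuilds the example prompt from the overview at the top of this page.
prompt = build_prompt(
    scene="Three different best friends sitting close together on a park bench",
    subject_details=[
        "The friend in the middle is a cheerful blonde Caucasian woman "
        "wearing jeans and a green tank-top",
        "The friend on the right is a serious African American man "
        "dressed in a tuxedo",
        "The friend on the left is a laughing Indian woman "
        "wearing orange Hindi traditional robes",
    ],
    style="Stylish digital art by Krenz Cushart and Tom Bagshaw",
)
print(prompt)
```

Keeping each subject in its own sentence, and repeating the keyword from the scene ("friend"), is the point of the template; the helper just makes that structure explicit.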

<aside> <img src="/icons/light-bulb_green.svg" alt="/icons/light-bulb_green.svg" width="40px" />

First, let me give you some vocabulary for this…

You’ll need both for this to work.

</aside>

1️⃣ Step #1: Compositional Archetype: [set up a generic scene using keywords]

For prompts where it makes sense to do so, use the first statement to set up the scene in generic terms using archetypes. There’s a sweet spot for specificity here: the statement doesn’t have to be long, because you’ll add details in a moment. [Note: This isn't a rule. You don't have to do this. But if what you're doing isn't working, try this. It might help.]

✅ Good: Three friends sitting on a park bench.
✅ Better: Three different friends sitting on a park bench. (Without "different," Midjourney gets to decide their general appearance, and they may look similar.)
✅ Best, get specific: Three different best friends sitting close together on a park bench. (Without "best friends" and "sitting close together," we get a more generic vibe.)