14islands | The art of prompting: An introduction to Midjourney

This is about

Design, Code

Reading time

14 minutes

Published date

17 Oct 2023

The art of prompting: An introduction to Midjourney

AI generated image of an old lady in colorful clothes looking down at hotel lobby

Markus Wallén, Experience Designer

In this era defined by rapid technological advancements, we find ourselves amid an AI-driven creative renaissance. The possibilities seem boundless as we explore the realm of generative artificial intelligence, a field that promises to transform every possible way we approach design and development.

At 14islands we aspire to be at the forefront of embracing AI's transformative potential. In a world where pixels meet code, the results of utilizing generative AI in the process can be functional, time-saving and even awe-inspiring. Whether we’re designing a website, creating marketing content, writing sample copy, designing a mockup, creating a visual concept mood board, building a product or envisioning a brand's identity, AI can be a secret weapon for innovation.

In this article, we’ll dive into the visual side of generative AI, which in my view is one of the most exciting fields of AI-powered creativity. We’ll learn how to harness the powers of Midjourney, one of the most prominent tools in the field, for generating interesting, weird and sometimes mind-blowing visuals.

Much of what I've learned, and much of my inspiration, comes from Yubin Ma's content at AiTuts, where you can learn more about prompting and browse a myriad of examples.

Getting started with Midjourney: The basics

Midjourney is an app that is built on top of the messenger app Discord. It lets you generate images based on text input in the form of prompts.

Getting started with Midjourney is free, and you can do so by signing up to join the beta on midjourney.com. Doing this takes you to Discord, where you can either set up a new account or use an existing one. Recently, however, the free version has been out of service for long periods due to high demand, which is why one of the paid versions might be worth considering.

The first thing you’ll notice once you're in is that you're added to a random newbie group. You’ll see lots of other people doing their thing in the chat interface, which can feel quite overwhelming at first. In these channels, everyone generates images in public, so you’ll be able to see what everyone else is up to, and they’ll see what you’re doing. If you want to avoid the frenetic public channels, you can subscribe to a paid plan and get a more private, calmer experience. As a start though, the public channels can be a good source of inspiration as you can see other users’ prompts and the resulting images.

A basic prompt starts by typing /imagine in the chat window, followed by whatever you want your image to show. The /imagine part is essential: without it, your message will not be interpreted as a prompt.

Screenshot of a Prompt /imagine in Midjourney

After entering a prompt, it will be processed by the Midjourney bot, which can take a while. While you’re waiting you can see the progress of your images being generated, and when finished, your result will be ready for you in the form of a grid of four pictures.

Midjourney bot loading with different versions

Images being generated (left), and ready with action buttons (right)

The grid shows four different image options based on your prompt and you can use the buttons below to either upscale U1-4 or create variations V1-4 of your favourite option. The numbered order goes left to right, top to bottom. Upscaling increases the resolution of the chosen option, while the variation button creates four new similar variations based on the chosen option.
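If the numbering feels confusing at first, it can be written down as a tiny lookup. This is only a sketch of the "left to right, top to bottom" order described above; the function name grid_position is my own, and it assumes the four images sit in a 2 x 2 grid.

```python
def grid_position(option):
    """Map an option number (1-4) to its place in the 2 x 2 image grid."""
    if option not in (1, 2, 3, 4):
        raise ValueError("option must be 1, 2, 3 or 4")
    # Counting left to right, top to bottom: 1 and 2 are on the top row,
    # 3 and 4 are on the bottom row.
    row, column = divmod(option - 1, 2)
    return ("top", "bottom")[row], ("left", "right")[column]


print(grid_position(3))  # ('bottom', 'left')
```

So U3 upscales the bottom-left image, and V2 creates variations of the top-right one.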

Playing around with some prompts and buttons is a great way to learn the essentials. So if you haven’t done so already, get started and generate some images based on your ideas before diving a bit deeper.

Spend some time in Midjourney and you’ll quickly start wondering why people are adding all those unreadable codes. And how do you actually craft the ideal prompt?

The art of prompting

Let’s look at how to build prompts successfully. The basic prompt anatomy in Midjourney is: /imagine Prefix Scene Suffix Parameters

Here’s an example of a cheetah on a bike:

a cheetah in fashionable clothes riding a racing bike

Prefix: medium full shot. What type of medium do you want? Is it a painting? Digital art? A special type of shot? Unless this is defined, Midjourney defaults to a photo style.
Scene: a cheetah in fashionable clothes riding a racing bike. This is the main part of the prompt. What do you want to visualise? What is the subject? What are the surroundings? You can describe all of this in normal language.
Suffix: bichromatic, high resolution. This is where you add additional details to your prompt. Are there any modifiers to add quality? Any specific colours? Moods? Add any extra details here, separated by commas.
Parameters: --ar 3:2 --s 600. Parameters are added to the end of the prompt to control predefined aspects of the image or its style. This is where things get a bit more technical and regular language will no longer do the trick.
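To make the anatomy concrete, here is a minimal Python sketch that glues the four parts together into one line you could paste into the chat. The helper build_prompt is hypothetical (it is not part of Midjourney), and putting a comma between the prefix and the scene is my own assumption about a sensible separator.

```python
def build_prompt(scene, prefix="", suffix=(), parameters=()):
    """Join prefix, scene, suffix terms and parameters into one /imagine line.

    suffix and parameters are lists of strings, e.g. ["bichromatic"] and
    ["--ar 3:2"].
    """
    # Prefix, scene and suffix are descriptive text, separated by commas.
    text_parts = [part.strip() for part in (prefix, scene, *suffix) if part.strip()]
    text = ", ".join(text_parts)
    # Parameters always go last and are separated by spaces, not commas.
    params = " ".join(p.strip() for p in parameters if p.strip())
    return " ".join(piece for piece in ("/imagine", text, params) if piece)


prompt = build_prompt(
    scene="a cheetah in fashionable clothes riding a racing bike",
    prefix="medium full shot",
    suffix=["bichromatic", "high resolution"],
    parameters=["--ar 3:2", "--s 600"],
)
print(prompt)
```

This prints the cheetah example as a single line: /imagine medium full shot, a cheetah in fashionable clothes riding a racing bike, bichromatic, high resolution --ar 3:2 --s 600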

Parameters generally let you control a certain aspect of your image on a preset spectrum. Let’s explore some of the most useful parameters to play around with.

The parameter format in Midjourney is --[parameter] [value]. Multiple parameters can be added after each other to control different aspects of your image.

Aspect Ratio

--ar W:H

The aspect ratio is an important basic parameter that lets you control the proportions of the image as you wish. Use standard ratios or enter any numbers you want for more custom sizes.
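If you already know the pixel size you are designing for, the matching ratio is just the width and height divided by their greatest common divisor. A small sketch in Python; the function name aspect_ratio_parameter is my own.

```python
from math import gcd


def aspect_ratio_parameter(width, height):
    """Turn a pixel size such as 1920 x 1080 into an --ar W:H parameter."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive whole numbers")
    # Dividing both sides by their greatest common divisor gives the
    # smallest whole-number ratio.
    divisor = gcd(width, height)
    return f"--ar {width // divisor}:{height // divisor}"


print(aspect_ratio_parameter(1920, 1080))  # --ar 16:9
print(aspect_ratio_parameter(1500, 1000))  # --ar 3:2
```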

Cheetahs with sunglasses and fashionable clothes

Cheetah with sunglasses and fashionable clothes on a racing bike

Stylize

--s [0-1000] or --stylize [0-1000]

The stylize parameter determines how much the Midjourney bot’s training data should influence the image. Midjourney has been trained on images that favour artistic colour, composition, and forms. Low stylization values produce images that closely match the prompt but are less artistic. High stylization values create images that are very artistic but may be less connected to the prompt.

Let’s try some different stylization settings for the following prompt:

/imagine cheetah in fashionable clothes riding a racing bike

/imagine cheetah with sunglasses in fashionable clothes riding a racing bike

A cheetah in fashionable clothes riding a racing bike, wearing sunglasses

A cheetah in fashionable clothes riding a racing bike with blue sky

Here are my findings of various stylization values:

Very low value: --s 0 gives generally less impressive results, closer to what you might be used to seeing in earlier versions of Midjourney or Dall-E.
Medium/low value: --s 100 gives a more artsy and dreamy vibe.
High value: --s 1000 looks sharper and more realistic. The subject becomes more prominent in the picture (the model seems to be trained on more portrait photos): subject details stand out while background details tend to blur and fade away.

To summarize, in my tests a higher stylization value tends to increase overall realism and make images more portrait-like, seemingly because Midjourney is trained on highly realistic photos. A value that I’ve personally found to work well is --s 750, which I've added to my default prompt settings so that it’s added to any prompt I write unless I override it with another value.
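The "default unless I override it" idea can be illustrated with a few lines of Python. This is only a local sketch of the logic, not how Midjourney stores its settings; apply_defaults and DEFAULTS are hypothetical names.

```python
import re

# Each entry lists the spellings of one parameter and its default value.
DEFAULTS = [(("s", "stylize"), 750)]


def apply_defaults(prompt, defaults=DEFAULTS):
    """Append each default parameter unless the prompt already sets it."""
    result = prompt.strip()
    for names, value in defaults:
        # Match --s or --stylize as a whole word, so --seed does not count.
        alternatives = "|".join(re.escape(name) for name in names)
        pattern = r"(?<!\S)--(?:" + alternatives + r")(?!\S)"
        if not re.search(pattern, result):
            result += f" --{names[0]} {value}"
    return result


print(apply_defaults("a cheetah on a bike"))          # a cheetah on a bike --s 750
print(apply_defaults("a cheetah on a bike --s 100"))  # a cheetah on a bike --s 100
```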

Chaos

--c [0-100] or --chaos [0-100]

Chaos determines how varied and unexpected your results and compositions will be. A higher value gives Midjourney more freedom to explore more variation in the suggestions you get. The default value is low, set to 0.

photograph of two cheetahs in fashionable clothing riding a tandem bike

Minimal variation: Four similar pictures of cheetahs on a bike

Maximal variation: Four different pictures of cheetah-looking animals

Weird

--weird [0-1000]

The weird parameter is new to Midjourney and makes things, well, kind of weird. But used sparingly with lower values, it can actually make some things like people’s faces a bit more realistic and random, which can be refreshing as opposed to the otherwise super-perfect looking models you’d normally get. Any value above 500 doesn’t really add that much to the weirdness though, so to keep it to a “good” --weird aim for about 250 to 500.
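With several numeric parameters in play it is easy to mistype a value, so here is a rough checker for the three ranges mentioned so far (stylize 0-1000, chaos 0-100, weird 0-1000). The function check_parameters is my own helper, it only looks at these parameters, and it assumes each value is a whole number written right after the parameter name.

```python
# Ranges as described in this article.
PARAMETER_RANGES = {
    "s": (0, 1000), "stylize": (0, 1000),
    "c": (0, 100), "chaos": (0, 100),
    "weird": (0, 1000),
}


def check_parameters(prompt):
    """Return a list of problems found in the ranged parameters of a prompt."""
    problems = []
    tokens = prompt.split()
    for position, token in enumerate(tokens):
        if not token.startswith("--"):
            continue
        name = token[2:]
        if name not in PARAMETER_RANGES:
            continue  # e.g. --ar, which is not a simple numeric range
        low, high = PARAMETER_RANGES[name]
        value = tokens[position + 1] if position + 1 < len(tokens) else ""
        if not value.isdecimal():
            problems.append(f"--{name} needs a whole number, got {value!r}")
        elif not low <= int(value) <= high:
            problems.append(f"--{name} {value} is outside {low}-{high}")
    return problems


print(check_parameters("oil painting of potato --weird 250 --c 20"))  # []
print(check_parameters("oil painting of potato --c 150"))
# ['--c 150 is outside 0-100']
```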

oil painting of potato surrounded by plants

AI-generated image of a weird potato in an antique painting style

AI-generated image of weird potatoes in a field

Permutation prompts

{one thing, another thing}

Do you want to try out multiple different details of the same general prompt, but don’t want to write the same thing over and over again? Permutation prompts can help you produce multiple variations using just a single /imagine command.

Simply put curly brackets {} around your concepts, separated by commas, and Midjourney will produce one generation for each word.

The following prompt:

8-bit pixel art, cozy cafe, {winter, spring, summer, autumn} outside

produces four generations (with 4 options in each to pick from: 16 images in total), one for each word within the curly brackets. The rest of the prompt stays the same for all variations.

8-bit pixel art, cozy cafe, winter outside

8-bit pixel art, cozy cafe, spring outside

8-bit pixel art, cozy cafe, summer outside

8-bit pixel art, cozy cafe, autumn outside
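If you want to preview what a permutation prompt will turn into before spending generations on it, the expansion is easy to mimic. A minimal sketch, assuming simple, non-nested curly brackets; expand_permutations is a hypothetical helper, and treating several bracket groups as "every combination" is my assumption about how they combine.

```python
import re
from itertools import product


def expand_permutations(prompt):
    """Return one prompt per combination of the {a, b, c} groups."""
    group_pattern = r"\{([^{}]*)\}"
    groups = re.findall(group_pattern, prompt)
    if not groups:
        return [prompt]
    # The options inside each pair of curly brackets, split on commas.
    options = [[option.strip() for option in group.split(",")] for group in groups]
    # The fixed text around the groups stays the same in every variation.
    pieces = re.split(r"\{[^{}]*\}", prompt)
    results = []
    for combination in product(*options):
        text = pieces[0]
        for choice, piece in zip(combination, pieces[1:]):
            text += choice + piece
        results.append(text)
    return results


for line in expand_permutations(
    "8-bit pixel art, cozy cafe, {winter, spring, summer, autumn} outside"
):
    print(line)
```

Run on the cafe example, this prints the same four prompts as listed above, one per season.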

Sometimes, one of the words you wrote was more or less ignored or forgotten in the result. To tackle this, there are ways to give it some extra hierarchy.

Repetition

Repeating the same word, or similar phrases, within a prompt can cause the model to emphasize that word in the generated image. This can be a simple way to use natural language to let Midjourney know that one word is really really really really important.

AI-generated castles with long stairs, bridge and mountains in the background

AI-generated very large castle with a river and cloudy sky

Weighted Terms

::[weight]

A more controlled way of creating hierarchy is by using weighted terms. Prompt weights can be added to any of your words to emphasize or de-emphasize that word’s influence on the result.
The dividers :: used for prompt weights are related to commas. A comma , in a prompt is interpreted as a soft break, whereas a divider :: reads as a hard break. In essence, this means that a comma between two phrases means "these are two different concepts" while a :: divider means that "these are very different concepts".

If you add a number after a divider, such as ::2, it will give emphasis to the section preceding the divider. For example, cool::2 means twice as cool as just cool.
You can also add a negative weight, such as ::-1, for things that you don't want.
All words have a default weight of 1 (but words at the start of a prompt have a greater effect on the result than words at the end).
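To see how a weighted prompt breaks down, here is a small Python sketch that splits a prompt at each :: divider and reads the optional number after it, falling back to the default weight of 1. It only illustrates the rules above and is not Midjourney's own parser; parse_weights is a hypothetical name.

```python
import re


def parse_weights(prompt):
    """Split a prompt on :: dividers into (text, weight) pairs."""
    # The capture group keeps the optional number that follows each divider;
    # it is None when a divider has no number.
    tokens = re.split(r"::(-?\d+(?:\.\d+)?)?", prompt)
    parts = []
    # tokens alternates between text and weight: text, weight, text, weight, ...
    for index in range(0, len(tokens), 2):
        text = tokens[index].strip()
        if not text:
            continue
        weight = tokens[index + 1] if index + 1 < len(tokens) else None
        # A section without an explicit number gets the default weight of 1.
        parts.append((text, float(weight) if weight is not None else 1.0))
    return parts


print(parse_weights("cool::2 cat"))
# [('cool', 2.0), ('cat', 1.0)]
print(parse_weights("hippopotamus:: green jacket::2 water::-1"))
# [('hippopotamus', 1.0), ('green jacket', 2.0), ('water', -1.0)]
```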

AI-generated image of hippopotamus in water wearing a green jacket and a scarf

AI-generated image of hippopotamus on grey background, studio light, wearing fashionable jackets and flowers

AI-generated image of fashion characters wearing flowers and costumes

Quality boosters

Instead of prompting Midjourney to give you a “very very very very very very beautiful image”, it’s easier to use some tried and tested quality booster terms.

Adding specific words can significantly impact the quality of your generated images. The reason these words can provide you with a better result is simply that people uploading images online tend to add certain terms in their descriptions. Since the internet is Midjourney’s source for learning how to match words with image properties, any words added to your prompt influence the images that are referenced.

The following terms won’t give you a higher resolution image, but you can add one or more of these words as a suffix to potentially get some crisper results.

High resolution, 2K, 4K, 8K, clear, good lighting, detailed, extremely detailed, sharp focus, intricate, beautiful, realistic+++, high quality, hyper-detailed, masterpiece, best quality, art station, stunning

To control the number of colours in your images you can also try: monochromatic, bichromatic, trichromatic, complementary colors

In addition to different parameters, there are numerous techniques and styles for enhanced control. Let's explore a few!

Shot types

Different types of shots can be used to control the kind of photo you want. Here are some examples of different photography shot styles.

a woman, surrounded by a hectic urban environment during midday, a film still directed by Wes Anderson

Two women, surrounded by a hectic urban environment during midday, gold, orange and yellow tones

Extreme close-up (left), Close-up (right)

Women, surrounded by a hectic urban environment, wearing retro clothes

Medium full shot (left), Full-body shot (right)

Stock photos

While hiring a professional photographer to capture your authentic photos remains the best and most ethical option, if resources are limited, Midjourney can be a quick and easy copyright-safe(ish?) alternative to stock photos, providing generic mood-setting images.

Drone shot photograph looking down at buildings in New York City, morning, hyperrealistic

Side angle shot of a man in a cafe reading a book, wearing sunglasses

Editorial Photography

“Prize-winning photos win prizes not because they're pretty, but because they tell stories about the human experience.” The following prompt, taken directly from aituts.com, yields a truly mind-blowing outcome.

photograph from 2018s China: a young couple in their 20s, dressed in white, stands in their home, displaying a range of emotions including laughter and tears. Behind them is a backdrop of a cluttered living space filled with white plastic trash bags and torn white paper rolls. Captured with a film camera, Fujifilm, and Kodak rolls, the image conveys a strong cinematic and grainy texture. This artwork uniquely documents the complex emotions and living conditions faced by the young people of that era. --ar 4:3

Photograph from 2018s China showing a young couple in their 20s, dressed in white, standing in their home filled with white plastic trash bags and torn white paper rolls.

By using this prompt as a template and varying the details, you’re bound to get highly impressive photos with lots of emotion.

photograph from 1990s Scotland: an old bartender and his wife in their 60s, stands outside their abandoned pub, displaying a range of emotions including sadness and tears. It's raining and the sky is grey. Behind them is a backdrop of a small, empty, remote village where everyone has moved out. Captured with a film camera, Fujifilm, and Kodak rolls, the image conveys a strong cinematic and grainy texture. This artwork uniquely documents the complex emotions and living conditions faced by people being left behind. --ar 4:3 --s 700

Photograph from 1990s Scotland showing an old bartender and his wife in their 60s

To conclude our learnings, let's dive into an example that takes us through the iterative thought process of generating a new image.

Midjourney has introduced some new features that make it easier to enhance your generated images and retouch the parts you don’t like. By expanding your image borders and allowing Midjourney to envision what lies beyond the frame, you can even generate higher-resolution pictures than what's normally achievable.

In this example I started by imagining an old woman in colorful clothes looking down at hotel lobby from a balcony, a film still directed by Wes Anderson --s 750 --ar 16:9

Old woman in colorful clothes looking down at colorful hotel lobby

Zooming

To zoom out, simply click on one of the zoom buttons located beneath your upscaled image. The standard options provide zoom levels of 1.5x or 2x. By going for a custom zoom instead, you gain complete control over both the scale and the content of the surrounding area. In this case, I felt like adding a couple of weird plants around the woman.

Remove the old prompt to add something new

Screenshot from Midjourney to show how to zoom out, add plants and more details

Let’s see what it looks like with plants

AI-Generated image of an old lady in a colourful room full of plants

The result looks interesting. But now I feel like the armchairs and some details on the roof look a bit off.

Vary region

To regenerate an area that you don’t feel happy with, you can click the “🖌️ Vary (Region)” button and easily highlight what should be reimagined.

Screenshot from Midjourney showing how to increase background around a subject

The highlighted areas will be regenerated while the rest will remain

An old lady looking into a colourful room, surrounded by plants

The result looks better, I think. The roof is still glitchy, but it's an improvement.

Panning

From the chosen image I wanted to create an ultra-wide version, without losing resolution. To do this you can use the “Pan” buttons in your chosen direction. This extends the image with a new frame next to it while maintaining the resolution of your original. Doing this twice, once in each direction, keeps the subject centered in the picture.

Panning left, followed by panning right on the result

I like how the final result turned out even though it’s quite different from the initial photo. The iterative process might not get you exactly what you had in mind at first, but it can, on the other hand, take you places you would never have imagined!

Ultra wide image of old lady in a jungle inside a big and colourful hotel lobby

Conclusion

AI-generated images are getting better than ever. Rapidly. And they’re here to stay whether we like it or not. Upcoming updates of Midjourney and DALL-E 3 are reportedly able to handle accurate text in images, which could reduce the need for manual retouching. The speed of production and the quality of AI-generated visuals raise intriguing questions about the future of human-generated content and about which jobs will remain relevant.

Looking ahead, the “content flippening” — an event horizon when the majority of content we consume is primarily AI-generated rather than primarily human-produced — is starting to feel unavoidable.

Graph from Latent Space Podcast: Heralds of the AI Content Flippening

As AI takes on an increasingly central role in content creation, we need to ask ourselves about its impact on our lives and well-being. Do we want to participate in this transformation? How does it affect diversity? How does it address copyright concerns? Ownership becomes a blurred concept. Can we rely on the authenticity of online images? And how do we know if something was written, or even proofread by a human?

While the progress in generative AI is undeniably exciting, it also brings an element of unease. I think it's beneficial to familiarise oneself with some of the available generative AI tools, to be aware of their capabilities and to know what to expect. However, although many AI enthusiasts claim that these tools can make life a breeze by solving all your problems, there is, at least in the current landscape, still quite a learning curve standing in the way of effortlessly getting the results you want.
