Joe Regenstein, CPA, FPAC

Using Generative AI To Create Animal Aviators

I’m excited to walk you through my latest project, “Animal Aviators,” where the charm of nature meets the rugged flair of aviation, all brought to life through the power of generative AI.

This project started with a simple goal: creating a unique profile picture for a website instead of putting my mug on there. But as I delved deeper into the world of generative art and AI technologies like Midjourney, what began as a single profile picture morphed into a series of digital stickers, each featuring a hip animal decked out in aviation or military-themed gear against a scenic backdrop that matches its habitat. From a lion with a flowing mane, aviator sunglasses, and a leather bomber jacket in the savannah to a cool polar bear in the Arctic, these digital tools made it possible to create a fun set of animals.

In this blog post, I’ll share the step-by-step process behind creating these animals, with insights into the technologies that made it possible. I’ll also share some essential tips for blending creativity with cutting-edge tech. There is a link at the end if you want to see all of the aviators. Ready to take flight?


Generative AI

By now, you’re probably tired of hearing about ChatGPT and how these large language models will transform work as we know it. Generative AI is a type of artificial intelligence that can create new, original content, such as images or text, based on the data it’s been trained on. Programs such as Midjourney, Stable Diffusion, and DALL-E are machine-learning models that generate new images. These models use neural networks to learn and mimic the styles, patterns, and forms in their training data. Users input a prompt to guide the AI toward a specific output, such as an image, a character design, or even a complete scene. The prompt is written in plain language and can describe almost anything: the subject, art style, lighting, color scheme, and point of view. To learn more about prompt engineering and see examples, please look at OpenArt’s Prompt Book. Some platforms can also edit or add to a user-provided image. These technologies are changing digital art, content creation, and even e-commerce by offering creative capabilities that can be fine-tuned and customized for various applications.


The Creative Spark

I wanted to create a profile image I could use on various websites without plastering my image everywhere. I used Midjourney to create a header for my website inspired by our trip to Yellowstone and the surrounding areas. I’ve also wasted many an afternoon taking various prompts from other users and mixing them to see what came out.

Midjourney is accessed through Discord, a chat platform similar to Slack, where you can connect to the Midjourney server and chat with the Midjourney bot. There are many commands for interacting with the bot, but typing /imagine followed by a prompt sends the request to the server, and the bot begins creating images. Specific parameters can be used to set the aspect ratio (by default, images are square), how much creativity the bot should use, or the quality of the image. More commands and parameters are available in the Midjourney documentation. By default, Midjourney creates four images on a grid and then allows the user to resubmit the prompt, request variations of a specific image, or upscale an image so it can be used. This won’t be a Midjourney tutorial specifically; here is a great primer to learn more about Midjourney.

/imagine logo to be used as a profile picture. Hip tiger with dark aviator sunglasses looking off into the distance photo realistic.

Users can upscale or request variations of an image.

I was able to request variations on the image I liked by selecting V3 and getting a new slate of images based on the selected image.

Variations can be selected for multiple images.

I ultimately upscaled the bottom left image using U3, which created an image I used as a profile picture.

Once upscaled, there are further options.

There are additional options to get further variations or expand an image in various directions. These adjustments probably work better with scenic images where the AI can add more to the image. The Midjourney documentation has more detail on what can be done with an image.

It wasn’t until the following weekend that I decided to create additional animals. To give them a similar aesthetic, I needed the seed from the original job. To get the job details, you can react to the image with the envelope emoji ✉️, and the details are sent via direct message. The seed is then passed to the bot by adding the seed parameter to the end of the prompt, e.g. --seed 12345 (there are two dashes). Using the same seed provides continuity between requests. I tried using the same prompt for the lion with additional instructions for the clothes and background but couldn’t reproduce the border style around the image despite using the same seed.
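As a rough illustration of how the seed and other parameters sit at the end of a prompt, here is a minimal Python sketch that only assembles the text you would paste into Discord. The helper name build_imagine_command is hypothetical, the lion prompt is shortened for illustration, and the --ar value is an arbitrary choice rather than a setting used for this project. It does not talk to Midjourney in any way.

```python
def build_imagine_command(prompt: str, **params: object) -> str:
    """Return a /imagine command with `--name value` parameters appended.

    This only formats text; it is not a Midjourney API call.
    """
    parts = ["/imagine", prompt.strip()]
    for name, value in params.items():
        # Parameters go after the prompt text, each introduced by two dashes.
        parts.append(f"--{name} {value}")
    return " ".join(parts)


# Example seed value; the real one comes from the job details
# sent by the bot after reacting with the envelope emoji.
SHARED_SEED = 12345

# Shortened version of the lion prompt, for illustration only.
lion = build_imagine_command(
    "A headshot of a hip lion with a brown leather bomber jacket. Photo realistic",
    seed=SHARED_SEED,
)

# Same seed for the background; the aspect ratio here is an assumed example.
savanna = build_imagine_command(
    "Photo realistic scenic landscape of savanna with mountains and trees in the distance",
    seed=SHARED_SEED,
    ar="16:9",
)

print(lion)
print(savanna)
```

Keeping the seed in one place makes it harder to mistype when it is reused across several prompts.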

/imagine Logo to be used as a profile picture with a white background with the headshot centered. A headshot of a hip lion with a brown leather bomber jacket in dark aviator-style sunglasses looking off to the side, slightly into the distance, with a white background. Photo realistic

Great images but no border.

After several failed attempts to get the border, I took my favorite image and tried reproducing the border myself. I used Pixelmator, but any photo editing app, such as Adobe Photoshop, could be used. First, I needed to select the image’s subject and remove the background to make way for the scenic habitat. This was tricky; the quick select tool did really well when the background was solid or there was a decent outline around the animal. Patience was required when working with fine detail, such as the fur. Let me just say there was a lot of Ctrl+Z to undo mistakes.

Select tools allow you to keep only the subject.

I used the same seed to create the background to maintain the same look and feel as the animals.

/imagine Photo realistic scenic landscape of savanna with mountains and trees in the distance

Upscaled image 2.

You can see it coming together when layering the animal on top of the background.

Two layers, the background and animal.

Now I had a new problem: how do I take these items and get them to look similar to the tiger image, with its ellipse and detailed border? I needed to create a clipping mask that could be used repeatedly. The mask works by letting the image through the light portions and blocking anything dark. This is best accomplished with shapes that are 100% black or white. If other shades are used, the mask lets some of the image through, producing a fading or ghost effect. Using the tiger as a guide, I created a white ellipse covering the image’s main portion and a circle shape for the border. The AI-generated image wasn’t a perfect circle, so it took some adjustments to get the same slightly oblong shape. In a separate file, I copied the white shapes onto a black background and saved it for future use. This saved a lot of time; I could move or resize the image and get a refined look without deleting anything, and each image came out the exact same shape and size. More on clipping masks can be found in the Pixelmator Pro documentation.

Then, using Pixelmator’s clipping mask function, I applied the mask to both the background and the animal (a mask gets applied to one layer, and we have two layers). To make the lion’s mane appear to flow over the border, I edited the clipping mask attached to the animal layer by erasing some of the black border. This gives the image some depth and makes the lion appear to be coming out of the frame.

When attached to a layer, the portion of the image in the white section is opaque, and the black is hidden.

Anything over the black section is hidden, including some of the lion’s mane.

Editing the mask allows the mane to flow over the border.
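The mask behavior described above can be sketched in a few lines of Python. This is not how Pixelmator implements clipping masks; it is a simplified model that assumes a mask is a grid of brightness values from 0 (black) to 255 (white) and that a layer’s opacity is scaled by the mask value at each pixel. The function names and the tiny 9x7 grid are made up for illustration.

```python
def ellipse_mask(width, height, rx, ry):
    """Build a mask as rows of values: 255 (white) inside a centred ellipse, 0 (black) outside."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    return [
        [255 if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1 else 0
         for x in range(width)]
        for y in range(height)
    ]


def apply_mask(alpha_rows, mask_rows):
    """Scale each pixel's opacity (0-255) by the mask: white keeps it, black hides it, gray fades it."""
    return [
        [a * m // 255 for a, m in zip(alpha_row, mask_row)]
        for alpha_row, mask_row in zip(alpha_rows, mask_rows)
    ]


# A slightly oblong ellipse (wider than tall) on a 9x7 grid.
mask = ellipse_mask(width=9, height=7, rx=4, ry=3)
layer_alpha = [[255] * 9 for _ in range(7)]  # a fully opaque layer

masked = apply_mask(layer_alpha, mask)
print(masked[3][4])  # 255: the centre is inside the white ellipse, so it stays visible
print(masked[0][0])  # 0: the corner sits on black, so it is hidden

# "Erasing" part of the black area on one layer's copy of the mask lets
# that layer show through there, like the mane flowing over the border.
animal_mask = [row[:] for row in mask]
animal_mask[0][0] = 255
print(apply_mask(layer_alpha, animal_mask)[0][0])  # 255: now visible

# A gray mask value lets only part of the layer through (the ghost effect).
print(apply_mask([[255]], [[128]]))  # [[128]]
```

Because the animal layer gets its own copy of the mask, editing it leaves the background layer clipped to the original ellipse.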

With this workflow, I could pick any animal, choose an aesthetic such as a flight suit and sunglasses, and a separate background based on the animal’s habitat. The tricky part was getting enough of the animal to fill the ellipse. In some cases, I needed to use the cloning tool provided by Pixelmator to extend the clothing and blend it in.

Like any generative AI tool, Midjourney makes errors, known in the industry as hallucinations. In ChatGPT, a hallucination could be a source citation that doesn’t exist or simply something inaccurate that it made up. In the animal pictures, the hallucinations were largely sunglasses or goggles not covering the eyes or strange anatomy, such as an elephant trunk with a tusk growing out of it. To get around this, I just requested variations and tweaked the prompt to provide more specific instructions.

Elephant with misaligned goggles and a tusk in the trunk.

For some animals, it was necessary to use touch-up tools to clean up random artifacts, but these were minor edits, such as removing the dark line on the beak.

Dark line on the beak.

Clone tool painted over the line.

It is hard to tell, but the gray checkerboard portion of the image is transparent. Exporting the image as a PNG preserves that transparency. Here, we can see how it looks on a white background.

Raw image

How it would appear on a website with a white background.
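To show why a transparent PNG blends into a white page, here is a small sketch of standard "over" alpha blending for a single pixel. It assumes 8-bit RGBA values with straight (non-premultiplied) alpha, which is how PNG stores transparency; the function name and sample color are made up for illustration.

```python
def over_background(pixel_rgba, background_rgb=(255, 255, 255)):
    """Blend one RGBA pixel over an opaque background color and return the visible RGB."""
    r, g, b, a = pixel_rgba
    alpha = a / 255  # 0.0 is fully transparent, 1.0 is fully opaque
    return tuple(
        round(alpha * fg + (1 - alpha) * bg)
        for fg, bg in zip((r, g, b), background_rgb)
    )


fur = (200, 120, 40)  # an arbitrary orange-brown

print(over_background((*fur, 255)))  # (200, 120, 40): opaque pixels keep their color
print(over_background((*fur, 0)))    # (255, 255, 255): transparent pixels show the white page
print(over_background((*fur, 128)))  # partly transparent pixels land somewhere in between
```

The same calculation applies when the animal layer is stacked on the scenic background; only the background color changes from pixel to pixel.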

There were some minor annoyances. For some reason, I couldn’t get the wolf or African painted dog to look to the side despite several edits to the prompt. They look like driver’s license pictures, but they still turned out great. Sometimes, the animal looked to the left, but this was easy to flip in the image editor. The rhino started out looking more like a cross between a kangaroo and a camel, but fortunately, with more variations of the prompt, it worked out.

Original result.

Final variation.

Animal Aviators was a fun project all brought to life by generative AI technologies like Midjourney. Originally aiming to create a unique profile picture, the project evolved into a series featuring various animals in an aviation-inspired look. The experience has demonstrated the possibilities that generative AI opens up and the challenges of getting from an idea to a finished, polished piece of art.


If you want to see the Animal Aviators, they are viewable on my son’s Etsy shop. We set it up to sell reprints of his watercolor paintings. Let me know which one is your favorite.


Midjourney: https://www.midjourney.com/

Pixelmator Pro: https://www.pixelmator.com/pro/

Animal Aviators: https://www.etsy.com/listing/1548833853/animal-aviator-digital-stickers


Prompts (they evolve from the first to the last):

Tiger

/imagine logo to be used as a profile picture. Hip tiger with dark aviator sunglasses looking off into the distance photo realistic

Lion

/imagine Logo to be used as a profile picture with a white background with only a circle of the image exposed. A headshot of a hip lion in dark aviator-style sunglasses looking off into the distance with white background. Photo realistic

Panda

/imagine Logo to be used as a profile picture with a white background. A hip black and white panda bear in dark aviator-style sunglasses looking off into the distance slightly to the side. Photo realistic

Elephant

/imagine Logo to be used as a profile picture with a white background with the headshot centered. A headshot of a hip elephant with dark-rimmed aviator-style sunglasses with silver lenses wearing an gray flightsuit. The subject is looking off to side and up slightly as if looking over the horizon. The background should be white and only show the subject. Photo realistic

Gorilla

/imagine Logo to be used as a profile picture with a white background with the headshot centered. A headshot of a hip silverback gorilla dark-rimmed aviator-style sunglasses with silver lenses wearing an olive drab jumpsuit. The gorilla is looking off to the right side and up slightly as if looking over the horizon. The background should be white and only show the gorilla. Photo realistic

Grizzly Bear

/imagine Logo to be used as a profile picture with a white background with the headshot centered. A headshot of a hip grizzly bear with dark-rimmed aviator-style sunglasses with reflective lenses wearing an flightsuit. The subject is looking off to side and up slightly as if looking over the horizon. The background should be white and only show the subject. Photo realistic

Polar Bear

/imagine Logo to be used with a white background, centered. A headshot of a hip polar bear with dark-rimmed round sunglasses with black lenses wearing military snow gear. The subject looks off to the side and up slightly as if looking over the horizon. Photo realistic

Ferret (for my daughter)

/imagine Logo to be used with a white background, centered. A headshot of a hip ferret with dark-rimmed round sunglasses with black lenses wearing brown camo gear. The subject looks off to the side and up slightly as if looking over the horizon. Photo realistic

Rhino

/imagine Logo to be used with a white background, centered. A headshot of a menacing hip black rhinoceros with a horn. Dark snow goggles with black lenses wearing brown camo gear. The subject looks off to the side and up slightly as if looking over the horizon. Photo realistic

Red Panda

/imagine Logo to be used with a white background, centered. Rounded. A headshot of a hip red panda. Aviator sunglasses with black lenses wearing green camouflage gear. The subject looks off to the side and up slightly as if looking over the horizon. Photo realistic

Cheetah

/imagine Logo to be used with a white background, centered. The upper body of a hip cool cheetah. Aviator sunglasses with black lenses wearing a pilot jumpsuit. The subject looks off to the side and up slightly as if looking over the horizon. No background. Photo realistic

African Painted Dog

/imagine Logo to be used with a white background, centered. The upper body of a hip cool African painted dog. Aviator sunglasses with black lenses wearing a pilot jumpsuit. The subject looks off to the side and up slightly as if looking over the horizon. No background. Photo realistic

Wolf

/imagine Logo to be used with a white background, centered. The upper body of a hip cool grey wolf. Aviator sunglasses with black lenses wearing a pilot jumpsuit. The subject looks off to the side and up slightly as if looking over the horizon. No background. Photo realistic

Fox

/imagine Logo to be used with a white background, centered. The upper body of a hip cool red fox Aviator sunglasses with black lenses wearing a pilot jumpsuit. The subject looks off to the side and up slightly as if looking over the horizon. No background. Photo realistic

Giraffe

/imagine Logo to be used with a white background, centered. The upper body of a hip cool giraffe with a white aviator scarf blowing in the wind. Aviator sunglasses with black lenses with a bomber jacket. The subject looks off to the side and up slightly as if looking over the horizon. No background. Photo realistic

Bald Eagle

/imagine Logo to be used with a white background, centered. The upper body of a hip cool bald eagle with a white aviator scarf around its neck. Round sunglasses with reflective gold lenses with a bomber jacket. The subject looks off to the side and up slightly as if looking over the horizon. No background. Photo realistic
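The later prompts settle into a repeatable pattern where only the animal and a few details change. As a sketch of that pattern, the template below is reverse-engineered from the cheetah, African painted dog, and wolf prompts above; the function name and its defaults are my own framing of those prompts, not something Midjourney provides.

```python
TEMPLATE = (
    "/imagine Logo to be used with a white background, centered. "
    "The upper body of a hip cool {animal}. "
    "{eyewear} wearing {outfit}. "
    "The subject looks off to the side and up slightly as if looking over the horizon. "
    "No background. Photo realistic"
)


def aviator_prompt(
    animal: str,
    eyewear: str = "Aviator sunglasses with black lenses",
    outfit: str = "a pilot jumpsuit",
) -> str:
    """Fill the shared prompt template for one animal."""
    return TEMPLATE.format(animal=animal, eyewear=eyewear, outfit=outfit)


# Reproduces the cheetah, African painted dog, and wolf prompts listed above.
for animal in ("cheetah", "African painted dog", "grey wolf"):
    print(aviator_prompt(animal))
    print()
```

Swapping the eyewear or outfit arguments gives a starting point for a new animal, though, as the earlier prompts show, each one usually needed its own tweaks.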


#AI #Generative AI #Midjourney