A FAQ Guide on Custom AI Art Models

Have you recently gotten sucked down the rabbit hole of AI art generation? Perhaps you heard that Stable Diffusion is capable of being custom trained but you didn’t understand how. Maybe you’ve seen people talking about different models of Stable Diffusion but didn’t know what they meant.

In this guide, I’ll teach you what Stable Diffusion models are, how they are made, and where to find them. Don’t forget your pocket watch, because we’re going to Wonderland.

Quick Disclaimer

In this guide, I will try to explain Stable Diffusion for beginners who have little to no knowledge of machine learning. To do that, I will use some analogies and simplifications. These descriptions are meant to help people who aren’t data scientists understand the concepts, but they won’t necessarily be technically accurate. This is not meant as a scientific explanation of how Latent Diffusion models or PyTorch libraries work on a technical level.

How Does Stable Diffusion Work?

In simple terms, Stable Diffusion is software that was trained to scramble images and then accurately unscramble them. By doing this, the computer doing the training learned how to make images on its own. What it learned was then saved into a file that is, essentially, a set of instructions for doing it again.
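If you’re curious what “scramble and un-scramble” could look like in code, here is a toy Python sketch. It is only an illustration of the idea, not how Stable Diffusion is actually implemented: the `scramble` and `unscramble` functions, the 4-pixel “image”, and the simple 50/50 blend are all made up for this example. The real thing uses a more involved noise schedule and a neural network that has to guess the noise.

```python
import random

def scramble(pixels, noise, noise_level):
    """Blend an image with noise. noise_level=0 keeps the image, values near 1 are mostly noise."""
    return [(1 - noise_level) * p + noise_level * n for p, n in zip(pixels, noise)]

def unscramble(noisy, predicted_noise, noise_level):
    """Undo the blend, given a guess at the noise that was mixed in."""
    return [(x - noise_level * n) / (1 - noise_level) for x, n in zip(noisy, predicted_noise)]

random.seed(0)
image = [0.1, 0.5, 0.9, 0.3]                  # a pretend 4-pixel image
noise = [random.gauss(0, 1) for _ in image]   # random static

noisy = scramble(image, noise, noise_level=0.5)

# A trained model would have to *predict* the noise.
# Here we cheat and hand back the real noise, so the image is recovered exactly.
restored = unscramble(noisy, noise, noise_level=0.5)
print([round(v, 3) for v in restored])
```

The point of training is to get good at that guessing step. The better the guess at the noise, the closer the un-scrambled result is to a real image.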

What is a Model in Stable Diffusion?

A model is therefore a file that tells Stable Diffusion how to make images using the knowledge previously learned by another machine. The model contains the results of a previous training session done on a specific set of data (in this case, a collection of images). Loading a model into Stable Diffusion lets the program use that pre-trained knowledge.

Stable Diffusion on its own cannot create images. It’s a generator, but it needs access to a model (any model, really) so it knows what to make.

It’s kind of like a handheld GameBoy. The GameBoy doesn’t have any games of its own. Instead, you plug in a cartridge that contains a game in order to play it. In this situation, Stable Diffusion is like the GameBoy, and the models are like the different games you can plug in.

*a gameboy, on a table, detailed, realistic, 35mm lens*

So how do you make your own models for Stable Diffusion?

What is a Stable Diffusion Checkpoint or CKPT File?

A checkpoint file is just the technical name for a model file. The file extension for these files is CKPT. So for our purposes, a checkpoint means the same thing as a model. Not all models come in the CKPT file format, but we’ll get to that in the section called “Warnings Before You Download Any Models”.

How are Stable Diffusion Models Made?

Models are made by training a computer on a set of images, forcing it to learn what the images look like and how to replicate or emulate them.

The base models were trained on billions of images that were crawled from the internet. It took 150,000 GPU hours (really, 24 days on 256 GPUs) and $600,000 to do it.
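If you want to see how those numbers fit together, here is a quick back-of-the-envelope check in Python. It only uses the figures quoted above. The cost per GPU hour is simply what those figures imply when you divide them, not an official price.

```python
gpus = 256
days = 24
hours_per_day = 24

# Every GPU runs around the clock for the whole training period
gpu_hours = gpus * days * hours_per_day
print(gpu_hours)  # 147456, which rounds to the quoted 150,000

# Implied cost per GPU hour, using the rounded figures from the text
cost_per_gpu_hour = 600_000 / 150_000
print(cost_per_gpu_hour)  # 4.0 (dollars)
```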

Thankfully, you won’t need that much time or money to make your own custom Stable Diffusion model. That’s because we don’t have to train our own models completely from scratch. Instead, user-made models are trained on top of the base version.

The TL;DR Version of Model Training

In short, you would do the following to train a new model for Stable Diffusion:

Collect a bunch of images with a common theme (an artist’s style, a subject, an art medium, etc)
Add them to a model that already exists (usually, CompVis’s base models like V1.4)
Train that current model to also learn from your set of images and slightly modify what it already knows
Save that training as a new model checkpoint file
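The steps above can be sketched as a toy Python example. This is a deliberately tiny stand-in, not real Stable Diffusion training: the “model” is a single made-up number called `scale`, the “images” are a few number pairs, and the “checkpoint” is just a JSON string. The `fine_tune` function and every name in it are hypothetical. What it does show is the shape of the process: start from what already exists, nudge it toward your data, and save the result as something new.

```python
import json

def fine_tune(weights, examples, learning_rate=0.01, steps=5):
    """Nudge existing weights toward new examples with plain gradient descent."""
    w = dict(weights)  # work on a copy, so the base model is left untouched
    for _ in range(steps):
        for x, y in examples:
            error = w["scale"] * x - y                # how far off is the current guess?
            w["scale"] -= learning_rate * error * x   # adjust a little in the right direction
    return w

base_model = {"scale": 2.0}                           # stands in for a base checkpoint
my_examples = [(1.0, 2.5), (2.0, 5.0), (3.0, 7.5)]    # stands in for your image set

custom_model = fine_tune(base_model, my_examples)

# "Save" the result as a new checkpoint (a JSON string here, not a real CKPT file)
checkpoint = json.dumps(custom_model)
print(base_model["scale"], round(custom_model["scale"], 3))
```

After a handful of passes, the custom value has moved away from the base value of 2.0 toward the 2.5 that the new examples point to, while the base model itself is unchanged.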

This process is sometimes referred to as “fine tuning”, because you are tweaking a model that already exists rather than restarting the whole learning process from the ground up.

So, in this way, you can also think of models as being like video game mods. With computer games, users can create their own custom content (like new character designs or levels) and add it on top of the base game. Likewise, Stable Diffusion users can create their own modified versions of the base model. Those modded versions can then be loaded into Stable Diffusion and run just like the base model.

Where Can I Find Models?

Now we’ve gotten to the fun part. Anyone who makes a custom model can share it on the internet, and many of them do. Some people charge for access to their models, but many users offer them for free.

And there is a range of websites dedicated just to hosting these massive collections of user-made models! The most popular online repositories are:

Hugging Face (this is a general hosting site for machine learning models, run by an independent company; Stability AI and other developers publish the official Stable Diffusion models there)
Civitai (this is a user-submission site where people can share models and review those made by others)

I personally prefer getting models on Civitai, because the site has an excellent tagging system and browsing features. That comes in handy if you don’t know exactly what model you want.

Warnings Before You Download Any Models!

But before you go downloading a hundred different models to try out, take heed! There are some warnings you need to be aware of:

Model files can be very large in size, anywhere from 2 GB to 8 GB. You could easily fill up a hard drive with just model files. If your computer is low on disk space, then go easy on the downloads…or buy an extra hard drive to store them on.
Checkpoint files can be dangerous, because loading them can execute code on your computer. This means it’s possible for an intentionally malicious model to install malware on your computer when you load it. But there is a simple way to avoid this issue: always download the SafeTensors format of the model, rather than the PickleTensor (CKPT) format. For more information, see my guide on Pickle Safety.
You need Stable Diffusion on your computer to use custom models. Models cannot be loaded into any online service (that I know of, at least). If you haven’t installed Stable Diffusion on your computer yet, check out my guide here.
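To see why the checkpoint warning above matters, here is a small and harmless Python demonstration of the underlying problem. CKPT files are typically saved with Python’s pickle format, and pickle lets a file say “call this function when you load me”. The `Payload` class below is a made-up example that only prints a message, but a malicious file could ask for something far worse.

```python
import pickle

class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call print(...) with this text".
        # An attacker would put a dangerous function here instead of print.
        return (print, ("code ran during load",))

data = pickle.dumps(Payload())   # the bytes a booby-trapped file could contain
result = pickle.loads(data)      # simply loading the bytes runs the call above
print(result)                    # None: we never even got a Payload object back
```

Notice that nothing was “run” on purpose. The message appears just because the data was loaded. The SafeTensors format avoids this by storing only the raw model data, with no instructions to execute.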

What are the Best Models to Use?

The answer here is subjective. We all have different reasons for using Stable Diffusion and we want to make different kinds of images. So these recommendations are based on a mixture of my experiences and reviews from other users.

Rather than add a massive list of models to this article, I have curated links to several model lists I have already posted.

Most Popular/Downloaded Models
The Best Models for Photo-realistic Images and Faces

Conclusion

Try not to go overboard when browsing for custom models. It’s easy to feel like a kid in a candy store and just go wild with downloading. But do feel free to test as many as you need until you find the style of output that you’re looking for. And have fun!

Thanks for reading. If you found this guide helpful, here are a few more articles you may like: