AI image generation has taken the world by storm in the past few months. With these new AI systems, a user can type a text description, and the AI will create an image based on the prompt! It's incredible what people have created with these models. But the choice of model can lead to very different results. Dall E 2 by OpenAI and Dall E mini by Craiyon are two models that people often search for and confuse, because the names are so similar.
Are Dall E 2 and Dall E mini similar models?
The difference between Dall E 2 and Dall E mini:
Although they share a similar name, these two AI image generators are not the same. OpenAI developed Dall E 2, building on its previous work on GPT-3; it uses a diffusion model and has many more parameters. Dall E mini is a research project developed by Boris Dayma and inspired by the Dall E research paper. It's open-source and is supported by Google TPU Research Cloud funding.
More importantly, the results you get from each model are wildly different. Dall E 2 generates higher-fidelity images that accurately reflect the prompt given, outperforming Dall E mini's results. However, Dall E mini is an impressive accomplishment by an individual with limited funding and resources.
Sidenote: Our team at Freeway ML are big fans of Boris and appreciative of his contributions to the open source community.
To eliminate the naming confusion going forward, Dall E mini has rebranded to Craiyon. The name is a fun play on drawing and AI, but it is hard for folks to remember and spell correctly. Looking at Google search results, you'll see many misspellings of the name: Crayaion, Craion, Crayon AI.
AI Image Generation Results - Dall E 2 vs Dall E mini
Although there are content limitations, Dall E 2 results are amazing, especially if you know how to write prompts. It's the state of the art in AI image generation today (although Freeway ML has a formula that outputs Dall E 2-like quality with open-source models).
Let's dive into some prompt examples so you can be the judge of which one is better.
Prompt with Faces
The results are consistent: Dall E 2 generates higher-fidelity images in every case. Dall E mini has a good understanding of the prompt and, in most cases, provides an image that matches it, but it lacks the image fidelity of Dall E 2.
What you need to know about Dall E 2
In January 2021, OpenAI released Dall E, the first AI image generator that creates images from text captions expressible in natural language. It was a 12-billion parameter version of the GPT-3 Generative Pre-Trained Transformer (GPT) model, much smaller than the full 175-billion parameter GPT-3, and it built on OpenAI's earlier Image GPT work. The model was trained on text and image pairs, that is, images from the internet together with their descriptive captions.
The Dall E model was updated to Dall E 2 in April of 2022 with the ability to create more realistic images, support for more prompt styles, and in-painting. Dall E 2 uses a diffusion model with CLIP image embeddings. Diffusion models are a more recent approach in which the image starts as random dots, like the static on a TV screen, and is gradually refined, step by step, towards an image that matches the text prompt.
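To make the "static to image" idea concrete, here is a minimal toy sketch of the reverse (denoising) loop used by DDPM-style diffusion models, written in plain Python. It is not Dall E 2's code, which has not been released. The function names, the linear noise schedule, and the tiny four-value "image" are all assumptions made for illustration. Most importantly, the `oracle_noise` function cheats by looking at the target image; in a real system that role is played by a trained neural network conditioned on the text prompt.

```python
import math
import random


def make_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an illustrative choice): betas, alphas, cumulative products."""
    betas = [beta_start + (beta_end - beta_start) * t / (steps - 1) for t in range(steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars = []
    running = 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def oracle_noise(x_t, target, alpha_bar):
    """Stand-in for the trained network: computes the exact noise in x_t.

    This is only possible because the toy already knows the target image.
    """
    scale = math.sqrt(1.0 - alpha_bar)
    return [(xt - math.sqrt(alpha_bar) * x0) / scale for xt, x0 in zip(x_t, target)]


def denoise(target, steps=1000, seed=0):
    """Start from pure Gaussian static and denoise step by step."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # the "TV static" starting point
    for t in reversed(range(steps)):
        eps = oracle_noise(x, target, alpha_bars[t])
        # DDPM update: remove a fraction of the predicted noise.
        coeff = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coeff * e) / math.sqrt(alphas[t]) for xi, e in zip(x, eps)]
        if t > 0:
            # Every step except the last adds back a little fresh noise.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


if __name__ == "__main__":
    toy_image = [0.0, 0.5, 1.0, -1.0]  # four "pixel" values standing in for an image
    print(denoise(toy_image))
```

Because the oracle always knows the true noise, this loop recovers the toy image almost exactly. A real diffusion model only has a learned estimate of the noise, which is why the output is a plausible new image rather than a copy of any training example.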
OpenAI has released access to Dall E 2 via a web UI where users can type in a prompt and get a result. On August 31st, OpenAI said more than 1 million users had been invited into the Beta program, including 3,000 working artists. Even after 1 million signups, there is still a waitlist for new users looking to access the web UI. You can join the waitlist here.
The code for Dall E 2 has not been released, and there isn't any API access, which makes it a challenge to build plug-ins or run larger jobs.
Dall E 2 also comes with content restrictions designed to protect users. These are a few items taken from their Content Policy:
- You must clearly indicate images are AI generated and attribute OpenAI when sharing.
- You cannot upload images of people. All uploads of realistic faces are prohibited, even when the face belongs to you or you have consent. (Note: this was changed on September 20th; you can now upload faces.)
- You cannot attempt to create images of public figures (including celebrities).
Dall E 2 is a credit-based system. It costs one credit per "generation," and the output is four 1024 x 1024 images. A new user gets 50 free credits in their first month of use and 15 free credits a month after that. If the user converts to a paid account, they can purchase 115 credits for $15.
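To put those numbers in perspective, the short Python sketch below works out the effective price of the paid plan: roughly 13 cents per generation, or a little over 3 cents per image. The figures come straight from the pricing described above; the constant and function names are just illustrative.

```python
# Figures from the pricing described in this article; names are illustrative.
PAID_PRICE_USD = 15.0
PAID_CREDITS = 115
IMAGES_PER_CREDIT = 4
FREE_CREDITS_FIRST_MONTH = 50
FREE_CREDITS_LATER_MONTHS = 15


def cost_per_credit(price_usd, credits):
    """Price of one generation (one credit) in US dollars."""
    return price_usd / credits


def cost_per_image(price_usd, credits, images_per_credit):
    """Price of a single image, given several images per credit."""
    return price_usd / (credits * images_per_credit)


def images_from_credits(credits, images_per_credit):
    """Total number of images a batch of credits produces."""
    return credits * images_per_credit


if __name__ == "__main__":
    per_credit = cost_per_credit(PAID_PRICE_USD, PAID_CREDITS)
    per_image = cost_per_image(PAID_PRICE_USD, PAID_CREDITS, IMAGES_PER_CREDIT)
    print(f"Cost per generation: ${per_credit:.2f}")
    print(f"Cost per image: ${per_image:.3f}")
    print("Free images, first month:",
          images_from_credits(FREE_CREDITS_FIRST_MONTH, IMAGES_PER_CREDIT))
    print("Free images, later months:",
          images_from_credits(FREE_CREDITS_LATER_MONTHS, IMAGES_PER_CREDIT))
```

By the same arithmetic, the free tier works out to 200 images in the first month and 60 images a month after that.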
What to know about Dall E mini (now Craiyon)
What is Dall E Mini (Craiyon)?
Craiyon, or Dall E mini, is an open-source text-to-image / AI Image Generation service created by Boris Dayma and inspired by OpenAI's Dall-e paper. Developers can download the model, and users can write prompts at https://www.craiyon.com/
Boris Dayma is a machine learning consultant and the creator of Dall E mini. After reading OpenAI's Dall E paper, Boris waited six months before starting to build the project; his catalyst to action was a Hugging Face and Google Hackathon, where he had access to TPUs. This event was the origin of Dall E mini.
The first version of Dall E mini was trained on 400 million image and text pairs, scraped from the web and curated by Boris Dayma. The technology of Dall E mini is similar to the original Dall E approach, described in OpenAI's paper "Zero-Shot Text-to-Image Generation": a transformer generates the image as a sequence of discrete image tokens instead of refining it through many diffusion steps, which can make it more computationally efficient than diffusion. However, recent advances are showing that diffusion models can create higher-quality AI-generated images.
The Dall E mini model became such a big success that Boris obtained the resources to train a new, larger model named Dall E mega. The Dall E mega model is powering Craiyon's site, but you'll still see it referred to as Dall E mini.
Dall E mini is open-source under an Apache 2.0 license, making it free to use. The code and model are accessible from a Hugging Face or GitHub repo. Even the model weights are available as a Weights and Biases artifact.
If you're not a developer and want to test out the model with a couple of prompts, you can try a cloud-hosted version at https://www.craiyon.com/
Unlike Dall e 2, there aren't any content restrictions for Dall e mini.
Craiyon / Dall E mini is ad-supported and free to use. A large community is using the service (in June 2022, the service supported ten requests per second), so it can take some time to generate results. This article goes into the infrastructure to support the web UI.
OpenAI's Dall E 2 and Craiyon's Dall E mini are very different AI image generation models, backed by very different organizations and business models. If you're interested in creating your own AI-generated images, consider signing up for Freeway ML's image generator. For the latest news in the space, subscribe to our newsletter.