+++
title = "A story about the logo"
description = "Automatically generated art"
date = "2023-01-19"

[taxonomies]
tags = ["stable-diffusion", "blog"]
+++
The existing logo
I have used the same image in my Gravatar for many years, but the thing is... I can't find the source image any more. It has long since been lost to the sands of time and format C: /q. If you are unfamiliar with it, it is a diamond-shaped yellow road sign with Tux, the Linux mascot, on it. I think I found the original in a Linux magazine in the early 2000s, but there is no telling any more. It's time for a fresh look, anyway.
My home setup
I like to work on my desktop when I am at home. I have a wonderful Dell UltraSharp U4320Q display, a still-decent i9-9900K processor, 64GB of DDR4 memory, and 2TB of NVMe primary storage. I recently used my Juno points (a self-help-like service that gives you money each month to do something for yourself) to buy a used NVIDIA RTX 3090 GPU. This is kind of a big card; it requires something like 400W of power, takes up two-and-a-half slots, and sports 24GB of GDDR6X memory. At the time of purchase, the RTX 40 series had only just been announced, so while not quite top-of-the-line, the 3090 was still the newest card available. It helped availability that Ethereum had recently switched from proof-of-work to a proof-of-stake model. This effectively made all of the at-home mining rigs obsolete, and there was a corresponding flood of GPUs on eBay.
With the new GPU, I can finally run games at the native 4K resolution and respectable framerates to boot.
Stable Diffusion
I was incredibly excited to hear about the announcement of DALL-E. The prospect of having a computer "just" make me an image was exciting. However, once I got an invite to the platform, I was not impressed with the pricing model or the ineptitude of the generator. Paying per generated image and then cycling through so many images to get one good render seemed like a waste. But then I discovered Stable Diffusion. It was magical. I made it work in Windows git-bash, and now I have a self-hosted generator running on my GPU that can make an image in something like 4 seconds.
Self-hosting also has the benefit of being able to load different models like Stable Diffusion 1.4, SD 2.0, Waifu Diffusion, etc. I also have a choice in what interface to run. The interface built into core Stable Diffusion is a CLI Python script. It works great, but only while you are sitting in front of a keyboard. For my current situation I have settled on AUTOMATIC1111/stable-diffusion-webui. It's got a lot of knobs to twist and options to change.
Final Result
For this particular image, I had to generate around 50 different images to get one of "tux the penguin" on a "four sided yellow road sign" without too much in the background or whatever strange characters it thinks are words.
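For anyone who would rather script that kind of batch than click through it, here is a rough sketch of queuing up 50 candidates against the web UI's HTTP API. It is not how I made the logo. It assumes the web UI was launched with its `--api` option and is listening on the default local address, and the prompt wording, negative prompt, base seed, step count, and file names are all made up for illustration. At something like 4 seconds per image, 50 candidates works out to roughly 200 seconds.

```python
import base64
import json
import urllib.request

# Assumption: stable-diffusion-webui was started with --api on the default local port.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payloads(prompt, negative_prompt, count, base_seed=1000):
    """Return one txt2img request body per candidate, each with a distinct seed."""
    return [
        {
            "prompt": prompt,
            "negative_prompt": negative_prompt,
            "seed": base_seed + i,  # a different seed gives a different image
            "steps": 20,            # illustrative value, not a tuned setting
            "width": 512,
            "height": 512,
        }
        for i in range(count)
    ]


def generate(payload):
    """POST one payload and return the first returned image as raw bytes."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: images come back as base64-encoded strings.
    return base64.b64decode(body["images"][0])


def save_candidates(payloads):
    """Generate every candidate and write it to a file named after its seed."""
    for payload in payloads:
        with open(f"candidate-{payload['seed']}.png", "wb") as handle:
            handle.write(generate(payload))


# Hypothetical prompt text; the exact wording I used is not recorded here.
payloads = build_payloads(
    "tux the penguin on a four sided yellow road sign",
    "text, letters, busy background",
    count=50,
)
```

Calling `save_candidates(payloads)` with the web UI running would write the 50 files to the current directory, and picking the one good render is still a job for a human.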