Generating SVG Images with ChatGPT
TL;DR I used ChatGPT, with varying levels of success, to generate SVG images. With enough patience and prompting, you can get it to draw basic shapes and simple scenes about as well as a five-year-old can. (Which, hey, let’s face it, is good enough a lot of the time.)
Background
I like the idea of using AIs to generate images. Go back in this blog and you’ll see lots of posts and casual references to imaging networks I like to build and experiment with.
With recent advances like DALL-E, Stable Diffusion, and Midjourney, it seems like the static image generation problem has been solved. Sure, there’s plenty of room for growth and improvement, but the basic problem of generating images from text is solved.
But what about vector graphics? As an app developer, I’m often in need of graphical resources. Raster images are neat, but they’re also so 1970s. They don’t scale well and are a terribly wasteful and inconvenient (for editing) encoding of graphical information.
Like everyone else, I’ve been using ChatGPT a lot recently, and I was curious to see if it could generate SVG code for images. I know it’s pretty good at HTML, so I figured it might be able to do SVG too. It’s certainly not something it was designed to do, but I thought it would be a fun experiment to conduct.
To my delight, it actually works. You’ll see below that the results are not always great or, um, err, good, but it’s still pretty cool and useful that it can do this at all.
As of GPT-4, I would say it’s capable of generating decent shapes and simple combinations of shapes. But I’ll let you judge.
Experiments
The following transcripts are from my first experiments generating SVGs with ChatGPT. I was curious to see what kind of images the model could generate, and how it would respond to different inputs.
I performed the experiments through the ChatGPT API using the gpt-4 model.
I used my app AskGPT to run these prompts and dutifully pasted the responses into SVG files to make this post.
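The paste-into-a-file step could also be scripted. What follows is only a minimal sketch, not how AskGPT works: it assumes the model's reply is already in hand as a plain string, and the function name extract_svg and the sample reply are made up for illustration. It pulls out the first svg element and checks that it at least parses as XML.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional


def extract_svg(reply: str) -> Optional[str]:
    """Return the first well-formed <svg>...</svg> element in a reply, or None."""
    # Non-greedy match so any prose after </svg> is left out.
    match = re.search(r"<svg\b.*?</svg>", reply, flags=re.DOTALL)
    if match is None:
        return None  # the reply contained no SVG at all
    svg = match.group(0)
    try:
        ET.fromstring(svg)  # cheap well-formedness check, not full SVG validation
    except ET.ParseError:
        return None
    return svg


# Hypothetical reply text, shaped like the replies in this post.
sample_reply = (
    "Here's a simple SVG code for a cactus.\n"
    '<svg width="300" height="150" xmlns="http://www.w3.org/2000/svg">\n'
    '  <rect width="300" height="150" fill="skyblue" />\n'
    "</svg>\n"
    "Copy and paste this code into an SVG viewer."
)
svg = extract_svg(sample_reply)
```

From there, saving the returned string with pathlib.Path.write_text is one more line. Returning None for replies without usable SVG makes the refusals easy to spot.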
Please generate an SVG of a catus [sic] out in the desert on a sunny day, no explanation.
As an AI language model, I cannot create or directly display SVG images. However, I can help you create the SVG code, which you could then use in any SVG viewer or compatible software/website to visualize the cactus in the desert.
Here’s a simple SVG code for a cactus in a desert environment. Copy and paste this code into an SVG viewer or an HTML document to display the image.
<svg width="300" height="150" xmlns="http://www.w3.org/2000/svg">
<!-- Background color (sky) -->
<rect width="300" height="150" x="0" y="0" fill="skyblue" />
<!-- Sun -->
<circle cx="270" cy="30" r="20" fill="yellow" />
<!-- Ground -->
<rect width="300" height="50" x="0" y="100" fill="tan" />
<!-- Cactus -->
<rect width="30" height="80" x="130" y="70" fill="green" />
<rect width="20" height="30" x="125" y="50" fill="green" />
<rect width="20" height="30" x="145" y="50" fill="green" />
</svg>
SUCCESS Although the initial part of the response was discouraging (why the low confidence, ChatGPT?), the model dutifully generated the SVG code for a cactus in the desert.
The cactus could obviously use some work, but this image is mostly what I was expecting. I know these models are sensitive to inputs, so I decided to give it some encouragement.
OK, great. But please now generate a much more detailed image using gradients and paths
Here is a more detailed SVG code for a cactus in the desert using gradients and paths for a richer visual representation. To view the image, copy and paste the code into an SVG viewer or an HTML document.
(I’m going to elide the SVG code from now on, but you can see it by viewing the source of the images.)
SUCCESS OK, so it’s not going to win any awards with that, but I applaud the effort. The cactus has a bit more shape and the horizon is more interesting.
It looks like the model is having a hard time getting the directions of arcs correct in the SVG code. Golly, I can relate. I’ve written several SVG libraries and still I can never get the ordering right.
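For anyone who also loses track of the ordering, here is a small sketch of the arc command. The helper name arc_path and the coordinates are invented for illustration and are not taken from the model's output. The SVG path syntax is "A rx ry x-axis-rotation large-arc-flag sweep-flag x y", and because the y axis points down, a sweep flag of 1 draws the arc clockwise on screen.

```python
def arc_path(x0, y0, x1, y1, r, large_arc=False, clockwise=True):
    """Build path data for a circular arc from (x0, y0) to (x1, y1).

    SVG arc syntax: A rx ry x-axis-rotation large-arc-flag sweep-flag x y
    """
    return (
        f"M {x0} {y0} "
        f"A {r} {r} 0 {int(large_arc)} {int(clockwise)} {x1} {y1}"
    )


# Same endpoints and radius; only the sweep flag differs.
# Going left to right, clockwise bulges toward the top of the image (a hill),
# counterclockwise bulges toward the bottom (a bowl).
hill = arc_path(10, 50, 90, 50, 50, clockwise=True)
bowl = arc_path(10, 50, 90, 50, 50, clockwise=False)
```

Flipping that single flag is enough to turn a dune into a crater, which is probably the kind of mistake visible in the horizon.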
That said, I feel like I’m failing the network by not being a good prompter. Let’s try again with stupid amounts of prompting.
Please generate an SVG of a cactus in the desert on a sunny day. Please make it as detailed as possible. Use vibrant colors. Use dramatic lighting. I want depth! I want to feel like I’m there! I want something shockingly beautiful.
I apologize, but as an AI text model, I cannot create images or SVG files. However, I can help you understand how you can create such an image using a vector graphics software like Adobe Illustrator or Inkscape.
- Open your preferred vector graphics software and create a new document.
…
FAILURE Oops, I went too far and it just started explaining to me how to use a drawing program. I guess it doesn’t like that kind of prompting.
Let’s change subjects and see how it does at generating icons.
App UI Icons
Would you generate the SVG code for a mac toolbar icon that toggles the selected item preferences pane on and off?
As an AI, I cannot generate an SVG file directly in this text interface, but I can provide you with the code that you can copy and paste into an SVG file, like ‘toggle_preferences_icon.svg’.
Here’s a simple SVG code example for a mac toolbar icon with two gear-like symbols representing the toggling of the selected item preferences pane:
FAILURE Haha, OK, um so those aren’t gears. E for effort.
Can we just make a single gear if we concentrate really hard?
Please generate an SVG code of a gear icon. Please make it detailed showing the teeth and such. You can use paths if you want. Make your response as long as it needs to be.
Here’s an SVG code representation of a gear icon using paths:
FAILURE I see gears aren’t its strong suit.
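To be fair to the model, a gear outline is mostly trigonometry, which is awkward to do token by token. As a rough sketch of what a correct path involves (this is not the model's output, and the function name gear_path and its default sizes are assumptions made for illustration), the outline can be built by walking around a circle and alternating between an outer and an inner radius:

```python
import math


def gear_path(cx, cy, teeth=8, outer_r=40.0, inner_r=30.0):
    """Closed path data outlining a gear with simple trapezoidal teeth."""
    steps = teeth * 4  # four points per tooth
    commands = []
    for i in range(steps):
        angle = 2 * math.pi * i / steps
        # Two consecutive points on the outer radius (tooth top),
        # then two on the inner radius (gap between teeth).
        r = outer_r if i % 4 < 2 else inner_r
        x = cx + r * math.cos(angle)
        y = cy + r * math.sin(angle)
        commands.append(f"{'M' if i == 0 else 'L'} {x:.2f} {y:.2f}")
    commands.append("Z")  # close the outline
    return " ".join(commands)


path_d = gear_path(50, 50)
```

The resulting string goes into the d attribute of a path element. A real icon would also want a hole in the middle and rounded corners, which this sketch leaves out.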
Let’s see if it can generate something more basic.
Please generate an SVG code of a house icon. You can use paths if you want. Make your response as long as it needs to be.
SUCCESS? Do walls need to be straight? Does it matter if your roof has a little overhang? That’s a house. I’ll take it. Even better, I think an app with a “messy icon” aesthetic would be a welcome breath of fresh air.
Enough with boring toolbar icons. Let’s see if it can make a new icon for my app iCircuit.
App Icons
Please generate the SVG code of an app icon for iCircuit, an app that enables you to draw and simulate electronic circuits. The icon should have a nice graident background and fit in well on the iOS home screen. You can use paths if you want. Make your response as long as it needs to be.
FAILURE I dig that gradient (those colors are pretty close to the actual app icon), but the logo is a bit too… abstract. I can forgive abstractness, but I can’t forgive the fact that it’s not centered properly.
Let’s try prompting our way out of this hole.
Nice, I like that gradient. But the inner logo is too weird. Maybe put a nice looking sine wave in there. Centered nicely.
FAILURE The sine wave is so close! If I were generous, I would grade this one a “SUCCESS?”, but it failed to center again, and I just can’t have that.
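Centering is just arithmetic once the icon size is known, which may be why it is easy for a person and hard for a language model. Here is a minimal sketch under made-up assumptions (a square icon, equal left and right margins, a whole number of cycles); the names sine_points and sine_path and all the default values are hypothetical:

```python
import math


def sine_points(size=100, cycles=2, amplitude_frac=0.2, margin_frac=0.15, samples=64):
    """Sample a sine wave that is centered in a size x size square."""
    left = size * margin_frac      # equal margin on the left and right
    width = size - 2 * left
    mid_y = size / 2               # wave oscillates around the vertical center
    amp = size * amplitude_frac
    points = []
    for i in range(samples + 1):
        t = i / samples
        x = left + t * width
        y = mid_y - amp * math.sin(2 * math.pi * cycles * t)
        points.append((x, y))
    return points


def sine_path(**kwargs):
    """Turn the sampled points into SVG path data made of line segments."""
    points = sine_points(**kwargs)
    head = f"M {points[0][0]:.1f} {points[0][1]:.1f}"
    tail = " ".join(f"L {x:.1f} {y:.1f}" for x, y in points[1:])
    return f"{head} {tail}"


wave_d = sine_path(size=100)
```

Because the margins are equal and the wave swings the same distance above and below the middle, the bounding box of the path is centered in the square by construction.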
Art
Please generate the SVG code of a nice looking abstract art piece. You can use paths if you want. Make your response as long as it needs to be.
SUCCESS I like it. It’s very “Saved by the Bell”, very 1990s.
That gives me an idea.
Please generate the SVG code of a spaceship using awesome neon colors and rad gradients.
FAILURE Cool colors, for sure, but it’s not a spaceship.
OK, it’s not an artist. Let’s see if it can do simple shapes.
Please generate the SVG code of a red circle. Inside of that circle put a white square. Inside that, put the red text ‘THIS IS A WARNING’. Wrap the text if needed. Use some gradients to make it look good an intimidating.
SUCCESS The text isn’t centered. I get it, it’s hard. I’m expecting a lot out of a language model.
But look, it’s text! Readable text. That’s a big deal because imaging networks aren’t really known for their text prowess. For example, here are DALL-E and Stable Diffusion’s attempts at the same prompt:
At least ChatGPT gave me exactly what I wanted (I’m not thinking about centering, I’m not thinking about centering, …). I got a red circle, a white box, and readable text. Neither of the other two networks managed that.
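On the centering itself: SVG can do the horizontal part on its own if the text element carries text-anchor="middle", and wrapping is a matter of one tspan per line. The sketch below is a hand-written illustration, not what ChatGPT produced; the function names, sizes, and the manual line split are all assumptions. Vertical centering through dominant-baseline is a request that renderers honor to varying degrees.

```python
from xml.sax.saxutils import escape


def centered_text(lines, cx, cy, font_size=14, line_height=1.2, fill="red"):
    """Build a <text> element whose lines are centered on (cx, cy)."""
    step = font_size * line_height
    # Shift the first line up so the whole block straddles cy.
    first_y = cy - step * (len(lines) - 1) / 2
    spans = "".join(
        f'<tspan x="{cx}" y="{first_y + i * step:g}">{escape(line)}</tspan>'
        for i, line in enumerate(lines)
    )
    return (
        f'<text text-anchor="middle" dominant-baseline="middle" '
        f'font-size="{font_size}" fill="{fill}">{spans}</text>'
    )


def warning_sign(size=200):
    """Red circle, white square, and centered red text, as in the prompt."""
    c = size / 2
    half = size * 0.3  # half the side length of the white square
    return (
        f'<svg width="{size}" height="{size}" xmlns="http://www.w3.org/2000/svg">'
        f'<circle cx="{c:g}" cy="{c:g}" r="{c:g}" fill="red" />'
        f'<rect x="{c - half:g}" y="{c - half:g}" '
        f'width="{2 * half:g}" height="{2 * half:g}" fill="white" />'
        + centered_text(["THIS IS", "A WARNING"], f"{c:g}", c)
        + "</svg>"
    )


sign_svg = warning_sign()
```

The gradients from the prompt are left out to keep the sketch short.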
Conclusion
My biggest takeaways are:
- Yes, ChatGPT can generate SVG code.
- It’s still in a very early stage and cannot handle much complexity.
I was hoping for better, but for a network that was not designed to do this at all, it’s not bad. I certainly look forward to GPT-5 and 6 improving at this.
I keep thinking back to the last warning sign experiment. The images generated by DALL-E and Stable Diffusion were not good. They were not even really what I wanted. ChatGPT, on the other hand, was able to generate something much closer to what I wanted.
I think this is a benefit of how it was trained. The engineers worked hard on the “alignment problem” (pun unintended) and it shows. It does what I expect. Old GPT-3 and friends were pure statistical generators. During their training, the only positive feedback they received was when their sampled output matched language use. ChatGPT, however, was trained with a bias towards being helpful and providing the kinds of answers people want. This was accomplished by having people hand-rank its responses. I have a feeling that this kind of hand-ranking, done to help with the human alignment problem, is going to fuel the next few years of machine learning improvements.
I went into this wanting to see if I could get SVGs, what I think of as a more useful format for my life as an app developer, out of ChatGPT. I got that, but I also got a lot more. I got a glimpse into the future of AI.
Anyway, enough armchair philosophizing. I hope you enjoyed seeing the results of my experiments, and I hope you’ll try some of your own.
Notes
- I am aware there are neural nets specifically trained to generate SVG code. DeepSVG is one example. Specialized networks give more reliable results today and will certainly be useful in the future. I wanted to experiment with ChatGPT because I am intrigued by its generalized knowledge and I wanted to see if that general knowledge produced interesting results.
- None of this was done scientifically. ChatGPT is still a stochastic (random) model and its outputs can vary. I should have had it generate more samples for each prompt. But I didn’t. I just wanted to see roughly what it could do.