I find it interesting that DALL-E still doesn’t understand text - look at all the random "Dachshund" spellings in the generated images. It knows roughly what the word should look like, but has no framework to interpret or distinguish text from the other elements of the image. It’s like trying to spell in a dream.
Thanks for this suggestion - I’d never seen these guys before, but they’re incredibly talented and very enjoyable to watch.