AI Image Generators (Craiyon, DALL-E, Midjourney, Stable Diffusion)

He’s actually incredibly average-looking as a half-Asian…

Looks like a baddie in a 1970s kung fu movie.

I can’t exactly place what is different between those images and the actual Brad Pitt. I mean, some of them are clearly recognizable as Brad Pitt. Is it just the eyes? It’s messing with my mind.

I think mostly the eyes, but also the nose is wider and the nostrils more flared. And obviously the hair color.

Posthumanity

Terminator and Robocop on the train


butter-robot-sad


But how can they tell the difference?

@discobot quote


:left_speech_bubble: Enjoy the little things, for one day you may look back and realize they were the big things. — Robert Brault


To get similar results, I used randomly generated passwords as the input to Craiyon:


During training the model is given image and text-description pairs. It looks at a text input, converts it to a vector that supposedly captures the meaning of your text, then uses that vector to try to generate an image. It will either compare that image with the original image from the pair and tweak the model to get the generated image as close to the original as possible, or use another model to determine whether the image is real or one generated by a model.
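Roughly what that reconstruction-style training step looks like in code. This is only a toy sketch to make the idea concrete: the module names, layer sizes, and the bag-of-words text encoder are all made up for illustration, and real systems like DALL-E or Stable Diffusion are far more involved.

```python
import torch
import torch.nn as nn

class TextEncoder(nn.Module):
    """Toy stand-in for the text -> vector step (real models use large transformers)."""
    def __init__(self, vocab_size=1000, embed_dim=64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim)  # averages token embeddings into one vector

    def forward(self, token_ids):
        return self.embed(token_ids)

class ImageDecoder(nn.Module):
    """Toy stand-in for the vector -> image step."""
    def __init__(self, embed_dim=64, image_pixels=3 * 32 * 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(embed_dim, 256), nn.ReLU(),
                                 nn.Linear(256, image_pixels))

    def forward(self, text_vector):
        return self.net(text_vector).view(-1, 3, 32, 32)

encoder, decoder = TextEncoder(), ImageDecoder()
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

# One fake training pair: a caption as token ids, and the image it describes.
caption_tokens = torch.randint(0, 1000, (1, 8))
real_image = torch.rand(1, 3, 32, 32)

# Training step: text -> vector -> image, compare with the real image,
# then nudge the weights so the generated image moves closer to it.
optimizer.zero_grad()
text_vector = encoder(caption_tokens)
generated_image = decoder(text_vector)
loss = nn.functional.mse_loss(generated_image, real_image)
loss.backward()
optimizer.step()
```

The “use another model” alternative mentioned above would replace the pixel-wise loss with a discriminator network that scores whether the image looks real, GAN-style.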

For the vectors, you can literally enter random numbers and it would still generate an image. That’s essentially what’s happening when you enter a random string.
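To see why, here is the same kind of toy decoder as in the sketch above: the image-generating half doesn’t care where its input vector came from, so any 64 numbers still produce some picture-shaped output.

```python
import torch
import torch.nn as nn

# Any 64-number vector is a "valid" input to this toy decoder, so a vector
# produced from gibberish text still comes out the other end as an image.
decoder = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 3 * 32 * 32))
random_vector = torch.randn(1, 64)                    # stands in for "vector from a random string"
image_from_noise = decoder(random_vector).view(1, 3, 32, 32)
print(image_from_noise.shape)                         # image-shaped output, just not a meaningful one
```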


Yes - the question the article raises is basically just how these networks react to “non-traditional” input (e.g. drawing the opposite of something). One might expect random images to come out, but that is actually not the case.

Craiyon doesn’t seem to handle negation that well: “not a flower” will also produce flower pictures. However, I still find it interesting how it reacts to “gibberish” as an input.

It isn’t random because the model takes the high-dimensional feature vector generated from your text input and, in effect, sees how close it is to the vectors it learned from the training data. Since any given text produces only one feature vector, every image generated from it (usually with some random perturbation added to get different samples) largely stays on the same theme, because that feature vector is closest in high-dimensional space to that theme.

So the feature vector generated from GGfbFFLN4Z evidently corresponds to some type of portrait in high-dimensional space, which is why every image generated with that vector ends up being some kind of portrait.
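A toy illustration of that “closest in high-dimensional space” idea. The theme vectors and the gibberish vector below are random stand-ins, not real embeddings; the point is only that any one vector has one nearest theme, and small perturbations rarely change which theme that is.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are regions of embedding space the model learned for a few themes.
theme_vectors = {
    "portrait": rng.normal(size=64),
    "landscape": rng.normal(size=64),
    "flower": rng.normal(size=64),
}

# Pretend this is the vector the text encoder produces for "GGfbFFLN4Z".
gibberish_vector = rng.normal(size=64)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

closest = max(theme_vectors, key=lambda name: cosine(gibberish_vector, theme_vectors[name]))
print("nearest theme:", closest)

# Sampling different images adds small perturbations to the same vector,
# which rarely changes the nearest theme -- hence every image stays "on theme".
perturbed = gibberish_vector + 0.05 * rng.normal(size=64)
print("nearest theme after perturbation:",
      max(theme_vectors, key=lambda name: cosine(perturbed, theme_vectors[name])))
```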


Now I want to know what that unsafe content is…


Really digging the vivid images I get from this simple prompt…


Popular art topic!


Benicio del Toro
