Me Vs DALL-E 2 (The AI that makes paintings)

Introduction

So, I got access to the revolutionary AI, DALL-E 2, and I tested it as an art tool against the traditional way I make artwork.

For those who don't know, DALL-E 2 is a recently built AI that can give you stunning new artwork, within a minute, from simple word prompts. To give you a sense of what DALL-E 2 can do, here are some examples.

None of the artworks above actually existed before DALL-E 2 created them on request.

An interesting question now is: what does any of this mean for the future of art? I have discussed this at some length in this blog post. For now, the summary is that art appears to me to be one of many means to express and store thoughts. Is DALL-E 2 an artist? The answer would depend on whether DALL-E 2 can have thoughts. If we assume for now that DALL-E 2 cannot have thoughts, then it can instead be seen as a tool that people use to express themselves. But then the question is: how good is DALL-E 2 as a tool of expression?

That is the subject of this blog post. I will try to recreate some of my paintings through DALL-E 2. The idea is to find out how good DALL-E 2 is as a tool when I want to express my thoughts visually. Is it better than just picking up a pen or brush and doing it the old school way? Let us see the results.

Where DALL-E 2 performs well

Let us first see some paintings where DALL-E 2 did a pretty good job.

Mansi in a black and white room

My version

DALL-E 2's version

The word prompt that I used was: 'A girl in a black and white room, sitting on a chair, staring intensely at a red apple kept on a table in front of her. There is a blackboard with equations and bookshelf in the room. Everything except the apple is black and white. melancholic, desaturated, somber, solitude, oil painting'

DALL-E 2 misses some aspects. The room is not completely black and white as I had wanted, and I would have liked the apple to be more saturated. Still, this is nothing that cannot be fixed with some post-processing. Overall, DALL-E 2's work looks quite good.

Mother

My version

DALL-E 2's version

Prompt: 'A mother with a new born baby asleep on her chest, dark background, pencil sketch, nostalgia'

Well, the result is good!

Father

My version


DALL-E 2's version

Prompt: 'A father riding cycle in dark night with an infant on the front seat, pencil sketch, nostalgia, far away, silhouette'

The result looks good, but needs retouching. I like my version better.

The nightingale and the rose

My version

DALL-E 2's version

The prompt used: 'A nightingale bird painfully chirping in a full moon night, with a thorn from a rose piercing its chest, causing an injury. A rose is turning red from white. The Nightingale and the Rose, Oscar Wilde, ominous, sinister, terrifying, dark, oil painting'

Well, to give DALL-E 2 credit, at least the output looks good and is something along the lines of what I wanted. But the thorn piercing the chest is missing, which takes the whole meaning away from the painting. It makes me wonder: can DALL-E 2 understand thoughts that are this complicated?

Existential Crisis

My version


DALL-E 2's version

Word prompt: 'Portrait of a boy staring shocked at the viewer as the world around him is dissolving. Dark, Ominous, haunting, apocalyptic, existential crisis, sinister, oil painting.'

Now, let there be no misunderstanding. DALL-E 2's output looks really good. But did it express the thoughts that I wanted it to express? Not exactly. The boy's expression is in line with what I wanted, but the 'world around him dissolving' part, if it is present, has not been sufficiently emphasized. 

Basically, I do not know how to tell DALL-E 2 exactly what output I am looking for. Some post-processing is hence required.

Where DALL-E 2 does some hilarious work

Now, let me move to some downright hilarious results.

Green sky

My version


DALL-E 2's version


Word prompt: 'an isolated island in the shape of human face with green aurora in the night sky, solitude, sadness, oil painting'

This output raises a question. Can DALL-E 2 actually understand complicated ideas? Are word prompts really the right tool to get a visual image of one's thoughts?

Black and white

My version


DALL-E 2's version


Word prompt: 'A blind man in front of a white screen with his right hand disappearing in the screen, and another hand appearing from the screen, oil painting, black and white, somber, thought provoking'

OK, so let me try to understand what is going wrong. If I went to a talented artist with this idea, it would take a long exchange of words before the artist understood what I wanted her to draw. She would ask me follow-up questions and I would try to clarify. Here, DALL-E 2 is expected to confidently churn out results from at most 400 characters. DALL-E 2 does not ask follow-up questions about the parts it did not understand, such as what it means for a hand to appear from or disappear into a screen. All that DALL-E 2 does is confidently map any input to some output. It does not stop to say, 'hey, what you are asking does not make sense to me!'
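To get a feel for how little room that is, here is a minimal Python sketch that counts how many characters a prompt leaves unused. The 400 figure is simply the limit I mentioned above, not something checked against official documentation, and the names `MAX_PROMPT_CHARS` and `characters_left` are made up for this illustration.

```python
# Assumption: 400 is the prompt limit quoted in this post, not an official figure.
MAX_PROMPT_CHARS = 400


def characters_left(prompt):
    """Return how many characters remain; a negative number means the prompt is too long."""
    return MAX_PROMPT_CHARS - len(prompt)


# The prompt used for the 'Black and white' painting above.
prompt = (
    "A blind man in front of a white screen with his right hand disappearing "
    "in the screen, and another hand appearing from the screen, oil painting, "
    "black and white, somber, thought provoking"
)

print(len(prompt), "characters used,", characters_left(prompt), "left")
```

Whatever is left over is all the space there is for the clarifications that an artist would have drawn out of me through follow-up questions.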

Where I don't know what word prompt to use to describe the painting

Now, there are some paintings of mine for which I don't even know what prompt to give to DALL-E 2, like the following ones.




Perhaps this hints at an inherent deficiency in words that paintings try to fill. I mean, it is always possible to encode every artwork as a series of 0's and 1's, which, looked at in a certain way, can be seen as a word. But human beings cannot appreciate the 'beauty' in those 0's and 1's in the same way as they can appreciate the beauty in the corresponding image. Human beings can appreciate 'beauty' in words, but then we are talking about a poem or story made up of dictionary words.
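To make the point about 0's and 1's concrete, here is a small Python sketch. The 2x2 grayscale 'painting' and the function names `to_bits` and `from_bits` are invented for this illustration; real image formats are more elaborate, but the idea is the same.

```python
# A made-up 2x2 grayscale "painting": each pixel is a brightness from 0 to 255.
pixels = [[0, 255], [128, 64]]


def to_bits(image):
    """Flatten the image row by row and write each pixel as 8 binary digits."""
    return "".join(format(value, "08b") for row in image for value in row)


def from_bits(bits, width):
    """Rebuild the rows of pixels from the string of 0's and 1's."""
    values = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    return [values[i:i + width] for i in range(0, len(values), width)]


word = to_bits(pixels)
print(word)  # 00000000111111111000000001000000

# Nothing is lost: the 'word' can be turned back into the same image.
assert from_bits(word, width=2) == pixels
```

The string carries exactly the same information as the image, yet nobody would call it beautiful to read.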

Mature paintings that DALL-E 2 wants nothing to do with

Then, there are some paintings for which I cannot get DALL-E 2's version because, for one reason or another, the request violates the terms and conditions of use, mainly because of the nudity or horror involved. These paintings are:


DALL-E 2 apparently does not like hands entering skulls.

Perhaps the idea of putting a baby and an old man in the same picture touches a sore nerve.

Nudity.

Conclusion

DALL-E 2 as a tool to express one's thoughts is immature right now. This might be partially an issue with present technology. Partially, there might be a fundamental flaw in the whole idea of getting an image from words. Not all images can be sensibly reduced to words. There appears to be a conflict between the two media of expression, words and images. Each is useful in different circumstances. I do not want to talk to the people around me through paintings. But there are some thoughts best reserved for visual representation.

If so, it appears that the whole idea of first reducing thoughts to a word prompt and then turning that prompt into an image misses the point of why we have been using these two different media of expression in the first place.

