Words by: Trevor Burgess
I studied Literature at university at a time when semiotics and structuralism were the dominant ways of deconstructing literary texts. There was a tendency in a literature department to think of language as the fundamental way in which we understand the world. At its most extreme, some lecturers declared that the world is a text to be read.
Now, neurologists tell us that at least 40% of our brain’s activity is involved with vision. I make paintings and have been involved in practising, studying and teaching drawing and painting all my life. It is my experience that parts of my brain that process language go dormant or switch off when I am engaged with making a painting. People who know me as a talkative soul might be surprised that when I come out of the studio after a long painting session I am a bit mute and find it difficult to put words together.
One of the most influential books on teaching drawing explains why this is so: “Drawing on the Right Side of the Brain” by Betty Edwards. It is based on an understanding of the neurobiology of the brain, which tells us that, for most of us, verbal, analytic, sequential functions are located mainly in the left hemisphere, and visual, spatial, perceptual functions mainly in the right. The book is full of practical drawing exercises designed to help learners switch off the logical left side of the brain and turn on the intuitive right side.
What has this got to do with AI? I have been struck that, amongst all the marvelling at how AI can “generate images” in a way that attempts to mimic the workings of neural networks in our brains, there has been little commentary on the fact that the dominant models are based on language. The fundamental skill that needs to be learnt to use most AI image-generation tools effectively is how to write verbal, precise, sequential prompts – precisely the left-brain functions that interfere with visual and spatial perception in learning to draw and paint. There are alternative ControlNet sketch-to-image and pose-to-image models, such as Scribble or Leonardo’s “Realtime Canvas”, which are not driven by text prompts alone, but the dominant models are not trained on spatial perception. What they do seem to be trained to do visually is pattern recognition. This connects with the processes of drawing and painting related to the 2D surface, but it is not very useful in constructing a convincing space within an image, and it partly explains why AI can put together images of a recognisable building or a landscape if you describe it in words, yet finds it challenging to convert a 2D plan into a 3D visualisation.
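For readers curious what the sketch-to-image route actually looks like in practice, here is a minimal sketch using the open-source diffusers library and a ControlNet “scribble” model in Python. The model identifiers, file names and settings below are my own illustrative assumptions about one common open-source setup, not the specific tools named above:

```python
# A minimal sketch-to-image example with a ControlNet "scribble" model,
# using the Hugging Face diffusers library. Model IDs, the input file
# and the settings are illustrative assumptions, not a definitive recipe.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Load a ControlNet trained to follow rough scribble drawings.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)

# Attach it to a Stable Diffusion pipeline.
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# A black-and-white scribble drawing (hypothetical input file).
scribble = load_image("my_dog_scribble.png")

image = pipe(
    "a dog made of cloud",   # the text prompt still names the subject...
    image=scribble,          # ...but the drawn lines fix the spatial form
    num_inference_steps=20,
).images[0]
image.save("cloud_dog.png")
```

Here the drawn lines, not the sentence, carry the spatial information – which is exactly the channel that text-only prompting lacks.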
So how do those of us who teach the visual-arts skills that go into the physical production of drawing, painting, sculpture and crafts adapt to this new tool? They say the new technology is disruptive. Do we have to adapt our right-brain activities to follow the left-brain data-processing of the machines? There is no avoiding learning how to prompt. But let’s bring to it our own creative right-brain disruptions.
I saw a hilarious example on the web: somebody recommending the best AI image generators had prompted the AI to make a cloud look like a dog. The AI produced a cloud with a dog’s face sort of stuck on the top of it (left image). The prompter then apparently went through a full 120 prompts to make the cloud look successively more like a dog, and was clearly proud of the outcome. The result? (right image) Err… a cloud with a slightly more detailed dog’s head stuck on it. As much as I try, I just can’t see the form of a dog in it: no paws, tail or legs. Can you?
That is left-brain thinking. Keep on, keep on, keep on, successively, through stage after stage after stage towards a desired result. If you ask me, after 120 prompts the cloud looks even less like a dog: the bit of cloud at the bottom right that did make me think it could be the dog’s tail has got detached. The AI has cut the dog’s tail off! What about some right-brain thinking? Make the cloud a rain cloud. Take the dog for a walk. Prompt the AI to make the cloud bark, or put the dog’s tail back on. I don’t know – anything to throw up more weirdness and get something unexpected.
Talking to visual artists who have engaged with these new tools, I find it is exactly the weirdness and strangeness of what AI throws up that excites them, and they worry that, as it gets more and more precise and controlled, that unpredictability is being lost. The rhetoric around AI is all about doing things faster and more efficiently, and of course we can take advantage of that. But if we are thinking about AI text-to-image generation as a creative tool, maybe we can take a lesson from Betty Edwards’ book: switch off both our visual training and our results-orientated goals for a moment and just think of it playfully as a sort of digitally hyperactive collaging process, arranging pre-existing materials in weird and wonderful ways, sticking dogs and clouds together. Have fun!