I finally carved out the time to dig into Midjourney more deeply, after doing a lot of experimenting with GPT in the early part of this year.
Both of them make me think a lot. With GPT, I mostly think about what it means for other people, what it means for my practice as a reader and editor of text. It doesn’t draw me in at all in terms of my own work as a writer except in the most speculative long-term sense. I find it easy to understand what GPT is doing and to bracket it off in various ways.
With Midjourney, I’m more like a good friend and colleague of mine, who was the first person I know to really be thinking in new ways about what this thing is. I’m not just fascinated, not just taken with the experience of using it. I feel like I’m touching on something meaningfully non-human, some spirit, like I’m a medium. Something a bit dangerous, but also a song you can’t help but consider swimming towards.
When you tell people you’re having an experience that you would class as spiritual in some sense, for many people, that’s an entirely positive association. They think you mean you’re uplifted, that you’re feeling some kind of angelic or divine presence.
I don’t mean that at all. I mean that Midjourney feels for me like I’m drawing pentacles in order to commune with a tricky, unpredictable, sometimes eldritch presence. Something uncanny. Something that might possess me for a while. Something that needs offerings, that needs to be propitiated. A thing which is outside of me that might or might not give me what I am asking for, that is unpredictable, whimsical, willful, that is infected with recognizable, sometimes exaggerated, human impulses but that is also alien, otherworldly, uncanny. That might not be there at all: a glimpsed movement out of the corner of one’s eye, a trick of the light.
I like it.
We have a tendency to regard appreciation for an artificial intelligence of this kind as a novel folly born out of the hype of Big Tech or at least the advent of computers. People remember that William Gibson coined the metaphor of cyberspace without having used a computer (and they do not remember that in the sequel to Neuromancer, he analogized the relationship between human users and freed AIs in cyberspace to spirit possession, the AIs being loas).
But the use of some kind of thing, some sort of device-outside-ourselves, to jumpstart or structure creative processes, especially in art, is an old concept. Found art, collage, automatic writing, Oulipo, Plotto, Dogme 95: systems, rules, devices, randomizations that we apply to creative labor that take away some of our agency, that let some strange attractor pull or push what we do away from ourselves.
We have also been using techne to simplify, iterate, speed up artistic creation. You might or might not buy David Hockney’s theory of the camera obscura and camera lucida as devices whose materiality lured Western art into a long obsession with perspective and mimesis, but it is hard to question the massive prevalence of tracing and forgery in Western and global art over the last five centuries. Learning to reproduce a specific artwork via subdividing it into tiny squares feels like a magic trick the first time you do it, and it is only then that you realize it’s a magic trick that tens of thousands of people know and have used in various ways.
Midjourney in this sense feels (at least to me) continuous with the history of visuality and techne. And with the idea of looking for a thing—a rule, a system, a device—that can intervene in the process of making visualizations. Writing is not always communicative, but communication and writing (and thus writing and speaking) have strong relationships. Visuality can be communicative, but it’s also tied up in our umwelt, in the intensity of our relationship to sight, and thus not just of the world we see around us but the many many worlds we cannot see but can imagine seeing. We know there’s a huge difference between my writing about a historical event and my seeing it.
And because the techne of visualization and the techne of writing are so different, and the former considerably more challenging in various ways (despite the centrality of visuality to our umwelt), the barriers to visualizing feel frustrating for anyone who is not both trained and talented at doing so.
Experimenting with Midjourney with some layman’s understanding of how it is doing what it is doing takes a bit of the uncanniness out of it, to be sure. It is neither willfully withholding nor graciously providing the outcomes that a prompt is plainly seeking. But no matter how skilled a writer of prompts you are (skilled both in controlling the visual genres and information that structure a good prompt and in finding conceptual frames that Midjourney can work with), the results will never be quite what you had in mind. Sometimes far from it.
As with GPT, if you know something about the vast but also sometimes painfully finite corpus these AIs have been trained from, at the very least, outcomes to your queries are showing you history through a glass darkly, crackling with the biases and absences and distortions that you would otherwise only begin to perceive as a reader or viewer after a lifetime of study and experience. (Or would learn about by trusting the authority of those who have.) I have an inside lane for some of those exclusions, because I know very well just how much African history, African lives, African viewpoints, are absent from both the written and visual corpus of modernity. Even when there are words by and for Africans, about African realities and African imaginaries, they are found in the side eddies of social media, they are unrepeated or under-reproduced in the standard worlds of print media, they are relatively absent from databases. You may well have seen the many maps of the African continent at night and noticed how many fewer lights there are. A blessing for astronomers and animals, perhaps, but you could see those maps too as a chart of data presence.
So I know that one way to quickly push GPT to failure is to ask it to simulate text about Africa, for Africans, by Africans. So too it is with Midjourney, in a way. I spent quite a while trying to get it to put the absolutely characteristic (and relatively well-pictured) style of mosques that predominate in the upper and middle Niger in scenes of West African history between the 12th and 17th centuries. I can get something of what I have in mind in various ways, but not really. I can get scenes from South African history, but the people involved won’t be wearing remotely correct clothing; they’ll be in some kind of generically non-Western, orientalist clothes. As with many subjects, Midjourney balks at mixing two different kinds of human subjects. If you ask for scenes of Black history, it will generally render all the participants as Black. Asking for an image of Queen Victoria meeting the Zulu monarch Cetshwayo took many, many iterations before Victoria became white.
(In order: Tiyo Soga preaching; a debate in a township between ANC and PAC followers in the 1960s; Krotoa aka ‘Eva’ in exile on Robben Island; Nongoloza in prison in the Transvaal.)
I am not annoyed. I am interested. I am in my pentacle, talking to the djinn. It is trying to twist my wishes. Occasionally it does so enough that I am bothered, even distressed. I have been trying to make D.F. Malan dance the toyi-toyi (because I know it will violate community standards to have the toyi-toyi dance on D.F. Malan, among other things) and so far I have not succeeded at all. It is what djinns do, legendarily: make your wishes go awry. I am learning from what comes back. Pythia did not speak clearly to those gathered at Delphi, but they sought for meaning in the words nevertheless.
I have some understanding of photography—which in a digital age, also means some skill with technologies of processing photography that are in their way unruly demons as well, not always doing as expected. But I can’t make all the images I can describe to myself, not even remotely. Some I can imagine how I might, but the studio I would need is far beyond my financial means and using it beyond my skill level even if I had it. Some I can guess at ways to work photography and Photoshop towards. Others are in a different bucket of techne and well beyond me even if I were to drop everything and re-enroll as a student at an art school.
You might say (and at least one of my readers has suggested) that I should then hire artists who do have those skills to make what I envision. Not being independently wealthy puts the kibosh on that idea, whether I am thinking of illustrations for this modest Substack or for pure personal consumption. It is more than that, however. My few experiences of trying to explain in words an idea which is visual in nature have been mortifying. I have rarely felt as stupid, as inarticulate. Or as presumptuous, because in trying to order up a precise image that you can see in your own head, as clear as if it were in front of you, you realize that someone else skilled in visualization has their own ideas, and ought to. There are ways to cross that line: comic books (or sequential art, as Scott McCloud would have them) have a set of creative norms that structure the relationship between writers and artists that can vary wonderfully within different collaborations. (Some writers make sketches; others write detailed descriptions of visualizations; others just tell artists ‘you figure it out, this is what people are saying, this is what the story is’.) But it is hard, and legitimately expensive, and not something you do for shits and giggles.
Midjourney, though? You can wrestle with its unruly, uncanny, drifting spirit and do that image that you would never waste money and human labor on, and find it satisfying. Even moving. When you cast the right bait upon the waters, Midjourney comes up from the corpus like a Jungian leviathan from the depths, sounding mightily across the waters. You sought a thing, and my god, here it is. And yet it is not yours. The geist, patterned and pushed inside the algorithm’s many-chambered black box, has many wishes in it. It is truculent, coy, spiteful. You are not unhappy at the waste of some artist’s time and your money. There is always another prompt, another variation, another theme to work.
So I can see what I’ve had in my head for some time: satires and subversions of those horrible stock photos of white-collar offices that infest Unsplash and Shutterstock like cockroaches in a dirty kitchen. Gentle ones, mind you. (So far, but who knows where my adjectives might drift.) I can see unpictured things, drawn from a massive slurry of made pictures.
Do artists need to defend themselves against Midjourney? (As writers against GPT?) Yes, certainly. There will be, already are, attempts to use AIs as thin prophylaxis to defend outright appropriation of specific works with ongoing value, as a burglar’s alibi. I worry less about the outright substitution of Midjourney for human artists in many high-value employments, much as I don’t think anybody’s going to be using GPT to write movie scripts or novels except as novelties or as replacements for work that is already semi-automated or derivative. (Ghostwriters and ‘polishers’ might want to worry some.) The painter who makes a bit of side money painting the same scenes of sunsets and oceans that every other painter at the booths in the summer art festival in the little town might be in trouble, but they have long since been operating in a context where what they do is already a humanoid’s algorithm. (I recall walking around a relatively big and actually juried show at Rittenhouse Square last summer and counting no more than about five sellers who had craftwork, paintings or photographs markedly different than anybody else’s stuff.)
Whoever it is that makes art for motel chains and doctor’s offices, if it’s a hand-crafting artist, is going to need to make a shift. Whoever it is that designs mascots or characters for commercial businesses will need to rethink slightly. But these are as much new jobs as replacements. The boss of Midjourney and its descendants, not their rival. You can already see the difference looking over the channels, with the vast petabytes of images pouring through their sluice, between the skilled prompters and the punters, the future AI bosses and the people just fooling around with it for a bit.
I can even see a new workflow for the skilled artists. Rather than having a client who is inarticulate in commissioning, the client might say, “Here, something like this”. Perhaps if I found a graphic artist to collaborate with one day, as Trevor Getz worked with Liz Clarke on Abina and the Important Men, I could say: look at this! This is sort of what I think this looked like, and become a different kind of partner in the process. (A lot of children’s picture books work off of these kinds of collaborations, too.)
In the end, I’m not terribly drawn right now to these questions. Perhaps for the same reason that I rushed into blogging despite the warnings of several of my friends who were professional writers that blogging was going to kill journalism, going to devalue writing. It may have done that a bit, but the hammer blows came from less expected places: none of us were talking about the importance of classified ads as revenue for print media until they were already gone.
I rush into Midjourney because when I talk correctly to the spirit, I get some mysterious fragment of revelation, of desire, of inspiration, that I could not have gotten before I began to learn its rituals. You might know exactly how the hot springs make the mists rise around Pythia and how she crafts an otherworldly voice, but just try to ignore the temptation of oracular insight.
John Warner judges that generative AI is not creativity. I don’t know that I disagree. I just think it might be the wrong judgment to be passing. Or that maybe it’s time for the modernist understanding of creativity, that it is the masterful act of a singular mind, to go away once and for all. There is a sense in which I might be happier begging for a ghost to fetch up a vision from the collective unconscious than I am buying the labor-time of an artisan.