Article:
“Midjourney Can’t Count”: Questions of Representation and Meaning for Text-to-Image Generators

Abstract

Text-to-image generation tools, such as DALL·E, Midjourney, and Stable Diffusion, were released to the public in 2022. In their wake, communities of artists and amateurs sprang up to share prompts and images created with the help of these tools. This essay investigates two of the common quirks or issues that arise for users of these image generation platforms: the problem of representing human hands and the attendant issue of generating the desired number of any object or appendage. First, I address the issue that image generators have with generating normative human hands and how DALL·E has tried to correct this issue by only providing generations of normative human hands, even when a prompt asks for a different configuration. Second, I address how this hand problem is part of a larger issue in these systems where they are unable to count or reproduce the desired number of objects in a particular image, even when explicitly prompted to do so. This essay ultimately argues that these common issues indicate a deeper conundrum for large AI models: the problem of representation and the creation of meaning.



Preferred Citation
Wasielewski, Amanda: “Midjourney Can’t Count”: Questions of Representation and Meaning for Text-to-Image Generators. In: IMAGE. Zeitschrift für interdisziplinäre Bildwissenschaft, Vol. 19 (2023), No. 1, pp. 71-82. DOI: http://dx.doi.org/10.25969/mediarep/22327.

BibTeX
@ARTICLE{Wasielewski2023,
 author = {Wasielewski, Amanda},
 title = {“Midjourney Can’t Count”: Questions of Representation and Meaning for Text-to-Image Generators},
 year = 2023,
 doi = {10.25969/mediarep/22327},
 volume = 19,
 address = {Köln},
 journal = {IMAGE. Zeitschrift für interdisziplinäre Bildwissenschaft},
 number = 1,
 pages = {71--82},
}

The item has been published with the following license: Under copyright protection