AI Aesthetics

This volume investigates the intersection of generative AI and media aes-
thetics from an interdisciplinary perspective. Combining in-depth theoretical 
reflection with a diverse selection of case studies, its authors explore the aes-
thetic forms of AI-generated medial objects as well as the cultural imaginaries 
that the latter draw upon.

Bringing together a group of scholars from various geographic and dis-
ciplinary backgrounds, the chapters move within and across different 
conceptualizations of “AI aesthetics” that can be located in-between an 
“aesthetics-as-artistics” (that is primarily concerned with aesthetic judg-
ments related to skill and connoisseurship) and an “aesthetics-as-aisthetics” 
(that identifies all kinds of embodied perception as its object). The book thus 
reflects on both the theoretical and the methodological implications of “AI 
aesthetics,” while also demonstrating that this is still very much an emerging 
research field and that no dominant conceptualization of “AI aesthetics” has 
yet emerged.

Considering its decidedly international and interdisciplinary scope, AI Aes-
thetics: AI-Generated Images between Artistics and Aisthetics will appeal to 
scholars and students within media studies, cultural studies, literary studies, 
philosophy, art history, visual culture studies, digital humanities, and critical 
AI studies.

Jan-Noël Thon is Professor and Chair of Media Studies and Media Education 
at Osnabrück University, Germany.

Lukas R.A. Wilde is Professor of Media Studies at the Norwegian University 
of Science and Technology (NTNU) in Trondheim, Norway.



Routledge Focus on Digital Media and Culture 


AI Aesthetics 
AI-Generated Images between 
Artistics and Aisthetics 

Edited by Jan-Noël Thon 
and Lukas R.A. Wilde 



The right of Jan-Noël Thon and Lukas R.A. Wilde to be identified as the 
authors of the editorial material, and of the authors for their individual 
chapters, has been asserted in accordance with sections 77 and 78 of the 
Copyright, Designs and Patents Act 1988.

The Open Access version of this book, available at www.taylorfrancis.com,  
has been made available under a Creative Commons Attribution-Non 
Commercial-No Derivatives (CC-BY-NC-ND) 4.0 license.

Any third party material in this book is not included in the OA Creative 
Commons license, unless indicated otherwise in a credit line to the material. 
Please direct any permissions enquiries to the original rightsholder.

The Open Access publication of this book was generously supported by 
Osnabrück University and the publication fund NiedersachsenOPEN as part of 
zukunft.niedersachsen, a joint funding program of the Ministry for Science and 
Culture of Lower Saxony and the Volkswagen Foundation.

Trademark notice: Product or corporate names may be trademarks or 
registered trademarks, and are used only for identification and explanation 
without intent to infringe.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-1-041-14845-6 (hbk)
ISBN: 978-1-041-14848-7 (pbk)
ISBN: 978-1-003-67642-3 (ebk)

DOI: 10.4324/9781003676423

Typeset in Times New Roman
by KnowledgeWorks Global Ltd.

First published 2025
by Routledge
4 Park Square, Milton Park, Abingdon, Oxon OX14 4RN

and by Routledge
605 Third Avenue, New York, NY 10158

Routledge is an imprint of the Taylor & Francis Group, an informa 
business

© 2025 selection and editorial matter, Jan-Noël Thon and Lukas R.A. Wilde; 
individual chapters, the contributors



Contents

List of Illustrations

1 Introduction: AI Aesthetics
JAN-NOËL THON AND LUKAS R.A. WILDE

2 AI Horseplay: Postdigital Aesthetics in AI-Generated Images
JAN-NOËL THON

3 Aesthetic Protocols of Popular AI Art
LOTTE PHILIPSEN

4 The Aesthetics of Promise: Tech-Failures and
Tech-Demonstrations of Generative AI
OLGA MOSKATOVA

5 Affective Realism: Reimagining Photography
with the Google Pixel 9
MICHELLE HENNING

6 Aesthetics and Rhetorics of AI Anthropomorphization:
The Eliza Effect vs. the Character Effect
LUKAS R.A. WILDE

Contributors
Index


List of Illustrations

Figures

2.1 AI-generated images of a line drawing, a crayon drawing, a
watercolor painting, an oil painting, a stained-glass window,
and a woven tapestry of a galloping horse (created with
ChatGPT 4o/DALL·E 3 in August 2024).

2.2 AI-generated images of a crayon drawing (with
three-dimensional crayons), a bronze sculpture, a wooden
sculpture, a paper sculpture, an ice sculpture, and a cloud
sculpture of a galloping horse (created with ChatGPT 4o/
DALL·E 3 in August 2024).

2.3 AI-generated images of an old photograph of a galloping
horse and old photographs of a line drawing, a crayon
drawing, a watercolor painting, an oil painting, and a
stained-glass window of a galloping horse (created with
ChatGPT 4o/DALL·E 3 in August 2024).

2.4 AI-generated images of old photographs of a woven
tapestry, a bronze sculpture, a wooden sculpture, a paper
sculpture, an ice sculpture, and a cloud sculpture of a
galloping horse (created with ChatGPT 4o/DALL·E 3 in
August 2024).

2.5 AI-generated images of a “pixelated” line drawing, crayon
drawing, watercolor painting, oil painting, stained-glass
window, and woven tapestry of a galloping horse (created
with ChatGPT 4o/DALL·E 3 in August 2024).

2.6 AI-generated images of a “pixelated” crayon drawing (with
three-dimensional crayons), bronze sculpture, wooden
sculpture, paper sculpture, ice sculpture, and cloud sculpture
of a galloping horse (created with ChatGPT 4o/DALL·E 3
in August 2024).

2.7 AI-generated images of a “glitched” line drawing, crayon
drawing, watercolor painting, oil painting, stained-glass
window, and woven tapestry of a galloping horse (created
with ChatGPT 4o/DALL·E 3 in August 2024).

2.8 AI-generated images of a “glitched” crayon drawing (with
three-dimensional crayons), bronze sculpture, wooden
sculpture, paper sculpture, ice sculpture, and cloud sculpture
of a galloping horse (created with ChatGPT 4o/DALL·E 3
in August 2024).

3.1 Example of Le Brun’s diagrammatic drawings on how
to visually depict a human feeling, here “Physical Pain”
(Charles Le Brun: La Douleur corporelle et aiguë. Ink
on paper, 19.7 × 24.4 cm. Paris, Musée du Louvre.
https://collections.louvre.fr/en/ark:/53355/cl020206665).

4.1 Aesthetics of transformation in an AI-generated ballet video
(Werners AI Art 2024).

4.2 A cinema aesthetics in the Luma Dream Machine
tech-demo video from September 3, 2024 (Luma AI 2024b).

4.3 Aesthetics of plasmaticness in Sora’s tech-demo video
(OpenAI 2024d).

Table

3.1 Distinctions between professional AI art, festival AI art, and
popular AI art.



Introduction 
AI Aesthetics 

Jan-Noël Thon and Lukas R.A. Wilde 

At the time of this writing—in spring 2025—generally accessible generative 
AI platforms and, more specifically, AI image generators such as DALL·E, 
Midjourney, or Stable Diffusion have been broadly available for almost three 
years. AI-based image enhancement and modification have also been inte-
grated into many other applications such as the Adobe suite of image process-
ing programs or Google phones. New generative AI applications are launched 
or announced almost every week, most notably perhaps Google’s moving im-
age generator VEO2, a competitor to OpenAI’s Sora, and Janus-Pro-7B, the 
open-source multimodal AI model that is based on the Chinese AI startup 
platform DeepSeek. Generative AI is making rapid progress in other areas, as 
well—with the generation of music and songs, which have been widely dis-
cussed after the release of Suno AI in December 2023, being a particularly sa-
lient example (see, e.g., Johnson et al. 2023; Lin and Chen 2024; Nayar 2025). 
Since most of these technologies build on—and integrate—natural language 
comprehension through large language models (LLMs), they are essentially 
all multimodal “at heart,” even if that multimodality remains “invisible” to the 
users (see, e.g., Bajohr 2024b; Coeckelbergh and Gunkel 2025). While text-
to-image generators (such as DALL·E, Midjourney, or Stable Diffusion) and 
text-to-text generators (such as ChatGPT, Claude, or Gemini) were strictly 
separated at first (if only in terms of their output appearances), ChatGPT-4
fundamentally changed AI image production in October 2023, with its in-
tegration of DALL·E 3 further foregrounding the multimodality of both the 
interface and the generated outputs. It is clear, then, that AI-generated outputs 
in various perceivable forms have swiftly become a salient element of our 
current media culture, instigating, for example, a hermeneutics of suspicion 
toward every new image or video now being potentially AI-generated or AI-
manipulated (see, e.g., Meyer 2024); “polluting” Google search results with 
unmarked “AI content” (see, e.g., Balkowitsch 2024); and substantially altering
the value of images, videos, and music files, mostly to the disadvantage
of human artists and producers on whose work the underlying LLMs draw as 
training data without the former’s knowledge or consent (see, e.g., Dornis and 
Stober 2024). 


DOI: 10.4324/9781003676423-1
This chapter has been made available under a CC-BY-NC-ND 4.0 license.


While there is a keen interest within media and cultural studies to come 
to terms with these new technologies and the diverse practices they afford, 
the rapid development of diffusion-based AI image generators, the more recent
autoregressive models (see, e.g., Robison 2025), and LLMs more broadly
poses considerable challenges to traditional humanities approaches,1 not least 
because the breakneck speed of the AI development cycle clashes with ac-
ademic publication timelines: On the one hand, it may be disappointing to 
publish snapshots of supposedly current practices and technologies that are 
already historical at the time of publication. On the other hand, however, it 
is just as undesirable to merely speculate about an AI future that is occluded 
by marketing utopias and imagined techno-catastrophes (see also, e.g., Bareis 
and Katzenbach 2021; Romele 2024 on “AI imaginaries”). Then again, it is 
also worth highlighting the continuities as well as the differences between AI 
image generators and earlier image-making technologies (see, e.g., Somaini 
2023; Zylinska 2020). The perceived abandonment of an immediate indexical 
relationship to physical reality, for example, is hardly new for digital pic-
tures and has been controversially discussed during the emergence of digital 
photography and digital image editors such as Adobe Photoshop (see, e.g., 
Lehmuskallio et al. 2019; Mitchell 1992). Indeed, the partial autonomy of a 
“nonhuman apparatus” generating pictures “automatically” has already been 
noted during the emergence of nondigital photography (see, e.g., Chesher and 
Albarrán-Torres 2023). Likewise, questions surrounding the manipulative 
“covert” use of AI-generated images in the context of “fake news” and “deep
fakes” (see, e.g., Broinowski 2022) refer back to the much older discussions 
surrounding “visual evidence” within documentary studies and beyond (see, 
e.g., Nichols 1991; Schwartz 1992), which suggests that there is nothing cat-
egorially new in AI-generated images’ potential to mislead, misrepresent, and 
manipulate—even if the ease with which they can be used to do so certainly 
remains striking. Indeed, there is no simple heuristic for the (human) recog-
nition of AI-generated images anymore, since AI image generators can be 
prompted to create such images not only with a more or less specific repre-
sentational content that is often described as the “subject” of these images but 
also with a more or less specific aesthetic form that is often described in terms 
of their “style” (see, e.g., Meyer 2023). 

We thus propose to frame the “AI aesthetics” of AI image generators such 
as DALL·E, Midjourney, or Stable Diffusion as a specific kind of “media 
aesthetics,” aiming to connect media studies even more closely to critical AI 
studies (see, e.g., Lindgren 2024; Raley and Rhee 2023; Roberge and Castelle 
2021). Among other things, this implies a focus on current and developing 
machine learning platforms not merely as technology, narrowly understood, 
but as media (see, e.g., Bolter 2023; Wilde 2023). As Marx notes, “the ma-
terial component—technology narrowly conceived as a physical device—is 
merely one part of a complex social and institutional matrix” (1997, 979; 
original emphasis). Alternatively, we could also operate with an expanded
conceptualization of “technology” here. Dhaliwal, for example, argues that 
“technology” is itself a “compound […] blurring economy, politics, and tech-
nics into one word” (2023, 311), and distinguishes between five different 
“objects of study” and related “research fields” that such an expanded concep-
tualization of “technology” gives rise to, namely “[m]achines and devices” 
(of interest to the sciences and engineering); “[c]ulture and [new media] 
art” (of interest to cultural studies and art history); “[p]eople and communi-
ties” (of interest to sociology and anthropology of technology); “[s]ystems 
and structures” (of interest to sociology and political economy); and “[t]ech-
niques, practices, and habits” (of interest to media archaeology and cultural 
technologies) (2023, 313). Again, then, we cannot appropriately think through 
“technology” without also acknowledging the complex social, cultural, and 
institutional contexts in which it is developed, distributed, and employed (see 
also, e.g., Pasquinelli 2023). 

In the context of the present volume, however, we will still need to nar-
row our focus from all sorts of machine learning technologies (such as auto-
mated driving, automated weapons, or facial image recognition) to what is 
called “generative AI,” conceptualizing the latter as media that may be used 
for communication and interaction (which at least the outputs they generate 
certainly are).2 Focusing more closely on the concept of media aesthetics, 
the “slightly jarring quality” that results from its “forcing together of mod-
ern and ancient concepts” (Mitchell 2013, 7) also requires some additional 
explication. Put in a nutshell, the use of the term “media aesthetics” first be-
came widespread in the late 1980s and early 1990s in reaction to the (at that 
point) “new media” and their implementation in installation art and sound art 
(see, e.g., the survey in Schröter 2019a). Historically, then, media aesthet-
ics initially addressed “a technologically and, above all, digitally saturated 
art; at the same time, its theoretical conception as a branch of media studies 
formulates a decidedly anti-program to the classical disciplines of art history,
musicology, and literary studies” (Mersch 2024, 205; our translation).
From there, the term branched out into different humanities discourses, as, 
for example, Hausken (2013) or Mersch (2024) have reconstructed in more 
detail. In light of the by now many different approaches to the analysis (and 
within the field) of media aesthetics, we will begin by exploring how the 
two components of the compound (i.e., “media” and “aesthetics”) can be 
understood both very narrowly and very broadly, before we conclude by em-
phasizing the potential productivity of “middle-ground” conceptualizations 
of both terms. While the chapters collected within the present volume might 
privilege one starting point over another, the purpose of this introduction is 
merely to outline the range of possible approaches toward the perceivable 
properties of AI-generated output: We would thus like to illustrate and in-
terrelate, with specific examples taken from existing research from the last 
couple of years, how explicit or implicit differences in the conceptualization 
of both “media” and “aesthetics” can result in quite heterogeneous positions
regarding what should be taken as “given”—and what, in contrast, should be 
considered to be a “matter of concern” (Latour 2004, 232). 

Narrow and Broad Conceptualizations of (AI) “Media” 
and (AI) “Aesthetics” 

Let us begin with the first component of the compound “media aesthetics,” 
then, which can initially be specified by distinguishing between a narrow and 
a broad conceptualization of “media.” In the narrow sense, any “medium” 
may be understood functionally, as “a tool or instrumentum that emerges from 
an end–means relationship and imposes itself on the real, processes it, and in 
doing so ‘produces’ (poiein) something else” (Mersch 2024, 214; our translation).
Perhaps needless to say, this already entails vastly different approaches
to media aesthetics, ranging from modernist theories of art to discourses of 
mass communication (see also Hausken 2013, 34). Yet, these different con-
ceptualizations nevertheless share a common point of departure, namely the 
notion that “media” are more or less determined entities (or materials or 
channels) for and between human as well as institutional actors (see, e.g., 
Elleström 2021). Media scholars may then try to assess the respective affor-
dances, limitations, and influences of this “in-betweenness,” be it positively 
(and often normatively) as a potential for artistic expression, or negatively 
(and often more descriptively) as the “distortion” of any assumed content or 
communicative intent within a sender–receiver model. Regardless of these 
(and many more) important differences, any narrow conceptualization of 
“media” would thus appear to start from given socio-cultural settings and “use 
cases,” trying to assess the (limiting or enabling) influences of the respective 
means of communication and interaction. In this view, AI image generators 
may appear as an alternative to other technologies of image production, and 
we might explore in which contexts, by which actors, for which means, and 
to which effects AI-generated images are employed in contrast to photography
or hand-drawn pictures (see, e.g., Wilde 2025), or how they are distributed,
contextualized, and discussed in the context of fan cultures, for example (see
Lamerichs 2023). Within such an already determined setting, we could also
find out that fearmongering AI-generated images of “foreigners” circulated by
right-wing parties on social media channels can seamlessly “replace” earlier
stock photography or racist hand-drawn pictures where they serve to instigate
attitudes and affects (fear, hatred) toward their depicted content that make
the latter relevant only as a type (of people, for example) (see Lemmes 2025).
Perceivable technological or more broadly formal differences (“aesthetics”) 
thus appear to be of only minor importance in some “use cases,” while they 
are much more relevant in others. 

In contrast, “media” in a broader sense are not already determined factors 
or elements within specific mediations, but “always already in play where
culturality happens” (Mersch 2024, 215; our translation), which means that 
we need to consider “media” as inescapable elements of our making sense of 
the world. Within the anglophone tradition, Mitchell and Hansen (2010) have 
propagated this as an “ecological” approach to media studies, considering its 
object an “encompassing environment” (Hausken 2013, 42): 

[A]re [media] better pictured as themselves the situation, an environment 
in which human experience and (inter)action take place? Would it not be 
better to see media, rather than as the determining factor in a cause and 
effect scenario, as an ecosystem in which processes may or may not take 
place? 

(Mitchell 2013, 18) 

Mersch (2024, 215) proposes to use the term “dispositive” in order to cap-
ture this broad conceptualization, as “media” in this sense are seen as po-
sitioning human subjects within the world and, in doing so, as creating or 
shaping their subjectivity—not only through technological means, but also, 
and more fundamentally, through a “semiotic formatting” of culture and so-
ciety (see also already Manovich 2001, 69–93; as well as, e.g., Crano 2020; 
Jeong 2013). Our questions with regard to such “media” thus likewise become 
considerably broader, perhaps oriented toward changing notions of reality, 
knowledge, and society (as “imagined” communities [see Anderson 1991]) 
that are accessible only in a mediated fashion. 

Returning to the area of generative AI, we could thus ask, for example, 
how notions of the “real” are transformed through the increase of AI-gener-
ated outputs. This is brought into sharp relief in Kirschenbaum’s warning of 
an imminent “textpocalypse” (2023, n.pag.) during which most texts online 
are no longer created by humans with any discernible “communicative in-
tent,” but by AI-based chatbots. This has also become a major concern with 
regard to countless novels sold via Amazon or “bands” whose music is avail-
able through “regular” streaming platforms such as Spotify, despite being en-
tirely AI-generated (see, e.g., Al-Sibai 2024; Knibbs 2024). As noted above, 
it should also be seen as a problem when more and more Google searches 
present AI-generated images whose “content” differs vastly from reality with-
out any specific designation (see, e.g., Growcoot 2023); when social media 
posts (“found in the wild”) are likewise mistaken for representations of reality 
(see, e.g., Bond 2024); or when influencer or company profiles turn out to 
be wholly AI-generated (see, e.g., Medlicott 2023). We are thus interested 
in the impact of a media environment increasingly saturated by generative 
AI, though this impact clearly cannot be reduced to individual AI-generated 
outputs. Instead, such outputs collectively contribute to creating a new “me-
dia reality” to which people and institutions will have to react in one way or 
another—which will most likely also have an impact on the perception of 
outputs that are not (or not exclusively) AI-generated.3 


Just as we can distinguish between broad and narrow conceptualizations of 
“media,” so could we start out from two similarly “radical” (if commonly pro-
posed) alternatives for conceptualizing the term “aesthetics” (which have also 
been previously discussed, in fairly similar terms, by Hausken [2013], Mersch 
[2024], and Schröter [2019a]).4 At first glance, then, the term “aesthetics” 
oscillates between a philosophy of art and a philosophy of perception. In a 
narrow (and often normative) conceptualization of “aesthetics-as-artistics,” 
the focus is on skill, judgment, and connoisseurship (see, e.g., Coeckelbergh 
2023; Manovich 2019). We might then ask whether or not, or to what degree, 
AI-generated or AI-augmented outputs have or can have artistic merit; who 
is the artist (or “author” [see, e.g., Bajohr 2024c; Barale 2024, 41–57]); what 
roles do the alleged intentions of any such actor (or their absence) play for 
any such assessment (see, e.g., Manovich and Arielli 2024; Moruzzi 2020); 
and which forms and practices of collaborative co-creation have “creative” 
potential (see, e.g., Feyersinger et al. 2023; Navas 2023). One particularly 
prominent concern here is how aesthetic judgment can be informed by po-
litical reasoning, for example, when AI imagery is generally disregarded as 
“slop” or as “inherently fascist” (see, e.g., Watkins 2025).5

In a broader sense, however, the term “aesthetics” is also increasingly 
used to refer to a more general theory of perception or “aisthesis.” Related 
to media (in both the broad and the narrow sense sketched above), such an 
“aesthetics-as-aisthetics” aims “to understand the complexity of sense per-
ception and its embeddedness in the cultures and histories of technologies of 
mediation” (Hausken 2013, 30–31), and could thus perhaps also be described 
as a “phenomenological” approach to media aesthetics. Kirschenbaum, for 
example, speculated whether our recent AI-driven “algorithmic conditioning” 
may have created (or may yet create) a “fundamental untethering of language 
from conditions of lived reality […], the moment when we question even 
that which we know to be bodily, palpably true because our screens—and 
our friends on our screens—say otherwise” (2025, 11–12). While it remains 
to be seen how generative AI addresses, negates, or otherwise interacts with 
the human senses and with our embodied perception (or embodied cognition 
more generally), one important line of already existing research argues that 
AI-generated images (and perhaps also music) are mostly about the remixing
of generic “styles” or “vibes” that reproduce conventional affects (see, e.g., 
Meyer 2023, 108). Following theorists such as Ahmed (2010), Biondi (2022), 
and Massumi (1995), we could then emphasize that “vibes […] make us feel 
a certain way. They have an energy that we like or don’t. We are surrounded 
by them. We are informed by them” (Biondi 2022, n.pag.; original emphases).
An “AI aisthetics” could thus investigate the impact of algorithmically pro-
duced “vibes” as computable affects (see also Grietzer 2025). 

Both narrow conceptualizations and both broad conceptualizations (of 
“media” and “aesthetics,” respectively) we have sketched thus far also appear
to be aligned with each other at least to some degree: An instrumental
conceptualization of “media” as carriers/materials for meaning and “expres-
sive intent” lends itself to “artistic” considerations (especially within “for-
malist” approaches to modernist art6); a postinstrumental conceptualization 
of media as dispositives or environments has a certain attraction to phenom-
enological theories of perception and embodiment. While the various forays 
into the generative AI discussions touched upon above might already be-
come more productive when undertaken against the background of these four 
well-established “radical” conceptualizations of “medium” and “aesthetics,”
respectively, we would like to present in slightly more detail two “middle-
ground” conceptualizations of these terms that seem particularly relevant in 
a generative AI context. Being “middle-ground” conceptualizations, they can 
each be located somewhere in-between the respective narrow and broad con-
ceptualizations of “medium” and “aesthetics” that we have sketched thus far. 

“Middle-Ground” Conceptualizations of (AI) “Media” 
and (AI) “Aesthetics” 

How, then, could we conceptualize “media” and “mediality” as neither nar-
rowly instrumental (as a means, channel, or material within a defined use-
context), nor as (perhaps too) broadly postinstrumental (as a dispositive, an 
environment, or a “condition” providing affordances to engage with the world 
physically, cognitively, and affectively)? As a third option in-between these 
“radical” extremes, we could instead approach specific technologies as net-
works of human and nonhuman actors that are open to various “use cases” 
and representational affordances, perhaps shaping (i.e., enabling or limiting) 
certain uses over others, but doing so through their specific situatedness in 
all the “domains of technology” outlined by Dhaliwal (2023). Such an ap-
proach to media and their mediality thus focuses not only on specific net-
works of human and nonhuman actors but also on the distribution of agency 
between them, and on how this distribution shapes specific affordances for
interaction, communication, and representation. Questions such as these have 
been discussed in terms of an actor-media-theory, modeled after the sociologi-
cal actor-network-theory (ANT), but with a specific focus on technologies of 
communication and interaction (see Wilde 2023; as well as, e.g., the contribu-
tions in Spöhrer and Ochsner 2017; Thielmann and Schüttpelz 2013). 

Within the theoretical framework of actor-media-theory, we would then 
consider AI technologies neither as mere (predetermined) instruments in a 
given use-case nor as (open and ubiquitous) dispositives of general(ized) 
media environments, but as specifically situated actor-media-networks. By 
following this approach, we can more effectively investigate how particu-
lar new and emerging technologies (hardware, software, and infrastructure), 
through their interfaces, serve as “midpoints” between the institutions behind 
them (companies, legal and economic frameworks, social roles with specific
hierarchies, etc.) and the outputs they generate. Conceptualizing media in this 
way thus helps us to acknowledge that, despite it being tempting to address 
AI-generated images as such, differences in models, versions, and platforms 
matter quite a bit. Much-discussed representational biases of LLMs, for ex-
ample (see, e.g., Bianchi et al. 2022; Hofmann et al. 2024; Katz 2025), emerge 
from a complex interaction between many different systems that are in prin-
ciple separate, even if we may not be able to see this in the resulting images, 
namely (a) training datasets (such as LAION-5B) with their existing image/ 
text-pairings, (b) pre-trained language models (such as CLIP) that assign de-
fault values to linguistic prompts (as tokens) to “understand” them through 
a high-dimensional vector within the latent space, and (c) the image models 
themselves (such as the Stable Diffusion models from “marketplaces” like 
CivitAI) that can be trained and “defaulted” differently even with recourse to 
the same dataset (see, e.g., Alammar 2022; Škripcová 2024; Song et al. 2024).

While we cannot necessarily reconstruct these infrastructures in all cases 
based on disclosed datasets, and while we might moreover not be able to de-
termine any causal input–output relation in the sense of an “explainable AI” 
(see, e.g., Ali et al. 2023; Zylinska 2020, 75–85), we should be careful not to 
give in to the temptations of what Offert and Dhaliwal describe as a black box 
casuistry in the context of AI discourse: 

“AI models are black boxes,” in 2024, sounds like a truism, and could yet 
not be further from the truth. Yes, AI models are complex systems, and 
yes, there is no easy way to infer, purely from the weights and biases of a 
neural network, what the model does, or what data it was trained on. But 
AI models rarely consist of just a single neural network, nor do they come 
into the world as entirely new systems, trained on entirely new data, with 
entirely new mechanisms. AI models are historical, maybe even ‘more his-
torical’ than many other technical objects. Every new model builds on an 
entire architectural history, a history of how things are done with the parts 
that are available. 

(Offert and Dhaliwal 2024, 5) 

While we might, for example, not be able to “look into” some datasets and 
models (such as OpenAI’s), we do know quite a bit about others, as Buschek 
and Thorp (2023) have reconstructed in more detail with regard to Midjour-
ney and Stable Diffusion. Both of the latter draw on the LAION-5B dataset 
of 5.85 billion CLIP-filtered image-text pairs, made available by researchers 
in 2022 (see Schuhmann et al. 2022) with the warning that they “do not 
recommend using it for creating ready-to-go industrial products” (Beaumont 
2023, n.pag.). However, as Buschek and Thorp (2023) explain, LAION-5B 
was itself built from an even larger dataset (containing data from over three 
billion websites) by another nonprofit organization (Common Crawl). Some 
commercial domains (such as Pinterest, Shopify, and SlidePlayer) were 
highly overrepresented in LAION, because they host many image-text pairings. 
Midjourney and Stable Diffusion, however, draw only on a subset of 
the LAION-5B foundation dataset called “LAION-Aesthetics” (consisting of 
roughly 15,000 images). This, in turn, was once more created using algorith-
mic filtering to select only images from the foundation set that were rated to 
be particularly “visually appealing,” according to parameters provided earlier 
by users of the Discord communities for GLIDE and Stable Diffusion. These 
users ranked and rated 238,000 (other) AI-generated images from yet another 
training set called “Simulacra Aesthetic Captions (SAC).” What this example 
shows is that, despite the appeal of black box casuistry within AI discourse, 
we know quite a bit about the “aesthetics” that any image in Midjourney or 
Stable Diffusion will “gravitate toward,” because we can trace them back to 
only “a handful of [very active] users” whose “aesthetic preferences dominate 
the dataset” (Buschek and Thorp 2023, n.pag.). 
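
The cascade of filters described above can be made more tangible with a small sketch. It is only an illustration under stated assumptions: the class `ImageTextPair`, the three functions, the thresholds, and all data values are hypothetical and are not taken from LAION, SAC, or Buschek and Thorp (2023). The "aesthetics" predictor is also not trained here; its output is simply stored as a number on each pair.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageTextPair:
    url: str
    caption: str
    clip_similarity: float  # hypothetical image/caption match score
    aesthetic_score: float  # hypothetical output of a learned "aesthetics" predictor


def clip_filter(pairs, min_similarity):
    """Keep only pairs whose caption is judged to match the image well enough."""
    return [p for p in pairs if p.clip_similarity >= min_similarity]


def aesthetic_subset(pairs, min_score):
    """Keep only pairs that the predictor rates as 'visually appealing'."""
    return [p for p in pairs if p.aesthetic_score >= min_score]


def rater_shares(ratings):
    """Map each rater to their share of all ratings, most active rater first."""
    counts = Counter(rater for rater, _image, _score in ratings)
    total = sum(counts.values())
    return {rater: n / total for rater, n in counts.most_common()}


# Invented toy data; none of these values come from LAION or SAC.
crawl = [
    ImageTextPair("https://example.org/a.jpg", "sunset over a lake", 0.35, 6.1),
    ImageTextPair("https://example.org/b.jpg", "click here", 0.10, 7.0),
    ImageTextPair("https://example.org/c.jpg", "product photo of a mug", 0.40, 4.2),
]
ratings = [
    ("user_1", "img_1", 8),
    ("user_1", "img_2", 7),
    ("user_1", "img_3", 9),
    ("user_2", "img_4", 5),
]

# Thresholds are arbitrary assumptions chosen for the toy data.
foundation = clip_filter(crawl, min_similarity=0.3)   # stands in for the foundation set
subset = aesthetic_subset(foundation, min_score=5.0)  # stands in for the "aesthetics" subset
shares = rater_shares(ratings)

print([p.url for p in subset])
print(shares)
```

Even in this toy form, two points from the discussion above remain visible: each stage only ever sees what the previous stage let through, and a per-rater tally shows at a glance whether a few very active users account for most of the judgments on which an "aesthetics" score rests.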

Having located our conceptualization of actor-media-networks in be-
tween instrumental and postinstrumental conceptualizations of “media,” we 
would similarly like to offer a conceptualization of “aesthetics” as neither an 
“artistics” that is primarily concerned with aesthetic judgments (related to skill 
and connoisseurship), nor as an “aisthetics” that conflates aesthetic perception 
with perception (or aisthesis) in toto (see also Thon 2025). Drawing on Martin 
Seel’s influential proposal to distinguish aesthetic from nonaesthetic percep-
tion via the former’s “self-referentiality” or “sensing self-awareness” that ties 
“[t]he special presence of the object of perception […] to a special presence 
of the exercise of this perception” (Seel 2005, 31; original emphases), we can 
instead conceptualize aesthetics as being concerned not with perception (or 
aisthesis) in general, but rather with a specific kind of perception (i.e., aes-
thetic perception).7 While there is no one-to-one relation between this kind of 
“self-referential” aesthetic perception and the more or less “self-referential” 
form of aesthetic artifacts or objects, broadly conceived, we would further 
suggest that AI-generated outputs that foreground, to varying degrees and de-
pendent on context and use, their “formatting” or “style” as opposed to their 
“content” or “subject” could be described as following a logic of (opaque) 
hypermediacy as opposed to a logic of (transparent) immediacy sensu Bolter 
and Grusin (1999). Such AI-generated outputs might then be more interesting 
from the perspective of a “middle ground” AI aesthetics than those AI-gener-
ated outputs that do not foreground their “formatting” or “style.”8 

If, hypothetically, we prompted ChatGPT o1 to briefly explain how the 
term “AI aesthetics” could be understood, the text we would receive after 
it “thought about it for a second” might well appear to be largely transpar-
ent to us within what could be described as standard “use cases” for such an 
explanation. Within such standard “use cases,” we might focus on assess-
ing the propositions, concepts, or pieces of information “contained” in the 
text, allowing us to abstract to a certain degree from the form of the specific 
AI-generated output, potentially even across specific languages such as 
English or German (for abstractions as a set of medial operations and material 
practices, see Schröter 2019b). The AI-generated output would thus become 
transparent to a certain degree, relative to a given “use case” or a “medial 
operation,” in that it would “not seem to change at least with some changes 
in the materiality” (Schröter 2019b, 26). Similar observations apply to AI-
generated images: The infamous AI-generated “baby peacock,” which does 
not represent anything looking like an actual specimen of this genus, but takes 
the form of a kind of fictional “Pokémon” in which the appearance of an adult 
male peacock has been merged with pronounced attributes associated with 
the quality of “cuteness” (see Larsen 2023), is not discussed as a problem 
because of “stylistic” allusions to a photographic representation, but because 
of its abstractable features which would not even serve its purpose as an ad-
equate illustration—in any perceivable image style. To the extent that “we are 
interested in the information the image, and the image in combination with 
the text, gives us” (Schröter 2019b, 28), we can thus once again abstract from 
the form, “formatting,” or “style” of the image and toward its potential to illustrate 
what any “real” baby peacock generally looks like, and what any baby 
peacock picture that affords such an operation should look like.9 

Transparency and abstraction will always remain matters of degree (see 
Schröter 2019b, 32), but degree here does not imply indifference. As a con-
trasting example of how much more foregrounded the form, “formatting,” or 
“style” of AI-generated outputs may be (in other words, how much less trans-
parent and abstractable the AI-generated outputs in question may appear), we 
could (again, hypothetically) instruct ChatGPT o1 to generate an explanation 
of AI aesthetics in the form of a haiku or a 3-panel-comic strip. The results of 
such prompts are likely to be quite opaque to the degree that they will fore-
ground or, indeed, imitate the form of “other media” such as a specific type of 
poetry (with 5-7-5 syllables and a comparison to nature) or a script detailing 
the (absent) content of sequential images and speech bubbles. When we want 
to assess the degree of self-referentiality, opacity, or hypermediacy of an AI-
generated output relative to medial practices, “use cases,” and the degree to 
which they allow us to abstract from the perceivable formatting of the output, 
then the question of how “transparent” any given output is remains relative to 
conventions—perhaps cultural “protocols”10—of media use. 

In discussions within social media comment sections, for example, remarks 
such as “this article feels like it was at least partially AI written […]. That is 
exactly the type of it-literally-doesn’t-mean-anything filler that LLMs love 
to insert into text” (DeedleFake 2025, n.pag.) have become quite frequent. 
They retroactively add a hypermediacy-oriented, opaque, self-referential 
perspective to our initially transparent hypothetical example above. Not only 
does the “default style” for AI-generated images—that is, the “style” em-
ployed without any specific “style prompt”—change considerably between 
platforms and models, but the sociocultural conventions of what counts as a 
“transparent” text or image (and which could, thus, perhaps be perceived as 
comparatively “non-aesthetic”) do as well. Indeed, “[f]or these models, the 
‘photographic’ seems to be just another ‘style’, an aesthetic, a certain ‘look’, 
not a privileged mode of indexical access to the world” (Meyer 2023, 108). 
What could be described as a “photographic aesthetics” or a “photographic 
form” is generally perceived as more transparent than drawings in contem-
porary media culture,11 but this is less some inherent technological property 
of photochemical trace-recordings than it is the result of the dominance of 
images that “look” photographic in many medial contexts (even though they 
also might be CGI, photoshopped, and/or AI-generated). However, their per-
ceivable medial forms (that are often not foregrounded and thus comparably 
transparent) have accumulated and inherited photography’s “protocols” that 
make them abstractable toward what they seem to represent, “even if the read-
ing of that form as natural is culturally conditioned” (Wasielewski 2024, 15; 
see also Hausken 2024). Drawing a distinction between form, “formatting,” or 
“style,” on the one hand, and representational content, on the other, by focus-
ing on “use cases” relative to conventionalized media practices also avoids 
the problem of having to depart from any projected “meaning” within AI-
generated outputs (in contrast to their form), which current models arguably 
have no understanding of (see Bender et al. 2021). 

Conclusion(s) 

In offering a survey of different (sometimes explicit, more often implicit) 
conceptualizations of “AI aesthetics” that underlie existing research on 
AI-generated outputs, we have tried to show that how we conceptualize 
both (AI) “media” and (AI) “aesthetics” will saliently inform our methodo-
logical stance by allowing us to draw different distinctions between what 
we (more or less readily) assume as “given”—and what, in contrast, we 
consider a “matter of concern” (Latour 2004, 232). The “middle-ground” 
conceptualization of “media” as actor-media-networks that we propose as 
a potential alternative to narrowly instrumental or broadly postinstrumental 
conceptualizations takes its starting point neither from a given “use case” 
nor from an assumed AI-saturated media environment, but from the af-
fordances of specific technologies, platforms, and models—their “default 
configurations” that are nevertheless open to countless diverging uses. The 
“middle-ground” conceptualization of aesthetics as concerned with self-
referential aesthetic perception that we consider as a potential alternative to 
artistics-oriented and aisthetics-oriented conceptualizations likewise takes 
as its starting point specific conventions and practices of media use, while 
contrasting those where the “protocols” and “use cases” are more embed-
ded in “artistic” practices (which usually do foreground their perceivable 
medial forms) to those that are more closely connected to instrumental 
practices (which often afford a higher degree of abstraction toward some 
information, proposition, or other representational content, including an 
allegedly represented reality). Whether such protocols can remain stable 
when certain altermedial “formattings” or “styles” are imitated through 
generative AI remains a question that needs to be investigated for specific 
technological and usage contexts. 

With this in mind, we would like to conclude by tentatively proposing, 
again, that the area of “AI aesthetics”—within the framework of media aes-
thetics and, more specifically, with regard to AI-generated or AI-augmented 
outputs—can be accessed from at least six different directions, with the under-
lying conceptualization of “AI aesthetics” arguably also suggesting a privi-
leging of particular methodological stances (or ways of inquiry) over others 
when investigating the perceivable (aisthetic or indeed aesthetic) properties 
of AI-generated outputs: 

1 Instrumental (AI) media: This conceptualization may prioritize starting 
out from a given “use case” of communication and interaction and then in-
vestigating the perceivable properties of AI-generated outputs that enable, 
distort, or facilitate the respective processes of mediation. 

2 Actor-(AI) media-networks: This conceptualization may prioritize start-
ing out from a given technology, in all its complex and multidimen-
sional situatedness, and then investigating how its perceivable output 
affordances and defaults are related to the (“invisible”) materiality, 
infrastructures, and socio-cultural institutions that afford it—and vice 
versa. 

3 (AI) media dispositives: This conceptualization may prioritize starting out 
from a given (increasingly) AI-saturated media environment and then in-
vestigating its ramifications on society, culture, politics, and the perceiv-
able properties of all media forms situated therein. 

4 Artistic (AI) media: This conceptualization may prioritize starting out 
from given aesthetic judgments that are connected to notions of skill and 
connoisseurship (including discourses around creativity, originality, and 
politics) and then investigating to what degree and under which assump-
tions AI-generated outputs are appreciated or dismissed. 

5 Self-referential (AI) media: This conceptualization may prioritize start-
ing out from different media “use cases” and practices and then inves-
tigating to what degree and through which means AI-generated outputs 
highlight aspects of their perceivable form, “formatting,” or “style” and 
thus invite self-referential aesthetic perception rather than encouraging 
abstraction. 

6 Aisthetic (AI) media: This conceptualization may prioritize starting out 
from any type of situated interaction between humans and AI-generated 
or AI-augmented outputs (or, indeed, the interfaces of generative AI plat-
forms more broadly) and then investigating how sense perception, embod-
ied experiences, and affects are addressed, negated, or modulated therein. 


The present volume aims to represent all of these concerns as it includes 
chapters that move within and across the six conceptualizations of “AI aes-
thetics” presented here in various ways. It thus reflects not only on the theo-
retical but also on the methodological implications of AI aesthetics. At the 
same time, however, it demonstrates that this is still very much an emerging 
research field and that no dominant conceptualization of “AI aesthetics” has 
yet emerged. 

Notes 
1 As a case in point, AI image generators are perhaps primarily remarkable in terms 
of the quantity and speed with which they generate images. The deluge of AI-generated 
images might then appear too arbitrary and ephemeral to deserve sustained 
individual attention or in-depth analysis at first glance, perhaps contributing to a 
privileging of more quantitative and social science–oriented methods within the 
field of critical AI studies. It is worth noting, however, that within the specifically 
humanities-oriented methodological context of what Bajohr describes as “promp-
tology” (2023, 67), natural language commands can also be used to probe the “latent 
space” of AI image generators, with individual AI-generated images then becom-
ing “readable” as representations of an “underlying” cultural or sociotechnological 
imaginary (see, e.g., Ervik 2023; Offert 2023; Salvaggio 2023). 

2 Broadly speaking, the mediality of generative AI platforms manifests itself in the 
form of a more or less specific communicative “frontend” or interface that mediates 
between the social-institutional “systems and structures” as well as the “machines 
and devices” (hardware and software), on the one hand, and perceiving users 
(humans), on the other hand (see, e.g., Hookway 2014; Wirth 2016; 2023). These 
interfaces, in turn, allow for the production of the outputs that AI platforms were 
trained to generate in various semiotic modes such as written texts, images, or sounds 
(see Bateman et al. 2017; Forceville 2021; Kress 2023). 

3 Bajohr (2024a), for example, suggests that we might soon enter an age of “postarti-
ficial texts,” in which authors will always be under suspicion to have used LLMs for 
their writing, even and perhaps especially when they categorically claim to abstain 
from such practices, so that, perhaps, this very distinction will lose its significance 
(see also Köbis and Mossink 2021). Among other things, one could then assume that 
this will most likely also be reflected in the prevalence of different kinds of writing 
styles or textual aesthetics (including, for example, a greater emphasis on autofic-
tion or a less “probable” or “typical” diction), regardless of whether generative AI 
was in play or not—or whether we will ever know if it was with certainty. 

4 Schröter’s distinction between a “strong” conceptualization of “media aesthetics as 
‘aisthesis’” (Schröter 2019a, n.pag.) and a “weak” conceptualization of media aes-
thetics connected to “a specific use of the medium for the purpose of aesthetic per-
ception” (Schröter 2019a, n.pag.) is particularly relevant here, not least because he 
also emphasizes the need to explore a “middle ground” between these two extremes. 
That said, while Schröter identifies Seel as a key proponent of this “weak” concep-
tualization of media aesthetics, we would perhaps locate Seel’s (2005) approach 
closer to a “middle-ground” and would, in any case, not follow Schröter’s argument 
that a “medium kind of media aesthetics” should be (exclusively) “concerned with 
an aesthetics, even aisthetics, of pre-digital media, which become visible (and audi-
ble) once more through their transposed digital repetition” (Schröter 2019a, n.pag.). 
See also Thon (2025) for a more detailed discussion of Schröter (2019a) vis-à-vis 
Seel (2005). 


5 Apart from the racist, sexist, and other biases that can still often be observed in the 
content as well as the form of AI-generated images, important concerns include 
that the production of AI content is hurting (creative) workers, devours millions of 
gallons of water, and releases thousands of tons of CO2 into the atmosphere annu-
ally (see, e.g., Crawford 2021; Coeckelbergh 2022). It also seems undeniable that 
AI-generated images have become particularly popular with right-wing parties and 
politicians around the globe during the past one and a half years, from Donald 
Trump to Britain First to the German AfD party (see, again, Watkins 2025), and 
that there are clear structural alignments between AI technologies and what could 
be described as a neofascist re-ordering of governments (see, e.g., Kirschenbaum 
2025; McQuillan 2022; Salvaggio 2025). 

6 While discussions around formalism in aesthetics have often focused on (modernist) 
painting, there are many theoretically sophisticated proposals to be found here (see, 
e.g., Curtin on “pure” and “mixed formalism” [1982, 321], Wollheim’s distinction 
between “Normative Formalism,” “Analytic Formalism,” “Manifest Formalism,” 
and “Latent Formalism” [2001, 127], Zangwill’s defense of a “moderate formalism” 
[2001, 55], Thomson-Jones’s discussion of the resurgence of “[s]ophisticated 
formalism” [2005, 375], and Nanay’s argument for what he calls “semi-formalism” 
[2016, 97]). There is also a broader “formalist” discourse in literary, cultural, and 
media studies often particularly interested in Shklovsky’s (2012) concept of os-
tranenie (or “making strange”). See also, once again, Thon (2025) for a more de-
tailed reconstruction. 

7 Other accounts of aesthetic as opposed to nonaesthetic perception are certainly 
available (see, e.g., Nanay’s account of “aesthetic attention as distributed attention” 
[2016, 26]), but Seel’s conceptualization of the former as a “sensing self-aware-
ness” (2005, 31) seems particularly productive for our present purposes. Against 
the background of Schröter’s critique of what he perceives as Seel’s focus on “a 
specific use of the medium for the purpose of aesthetic perception” (Schröter 2019a, 
n.pag.), however, it is worth stressing that Seel emphasizes that “this sensing [self-
awareness] has not yet anything to do with a reflexive self-referentiality, although 
this is often the case here too, especially in the context of art” (2005, 31; original 
emphasis). See also, once more, the detailed discussion in Thon (2025). 

8 Bolter and Grusin (1999) not only argue, following McLuhan (1964), that so-called 
new media remediate the “content” and “form” of older media in various ways, 
but also postulate a “double logic of remediation” (Bolter and Grusin 1999, 31), 
which among other things allows us to locate concrete AI-generated outputs be-
tween the poles of transparent “immediacy” and opaque “hypermediacy.” While 
the term “immediacy” broadly refers to the deemphasizing of the form, “format-
ting,” or “style” of a representation compared to its representational content that 
“either […] erase[s] or […] render[s] automatic the act of representation” (Bolter 
and Grusin 1999, 33), the term “hypermediacy” refers to representations that fore-
ground “acts of representation and mak[e] them visible,” “multipl[y] the signs of 
mediation” (Bolter and Grusin 1999, 34), and thus draw our attention to their form, 
“formatting,” or “style.” Yet again, see Thon (2025) for a more detailed discussion 
and an argument that representations following the “logic of hypermediacy” more 
strongly than the “logic of immediacy” may more readily instigate aesthetic as op-
posed to “merely” nonaesthetic processes of perception in their recipients. 

9 The idea that the communicative function of pictures could be described in similar 
ways as linguistic predicates has been discussed controversially in picture theory 
(see, e.g., Wilde 2021). Since pictorial signs communicate, by necessity (at least 
to some degree), the visual appearance(s) of the depicted objects or scenes, some 
considered “predication” (“to illustrate,” “to visualize,” or “to exemplify”) as 
the core of pictoriality (see, e.g., Novitz 1977; Sachs-Hombach 2003, 185–187). 
Others, in contrast, objected that seeing a “picture-elephant” was very different 
from seeing a set of predicates such as “has a long trunk” or “is an animal” (see, 
e.g., Abel 2004, 361–369; Elkins 1998, 3–46). It should be uncontroversial, how-
ever, that “predication” is a frequently employed (although, depending on termi-
nological specification, perhaps not necessary) communicative function of pictures 
(see, e.g., Krebs 2015). 

10 See Gitelman (2006) on the role of “protocols” in a historically oriented conceptualization 
of “media.” Galloway, too, suggests that the term “protocol” may refer to 
any kind of “correct or proper behavior within a specific system of conventions” 
(2004, 7), which a medium arguably becomes once it is culturally established and 
widespread enough. Cavell (1971, 101–108) similarly speaks of “automatisms” that 
every medium accumulates and stabilizes, and which, just like “protocols,” can be 
technologically implemented or supported, but can also remain on the level of cul-
tural conventions (see also Rodowick 2007, 41–46). They thus entail not only the 
typical uses of (certain) media products but also the established routines of produc-
tion, distribution, and reception. 

11 There are, of course, long-standing discussions around the supposed transparency 
of photographic (and other) pictures, which are also closely connected to complex 
questions around “(photo)realism.” Walton has offered a particularly influential 
account of the former when he argues that “photography is indeed special, and 
that it deserves to be called a supremely realistic medium,” but is so and does 
so because “[p]hotographs are transparent” in that “[w]e see the world through 
them” (1984, 251, original emphases). Yet, while AI-generated images may still 
employ “photorealism” in the sense of an “aesthetic term that denotes a visual 
style,” and thus “mimic photographs without being photographs” (Hausken 2024, 
2), it seems clear enough that we do not “see the world through them” (Walton 
1984, 251, original emphasis), at least not in any intuitively plausible sense of this 
phrase. 

Works Cited 
Abel, Günter. 2004. Zeichen der Wirklichkeit. Frankfurt am Main: Suhrkamp. 

Ahmed, Sara. 2010. The Promise of Happiness. Durham: Duke University Press. 

Ali, Sajid, Tamer Abuhmed, Shaker El-Sappagh, et al. 2023. “Explainable Artificial 
Intelligence (XAI): What We Know and What Is Left to Attain Trustworthy Artificial 
Intelligence.” Information Fusion 99: 1–52. 

Alammar, Jay. 2022. “The Illustrated Stable Diffusion.” jalammar.github.io, October 4, 
2022. https://jalammar.github.io/illustrated-stable-diffusion. 

Al-Sibai, Noor. 2024. “Man Arrested for Creating Fake Bands with AI, Then Making 
$10 Million by Listening to Their Songs with Bots.” Futurism, June 9, 2024. https:// 
futurism.com/man-arrested-fake-bands-streams-ai. 

Anderson, Benedict R. 1991. Imagined Communities: Reflections on the Origin and 
Spread of Nationalism. 2nd ed. London: Verso. 

Bajohr, Hannes. 2023. “Dumb Meaning: Machine Learning and Artificial Semantics.” 
IMAGE: The Interdisciplinary Journal of Image Sciences 37 (1): 58–70. 

Bajohr, Hannes. 2024a. “On Artificial and Post-Artificial Texts: Machine Learning and 
the Reader’s Expectations of Literary and Non-Literary Writing.” Poetics Today 
45 (2): 331–361. 

Bajohr, Hannes. 2024b. “Operative Ekphrasis: The Collapse of the Text/Image Distinc-
tion in Multimodal AI.” Word & Image 40 (2): 77–90. 

Bajohr, Hannes. 2024c. “Writing at a Distance: Notes on Authorship and Artificial Intel-
ligence.” German Studies Review 47 (2): 315–337. 

Bajohr, Hannes, and Moritz Hiller (eds.). 2024. Das Subjekt des Schreibens: Über 
Große Sprachmodelle. Special issue TEXT+KRITIK: Zeitschrift für Literatur 
X/24. 

Balkowitsch, Shane. 2024. “AI Is Corrupting the Internet as We Know It.” PetaPixel, 
April 25, 2024. https://petapixel.com/2024/04/25/ai-is-corrupting-the-internet-as-we- 
know-it/. 

Barale, Alice. 2024. The Art of Artificial Intelligence: Philosophical Keywords. Cam-
bridge: Cambridge Scholars Publishing. 

Bareis, Jascha, and Christian Katzenbach. 2021. “Talking AI into Being: The Narratives 
and Imaginaries of National AI Strategies and Their Performative Politics.” Science, 
Technology, & Human Values 47 (5): 855–881. 

Bateman, John, Janina Wildfeuer, and Tuomo Hiippala (eds.). 2017. Multimodality: 
Foundations, Research and Analysis: A Problem-Oriented Introduction. Berlin: De 
Gruyter. 

Beaumont, Romain. 2023. “LAION-5B: A New Era of Open Large-Scale Multi-Modal 
Datasets.” Laion.ai, March 31, 2023. https://laion.ai/blog/laion-5b/. 

Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitch-
ell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too 
Big?” FAccT ’21: Proceedings of the 2021 ACM Conference on Fairness, Account-
ability, and Transparency, 610–623. 

Bianchi, Federico, Pratyusha Kalluri, Esin Durmus, et al. 2022. “Easily Accessible 
Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale.” 
arXiv:2211.03759, November 7, 2022. https://doi.org/10.48550/arXiv.2211.03759. 

Biondi, Zachary [Mac]. 2022. “The Philosophy of Vibes.” The Vim Blog, July 11, 
2022. https://thevimblog.com/2022/07/11/the-philosophy-of-vibes. 

Bolter, Jay David. 2023. “AI Generative Art as Algorithmic Remediation.” IMAGE: 
The Interdisciplinary Journal of Image Sciences 37 (1): 195–207. 

Bolter, Jay David, and Richard Grusin. 1999. Remediation: Understanding New Media. 
Cambridge, MA: MIT Press. 

Bond, Shannon. 2024. “AI-Generated Spam Is Starting to Fill Social Media: Here’s Why.” 
NPR, May 14, 2024. https://www.npr.org/2024/05/14/1251072726/ai-spam-images- 
facebook-linkedin-threads-meta. 

Broinowski, Anna. 2022. “Deepfake Nightmares, Synthetic Dreams: A Review of Dys-
topian and Utopian Discourses around Deepfakes, and Why the Collapse of Reality 
May Not Be Imminent—Yet.” Journal of Asia-Pacific Pop Culture 7 (1): 109–113. 

Buschek, Christo, and Jer Thorp. 2023. “Models All the Way Down.” Knowing Ma-
chines, April 9, 2023. https://knowingmachines.org/models-all-the-way. 

Cavell, Stanley. 1971. The World Viewed: Reflections on the Ontology of Film. Cam-
bridge, MA: Harvard University Press. 

Chesher, Chris, and César Albarrán-Torres. 2023. “The Emergence of Autolography: 
The ‘Magical’ Invocation of Images from Text through AI.” Media International 
Australia 189 (1): 57–73. 

Coeckelbergh, Mark. 2022. The Political Philosophy of AI. Cambridge: Polity Press. 

Coeckelbergh, Mark. 2023. “The Work of Art in the Age of AI Image Generation: Aesthetics 
and Human-Technology Relations as Process and Performance.” Journal of 
Human Technology Relations 1 (1): 1–13. 

Coeckelbergh, Mark, and David J. Gunkel. 2025. Communicative AI: A Critical Intro-
duction to Large Language Models. Cambridge: Polity Press. 

Crano, Ricky. 2020. “Dispositif.” Oxford Research Encyclopedias, August 27, 2020. 
https://doi.org/10.1093/acrefore/9780190201098.013.1026. 

Crawford, Kate. 2021. Atlas of AI: Power, Politics, and the Planetary Costs of Artificial 
Intelligence. New Haven: Yale University Press. 

Curtin, Deane W. 1982. “Varieties of Aesthetic Formalism.” The Journal of Aesthetics 
and Art Criticism 40 (3): 315–326. 

DeedleFake. 2025. “Even from This Tiny Snippet …” X, July 6, 2025. https://x.com/ 
DeedleFake/status/1941860561563558095. 

Dhaliwal, Ranjodh Singh. 2023. “What Do We Critique When We Critique Technol-
ogy?” American Literature 95 (2): 305–319. 

Dornis, Tim W., and Sebastian Stober. 2024. “Copyright Law and Generative AI Train-
ing—Technological and Legal Foundations.” NOMOS Open Access Books, August 
29, 2024. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4946214. 

Elkins, James. 1998. On Pictures and the Words That Fail Them. Cambridge: Cam-
bridge University Press. 

Elleström, Lars. 2021. “The Modalities of Media II: An Expanded Model for Under-
standing Intermedial Relations.” In Beyond Media Borders, vol. I: Intermedial Rela-
tions among Multimodal Media, edited by Lars Elleström, 3–91. Cham: Springer. 

Ervik, Andreas. 2023. “Generative AI and the Collective Imaginary: The Technology-
Guided Social Imagination in AI-Imagenesis.” IMAGE: The Interdisciplinary Jour-
nal of Image Sciences 37 (1): 42–57. 

Feyersinger, Erwin, Lukas Kohmann, and Michael Pelzer. 2023. “Fuzzy Ingenuity: 
Creative Potentials and Mechanics of Fuzziness in Processes of Image Creation with 
Text-to-Image Generators.” IMAGE: The Interdisciplinary Journal of Image Sci-
ences 37 (1): 135–149. 

Forceville, Charles. 2021. “Multimodality.” In The Routledge Handbook of Cognitive 
Linguistics, edited by Xu Wen and John R. Taylor, 676–687. New York: Routledge. 

Galloway, Alexander. 2004. Protocol: How Control Exists after Decentralization. Cam-
bridge, MA: MIT Press. 

Gitelman, Lisa. 2006. Always Already New: Media History and the Data of Culture. 
Cambridge, MA: MIT Press. 

Grietzer, Peli. 2025. “A Theory of Vibe.” In Thinking with AI: Machine Learning the 
Humanities, edited by Hannes Bajohr, 20–32. London: Open Humanities Press. 

Growcoot, Matt. 2023. “AI Image of Tiananmen Square’s Tank Man Rises to the Top of 
Google Search.” PetaPixel, September 27, 2023. 
https://petapixel.com/2023/09/27/ai-image-of-tiananmen-squares-tank-man-rises-to-the-top-of-google-search. 

Hausken, Liv. 2013. “Introduction.” In Thinking Media Aesthetics: Media Studies, Film 
Studies and the Arts, edited by Liv Hausken, 29–50. Berlin: Peter Lang. 

Hausken, Liv. 2024. “Photorealism versus Photography: AI-Generated Depiction in the 
Age of Visual Disinformation.” Journal of Aesthetics and Culture 16 (1): 1–13. 

Hofmann, Valentin, Pratyusha Ria Kalluri, Dan Jurafsky, and Sharese King. 2024. “AI 
Generates Covertly Racist Decisions about People Based on Their Dialect.” Nature 
633: 147–154. 

Hookway, Branden. 2014. Interface. Cambridge, MA: MIT Press. 

Jeong, Seong-hoon. 2013. Cinematic Interfaces: Film Theory after New Media. New 
York: Routledge. 

Johnson, Colin, Nereida Rodríguez-Fernández, and Sérgio M. Rebelo (eds.). 2023. Ar-
tificial Intelligence in Music, Sound, Art and Design. Cham: Springer. 

Katz, Yarden. 2025. Artificial Whiteness: Politics and Ideology in Artificial Intelli-
gence. New York: Columbia University Press. 

Kirschenbaum, Matthew. 2023. “Prepare for the Textpocalypse.” The Atlantic, March 8, 
2023. https://www.theatlantic.com/technology/archive/2023/03/ai-chatgpt-writing-language-models/673318/. 

Knibbs, Kate. 2024. “Scammy AI-Generated Book Rewrites Are Flooding Amazon.” 
Wired, January 18, 2024. 
https://www.wired.com/story/scammy-ai-generated-books-flooding-amazon. 

Köbis, Nils, and Luca D. Mossink. 2021. “Artificial Intelligence versus Maya Ange-
lou: Experimental Evidence That People Cannot Differentiate AI-Generated from 
Human-Written Poetry.” Computers in Human Behavior 114: 106553: n.pag. 

Krebs, Jakob. 2015. “Visual, Pictorial, and Information Literacy.” IMAGE: The Inter-
disciplinary Journal of Image Sciences 22: 7–25. 

Kress, Gunther. 2023. “Multimodal Discourse Analysis.” In The Routledge Handbook 
of Discourse Analysis, edited by Michael Handford and James Paul Gee, 35–50. New 
York: Routledge. 

Lamerichs, Nicolle. 2023. “Generative AI and the Next Stage of Fan Art.” IMAGE: The 
Interdisciplinary Journal of Image Sciences 37 (1): 150–164. 

Larsen, Luke. 2023. “Fake AI Images Are Showing Up in Google Search—and It’s 
a Problem.” Digitaltrends, November 28, 2023. 
https://www.digitaltrends.com/computing/fake-ai-images-showing-in-google-search. 

Latour, Bruno. 2004. “Why Has Critique Run Out of Steam? From Matters of Fact to 
Matters of Concern.” Critical Inquiry 30: 225–248. 

Lehmuskallio, Asko, Jukka Häkkinen, and Janne Seppänen. 2019. “Photorealistic 
Computer-Generated Images Are Difficult to Distinguish from Digital Photographs: 
A Case Study with Professional Photographers and Photo-Editors.” Visual Commu-
nication 18 (4): 427–451. 

Lemmes, Marcel. 2025. “Beyond the Hyperreal: Digitale Bilder und KI – Eine Heraus-
forderung für die Bildsemiotik?” In Bilder im Aufbruch: Herausforderungen der 
Bildwissenschaft, edited by Marcel Lemmes, Stephan Packard, and Klaus Sachs-
Hombach, 320–352. Cologne: Halem. 

Lin, Tsen-Fang, and Liang-Bi Chen. 2024. “Harmony and Algorithm: Exploring the 
Advancements and Impacts of AI-Generated Music.” IEEE Potentials 43 (6): 23–30. 

Lindgren, Simon. 2024. Critical Theory of AI. Cambridge: Polity. 

Manovich, Lev. 2001. The Language of New Media. Cambridge, MA: MIT Press. 

Manovich, Lev, and Emanuele Arielli. 2024. Artificial Aesthetics: Generative AI, Art 
and Visual Media. https://manovich.net/index.php/projects/artificial-aesthetics. 

Marx, Leo. 1997. “‘Technology’: The Emergence of a Hazardous Concept.” Social Re-
search 64 (3): 965–988. 

Massumi, Brian. 1995. “The Autonomy of Affect.” Cultural Critique 31: 83–109. 

Kirschenbaum, Matthew. 2025. “The US of AI.” Public Draft, February 25, 2025. 
https://drive.google.com/file/d/1O2qkjhg7Ei5zZWmBraNwXq4V0lTauspN/view?fbclid=IwY2xjawIta99leHRuA2FlbQIxMQABHaYasYRdObXQMDhxLA663f-olOolfNYK5ZXWLyBJxOBGkuKu_ol9i6d65A_aem_vFeS34lZx5krTC3F24tqGQ.

Manovich, Lev. 2019. AI Aesthetics. Moscow: Strelka Press. 
https://manovich.net/index.php/projects/ai-aesthetics.

McLuhan, Marshall. 1964. Understanding Media: The Extensions of Man. New York: 
McGraw-Hill. 

McQuillan, Dan. 2022. Resisting AI: An Anti-Fascist Approach to Artificial Intelli-
gence. Bristol: Bristol University Press. 

Medlicott, Jenny. 2023. “‘Spanish Influencer’ Created Entirely by AI Generates Its Model-
ling Agency £9,000 a Month with 200,000 Followers.” LBC, December 4, 2023. 
https://www.lbc.co.uk/news/ai-artificial-intelligence-model-influencer-arts-nine-thousand-pounds/. 

Mersch, Dieter. 2024. “Medienästhetiken: Entwurf einer Systematisierung.” Interna-
tionales Jahrbuch für Medienphilosophie und Medienästhetik 2024: 203–228. 

Meyer, Roland. 2023. “The New Value of the Archive: AI Image Generation and the 
Visual Economy of ‘Style’.” IMAGE: The Interdisciplinary Journal of Image Sci-
ences 37 (1): 100–111. 

Meyer, Roland. 2024. “Spekulative Strategien: KI-Bilder, Memesis und wilde Foren-
sis.” Fotogeschichte 172: 38–44. 

Mitchell, William J. 1992. The Reconfigured Eye: Visual Truth in the Post-Photographic 
Era. Cambridge, MA: MIT Press. 

Mitchell, William J.T. 2013. “Foreword: Media Aesthetics.” In Thinking Media Aesthet-
ics: Media Studies, Film Studies and the Arts, edited by Liv Hausken, 15–27. Berlin: 
Peter Lang. 

Mitchell, William J.T., and Mark B.N. Hansen. 2010. “Introduction.” In Critical Terms 
for Media Studies, edited by William J.T. Mitchell and Mark B.N. Hansen, vii–xxii. 
Chicago: University of Chicago Press. 

Moruzzi, Caterina. 2020. “Can a Computer Create a Musical Work? Creativity and 
Autonomy of AI Software for Music Composition.” In The Age of Artificial Intelli-
gence: An Exploration, edited by Steven S. Gouveia, 161–176. Wilmington: Vernon 
Press. 

Nanay, Bence. 2016. Aesthetics as Philosophy of Perception. Oxford: Oxford Univer-
sity Press. 

Navas, Eduardo. 2023. The Rise of Metacreativity: AI Aesthetics after Remix. New 
York: Routledge. 

Nayar, Vilasini. 2025. “The Ethics of AI Generated Music: A Case Study on Suno AI.” 
GRACE: Global Review of AI Community Ethics 3 (1): 1–22. 

Nichols, Bill. 1991. Representing Reality: Issues and Concepts in Documentary. 
Bloomington: Indiana University Press. 

Novitz, David. 1977. Pictures and Their Use in Communication: A Philosophical 
Essay. The Hague: Nijhoff. 

Offert, Fabian. 2023. “On the Concept of History (in Foundation Models).” IMAGE: 
The Interdisciplinary Journal of Image Sciences 37 (1): 121–134. 

Offert, Fabian, and Ranjodh Singh Dhaliwal. 2024. “The Method of Critical AI 
Studies, A Propaedeutic.” arXiv:2411.18833v1, November 28, 2024. 
https://doi.org/10.48550/arXiv.2411.18833. 

Pasquinelli, Matteo. 2023. The Eye of the Master: A Social History of Artificial Intel-
ligence. London: Verso. 

Raley, Rita, and Jennifer Rhee (eds.). 2023. Critical AI. Special issue American Litera-
ture 95 (2). 

Roberge, Jonathan, and Michael Castelle (eds.). 2021. The Cultural Life of Machine 
Learning: An Incursion into Critical AI Studies. Cham: Palgrave Macmillan. 

Robison, Greg. 2025. “Tokens Not Noise: How GPT-4o’s Approach Changes Everything 
About AI Art.” Medium, April 1, 2025. 
https://gregrobison.medium.com/tokens-not-noise-how-gpt-4os-approach-changes-everything-about-ai-art-99ab8ef5195d. 

Rodowick, David Norman. 2007. The Virtual Life of Film. Cambridge, MA: Harvard 
University Press. 

Romele, Alberto. 2024. Digital Habitus: A Critique of the Imaginaries of Artificial 
Intelligence. New York: Routledge. 

Sachs-Hombach, Klaus. 2003. Das Bild als kommunikatives Medium: Elemente einer 
allgemeinen Bildwissenschaft. Cologne: Halem. 

Salvaggio, Eryk. 2023. “How to Read an AI Image: Toward a Media Studies Methodol-
ogy for the Analysis of Synthetic Images.” IMAGE: The Interdisciplinary Journal of 
Image Sciences 37 (1): 83–99. 

Salvaggio, Eryk. 2025. “Anatomy of an AI Coup.” Tech Policy.Press, February 9, 2025. 
https://www.techpolicy.press/anatomy-of-an-ai-coup/. 

Schröter, Jens. 2019a. “Media Aesthetics, Simulation, and the New Media.” MediArXiv 
Preprints, March 29, 2019. https://osf.io/preprints/mediarxiv/bs2zu. 

Schröter, Jens. 2019b. “Media and Abstraction.” Medienkomparatistik: Beiträge zur 
Vergleichenden Medienwissenschaft 1 (1): 21–35. 

Schuhmann, Christoph, Romain Beaumont, Richard Vencu, et al. 2022. “LAION-5B: 
An Open Large-Scale Dataset for Training Next Generation Image-Text Models.” 
arXiv:2210.08402, October 16, 2022. https://doi.org/10.48550/arXiv.2210.08402. 

Schwartz, Dona. 1992. To Tell the Truth: Codes of Objectivity in Photojournalism. 
Minneapolis: Gordon and Breach. 

Seel, Martin. 2005. Aesthetics of Appearing. Stanford: Stanford University Press. 

Shklovsky, Viktor. 2012. “Art as Technique.” In Russian Formalist Criticism: Four 
Essays, edited by Lee T. Lemon and Marion J. Reis, 21–34. 2nd ed. Lincoln, NE: 
University of Nebraska Press. 

Škripcová, Lucia Novanská. 2024. “Participative Culture in AI Models: Case Study of 
Stable Diffusion.” In Marketing Identity: Human vs. Artificial: Conference Proceed-
ings from the International Scientific Conference 12th November 2024, edited by 
Monika Prostináková Hossová, Martin Solík, and Matej Martovič, 522–528. Trnava: 
University of Ss. Cyril and Methodius in Trnava. 

Somaini, Antonio. 2023. “Algorithmic Images: Artificial Intelligence and Visual Cul-
ture.” Grey Room 93: 75–115. 

Song, Sophia, Joy Song, Junha Lee, Younah Kang, and Hoyeon Moon. 2024. “Exploring 
the Potential of Novel Image-to-Text Generators as Prompt Engineers for CivitAI 
Models.” In Proceedings of the 16th IIAI International Congress on Advanced Ap-
plied Informatics (IIAI-AAI), 626–631. 

Spöhrer, Markus, and Beate Ochsner (eds.). 2017. Applying the Actor-Network Theory 
in Media Studies. Hershey: IGI Global. 

Thielmann, Tristan, and Erhard Schüttpelz (eds.). 2013. Akteur-Medien-Theorie. Biele-
feld: transcript. 

Thomson-Jones, Katherine. 2005. “Inseparable Insight: Reconciling Cognitivism 
and Formalism in Aesthetics.” The Journal of Aesthetics and Art Criticism 63 (4): 
375–384. 

Thon, Jan-Noël. 2025. “Postdigital Aesthetics in Recent Indie Games.” In Videogames 
and Metareference: Mapping the Margins of an Interdisciplinary Field, edited by 
Theresa Krampe and Jan-Noël Thon, 221–283. New York: Routledge. 

Walton, Kendall L. 1984. “Transparent Pictures: On the Nature of Photographic Real-
ism.” Critical Inquiry 11: 246–277. 

Wasielewski, Amanda. 2024. “Unnatural Images: On AI-Generated Photographs.” 
Critical Inquiry 51 (1): 1–29. 

Watkins, Gareth. 2025. “AI: The New Aesthetics of Fascism.” New Socialist, February 9, 
2025. https://newsocialist.org.uk/transmissions/ai-the-new-aesthetics-of-fascism/. 

Wilde, Lukas R.A. 2021. “Klaus Sachs-Hombach.” In The Palgrave Handbook of Im-
age Studies, edited by Krešimir Purgar, 873–888. Cham: Palgrave Macmillan. 

Wilde, Lukas R.A. 2023. “Generative Imagery as Media Form and Research Field: 
Introduction to a New Paradigm.” IMAGE: The Interdisciplinary Journal of Image 
Sciences 37 (1): 6–33. 

Wilde, Lukas R.A. 2025. “KI-Bilder und die Widerständigkeit der Medienkonvergenz: 
Von primärer zu sekundärer Intermedialität.” In Bilder im Aufbruch: Herausforde-
rungen der Bildwissenschaft, edited by Marcel Lemmes, Stephan Packard, and Klaus 
Sachs-Hombach, 475–507. Cologne: Halem. 

Wirth, Sabine. 2016. “Between Interactivity, Control, and ‘Everydayness’: Towards 
a Theory of User Interfaces.” In Interface Critique, edited by Florian Hadler and 
Joachim Haupt, 17–35. Berlin: Kadmos. 

Wirth, Sabine. 2023. “Interfaces of AI: Two Examples from Popular Media Culture 
and Their Analytical Value for Studying AI in the Sciences.” In Beyond Quantity: 
Research with Subsymbolic AI, edited by Andreas Sudmann, Anna Echterhölter, 
Markus Ramsauer, Fabian Retkowski, Jens Schröter, and Alexander Waibel, 217– 
233. Bielefeld: transcript. 

Wollheim, Richard. 2001. “On Formalism and Pictorial Organization.” The Journal of 
Aesthetics and Art Criticism 59 (2): 127–137. 

Xu, Ziwei, Sanjay Jain, and Mohan Kankanhalli. 2024. “Hallucination Is Inevitable: An 
Innate Limitation of Large Language Models.” arXiv:2401.11817, January 22, 2024. 
https://doi.org/10.48550/arXiv.2401.11817. 

Zangwill, Nick. 2001. The Metaphysics of Beauty. Ithaca, NY: Cornell University Press. 

Zylinska, Joanna. 2020. AI Art: Machine Visions and Warped Dreams. London: Open 
Humanities Press. 

DOI: 10.4324/9781003676423-2 

AI Horseplay 
Postdigital Aesthetics in AI-Generated Images 

Jan-Noël Thon 

Introduction 

Despite their comparatively recent emergence,1 diffusion-based AI image 
generators such as DALL·E, Midjourney, or Stable Diffusion have already 
substantially reconfigured our contemporary media culture, not least prompting 
a flurry of more or less hurried attempts to come to theoretical terms with 
what has variously been described as “AI-imagenesis” (Ervik 2023, 45), 
“autolography” (Chesher and Albarrán-Torres 2023, 58), “operative ekphrasis” 
(Bajohr 2024, 77), “predictive media” (Manovich 2023, 36), “synthetic im-
ages” (Salvaggio 2023, 83), or (most commonly) “AI imagery,” “generative 
imagery,” and “AI-generated images.”2 Resisting the rhetorics of novelty that 
can prominently be observed in the popular as well as academic discourses 
surrounding generative AI, this chapter aims to explore some of the ways in 
which AI-generated images may manifest what could be described as post-
digital aesthetics—while also emphasizing that such a postdigital aesthetics 
is not exclusive to AI-generated images, but can similarly be attributed to 
a range of other media forms.3 To this end, the chapter begins with a brief 
explication of the terms “postdigital,” “aesthetics,” and “postdigital aesthet-
ics,” distinguishing four salient domains of the latter that can be specified 
as the aesthetic intensification of the digital, the aesthetic transfer from the 
digital to the nondigital, the aesthetic intensification of the nondigital, and 
the aesthetic transfer from the nondigital to the digital. This is followed by an 
equally brief discussion of postdigital aesthetics in terms of remediation and 
of the affordances of diffusion-based AI image generators such as DALL·E, 
Midjourney, or Stable Diffusion, all of which can be prompted to create AI-
generated images with both a more or less specific representational content 
and a more or less specific aesthetic form. Against this background, the chap-
ter analyzes the aesthetic transfer from the nondigital to the digital and the 
aesthetic intensification of the digital (as the two domains of postdigital aes-
thetics that are particularly relevant here) in a small corpus of AI-generated 
images of galloping horses that were created using ChatGPT 4o in August 
2024, and which, despite the necessarily heuristic and qualitative nature of 
this approach, arguably allow us to at least “catch a glimpse” of the postdigital 
aesthetics that DALL·E affords its users more or less “by default.” 

This chapter has been made available under a CC-BY-NC-ND 4.0 license.

Conceptualizing Postdigital Aesthetics 

Let us begin, then, with a brief explication of the terms “postdigital,” 
“aesthetics,” and “postdigital aesthetics.” The term “postdigital” was coined 
a quarter of a century ago by Cascone (2000), on the one hand, and Pepperell 
and Punt (2000), on the other, with the former having turned out rather more 
influential than the latter. Cascone takes Negroponte’s (1998) observation that 
the so-called digital revolution is over as the starting point for the diagnosis 
of a specific “‘post-digital’ aesthetic” that manifests itself as an “aesthetics 
of failure” (Cascone 2000, 12) in electronic music. According to Cascone, 
this “aesthetics of failure” can be understood as “a result of the immersive 
experience of working in environments suffused with digital technology” 
(2000, 12) in that it incorporates “glitches, bugs, application errors, system 
crashes, clipping, aliasing, distortion, quantization noise, and even the noise 
floor of computer sound cards” (2000, 13). The notion of the postdigital and 
of a specifically postdigital aesthetics then initially circulated primarily in the 
discourse fields of electronic music and media art, but has received increasing 
academic attention since the 2010s and is now employed not only in artistic 
and practice-oriented contexts (see, e.g., Bishop et al. 2016; Paul 2016) but 
also in disciplines and research fields as diverse as sound studies (see, e.g., 
Ford 2023; Kouvaras 2016), literary studies (see, e.g., Abblitt 2018; Hamel 
and Stubenrauch 2023), theater studies (see, e.g., Causey 2016; Papagian-
nouli 2022), media studies (see, e.g., Diecke et al. 2022; Murray 2020), and 
education research (see, e.g., Hayes 2021; Mathier 2023) as an alternative to 
talking about “digit(al)ization” (see, e.g., Balbi and Magaudda 2018). Based 
on the diagnosis of the increasing ubiquity of digital technology in everyday 
life that was already present(ed) in Cascone’s remark that “[t]he tendrils of 
digital technology have in some way touched everyone” (2000, 12) as well as 
in Pepperell and Punt’s argument that “the intellectual restrictions of the digi-
tal paradigm are now becoming unavoidable” (2000, 2), much of the existing 
research on the postdigital stresses that “the historical distinction between 
the digital and the nondigital becomes increasingly blurred” (Berry 2014, 22; 
Berry and Dieter 2015b, 2; see also, e.g., Arndt et al. 2019; Contreras-Koter-
bay and Mirocha 2016; Jandrić et al. 2018; Jordan 2020). 

The distinction between “the digital” and “the nondigital” that is invoked 
here evidently does not coincide with the more precise distinction in media the-
ory and philosophy between “digital-in-the-sense-of-discrete” and “analog-in-
the-sense-of-continuous” (see, e.g., Fazi 2019; Schröter 2004; but also Frigerio 
et al. 2013; Maley 2023), instead referring (less precisely, but more compatibly 
with everyday usage) to the presence or absence of “computer technology,” 
broadly conceived (see Cramer 2015; as well as, e.g., Cubitt 2006; Maley 
2011). Moreover, the prefix “post” in the term “postdigital” by no means de-
notes the end of the digital or the disappearance of digital technology—rather, 
it stresses the increased significance and fine-grained everyday integration of 
digital technology after the so-called digital revolution, which has led to a de-
creased saliency of the distinction between digital and nondigital technologies, 
practices, and artifacts in everyday life. The term “postdigital” can therefore 
be compared to terms such as “poststructuralism,” “postmodernism,” “postco-
lonialism,” or “postpunk” as well as “post-photography” (see, e.g., Mitchell 
1992), “post-cinema” (see, e.g., Denson and Leyda 2016), “postmedia” (see, 
e.g., Apprich et al. 2013), or “postinternet” (see, e.g., Rothwell 2024), all of 
which broadly refer to the transformation of what has existed up to a point, 
while critically acknowledging that what has existed up to that point still re-
mains impactful. That said, although the blurring of the boundary between 
digital and nondigital technologies, practices, and artifacts is a common thread 
throughout existing conceptualizations of the postdigital, these conceptual-
izations still differ substantially across disciplinary contexts as well as from 
scholar to scholar, with various contributions positioning the postdigital as an 
“umbrella term” or otherwise multilayered concept, and at least some theorists 
also more or less systematically distinguishing between or at least hinting at 
the existence of distinct dimensions, aspects, or domains of the postdigital (see, 
e.g., Jordan 2020; Taffel 2016; as well as the notable differences between how 
the postdigital is conceptualized in Cascone 2000 and in Cascone and Jandrić 
2021, or in Cramer 2015 and in Cramer and Jandrić 2021). For our present 
purpose, however, it mainly seems important not only to note that the ubiquity 
of digital technology has shifted, blurred, or dissolved the border between the 
digital and the nondigital (as well as between “being online” and “being off-
line” [see, e.g., Berry 2014]) but also to ask which new(ish) practices, arti-
facts, and experiences such a shift, blurring, or dissolution of these established 
borders has led to as part of the “messy state of media, arts and design after 
their digitization” (Cramer 2015, 19; original emphasis). Indeed, one (though 
certainly not the only) central strand of discussion within research on the post-
digital has been the reconfigured relation between “old” nondigital media and 
“new” digital media that includes a particular interest in “hybrids of ‘old’ and 
‘new’ media” (Cramer 2015, 20) as well as in how “‘old’ media [are] used 
like ‘new media’” (Cramer 2015, 21; see also, e.g., Hansen 2004; Manovich 
2001 on the concept of “new media”). The postdigital can then be understood 
as “a ‘coming together,’ a hybridisation of both the digital and the non-digital 
domains” that includes “the movement of the non-digital to the digital and the 
digital to the non-digital,” “operat[ing] from two states or positions: within or 
across the digital/non-digital nexus” (Jordan 2020, 63). 

However, despite most discussions of the postdigital drawing on Cas-
cone’s foundational reflections on a “‘post-digital’ aesthetic” (2000, 12) in 
electronic music at least to some extent, there is comparatively little explicit 
discussion of aesthetic questions in the existing research. Hence, let us unpack 
in slightly more detail the conceptualization of “aesthetics” that underlies the 
approach to postdigital aesthetics presented here. First, it should be noted that 
this approach is not primarily concerned with aesthetic judgments (or with the 
concept of art4), nor with “evaluatively laden aesthetic properties” (Levinson 
2001, 76) such as beauty (or ugliness), though the analysis of postdigital aes-
thetics will still need to include (particular) “aesthetically relevant properties” 
(Nanay 2016, 67) that make a difference with regard to aesthetic perception, 
aesthetic experience, or aesthetic appreciation (see also, e.g., Eaton 2001; Ir-
vin 2014; Nanay 2016; Seel 2005). Second, while aesthetic perception would 
have to be at the center of any appropriately “nonnormative” aesthetics, the 
proposed conceptualization of postdigital aesthetics does not conflate aesthet-
ics with aisthesis (or aisthetics). It is, of course, quite common to emphasize 
the connection between aesthetics and perception in philosophical aesthetics 
(see, e.g., Böhme 2001; Nanay 2016; Rancière 2011; Welsch 1987) as well 
as in the broader research on media and postdigital aesthetics (see, e.g., Con-
treras-Koterbay and Mirocha 2016; Cramer 2015; Hausken 2013; Marchiori 
2013), but it seems preferable to maintain a distinction between aesthetic and 
nonaesthetic (or functional, or pragmatic) perception that might, for example, 
be specified via the former’s “self-referentiality” or “sensing self-awareness” 
tying “[t]he special presence of the object of perception […] to a special pres-
ence of the exercise of this perception” (Seel 2005, 31; original emphases). 
Third, even if we can understand aesthetics as a perceptual (or, more broadly, 
experiential) category, the following is primarily concerned with the aesthetic 
form of medial artifacts to which a postdigital aesthetics can be attributed, 
which broadly refers to the external Gestalt of such artifacts that is accessible 
to perception as a result of a “particular way of manipulating the materials […] 
of its medium” (Eldridge 1985, 313), and which might in various contexts be 
distinguished from the representational content of those medial artifacts that 
fulfill representational functions.5 Even if “form has never belonged only to 
the discourse of aesthetics” (Levine 2015, 2) and the term therefore (once 
more) has a rather complex conceptual history, most if not “all the historical 
uses of the term” do seem to share a common conceptual core in that “‘form’ 
always indicates an arrangement of elements” that could also be described as 
“an ordering, patterning, or shaping” (Levine 2015, 3; original emphases) 
and that, again, becomes aesthetic if it is (in some way) accessible to percep-
tion. Fourth and finally, since medial artifacts instigating aesthetic perception 
are made (at least partially) by humans (although the part that humans play 
in the creation of AI-generated images may be seen as comparatively lim-
ited, and aesthetic objects that are not artifacts do of course also possess an 
aesthetic form and can instigate aesthetic perception), aesthetic practice(s) as 
the “localized practices of artefactual construction” (Corner 2019, 108) that 
have brought the medial artifacts in question into existence would also need 
to be taken into account. Evidently, the concept of aesthetic practice(s) as a 
whole cannot be reduced to such “localized practices of artefactual construction,” 
instead also including the aforementioned “practices of self-referential 
perception” (Reckwitz 2016, 63) sensu Seel (2005), but the terminological 
emphasis on aesthetic production practices rather than aesthetic reception 
practices is meant to highlight the need to include the former in any compre-
hensive analysis of postdigital aesthetics as well.6 

What about “postdigital aesthetics,” then? Building on the distinctions that 
Cramer (2015), Jordan (2020), and others draw with regard to the postdigital in 
toto, a comprehensively conceptualized postdigital aesthetics can be observed 
in four domains of the postdigital that are at least heuristically distinguishable 
from one another (see also Thon 2025; 2026/forthcoming). First, the term “post-
digital aesthetics” can refer to an aesthetic intensification of the digital that is 
already at the center of Cascone’s influential conceptualization of postdigital 
aesthetics as an “aesthetics of failure” (2000, 12) in electronic music, though 
both “postdigital aesthetics” and “aesthetics of failure” certainly expand well 
beyond primarily auditive media forms and particularly into the realm of the 
visual, where they are often discussed in the context of “[g]litch aesthetics, cor-
ruption artefacts, [and] retro 8-bit graphics” (Paul and Levy 2015, 31; see also, 
e.g., Betancourt 2017; Menkman 2011).7 Second, the term “postdigital aesthet-
ics” can refer to an aesthetic transfer from the digital to the nondigital that is, 
for example, often discussed with reference to James Bridle’s (2011) notion of 
a “new aesthetic,” to the extent that the latter broadly refers to “eruptions of the 
digital into the physical world” (Kwastek 2015, 74; see also, e.g., several other 
contributions in Berry and Dieter 2015a; as well as Contreras-Koterbay and 
Mirocha 2016; Hodgson 2019 for proposals to connect the “new aesthetic” to 
the concept of the postdigital).8 Third, the term “postdigital aesthetics” can refer 
to an aesthetic intensification of the nondigital that would, for example, include 
the (considered) prioritization of nondigital technologies, practices, and arti-
facts in contexts in which digital technologies, practices, and artifacts would be 
more readily available (say, when photographers or filmmakers use nondigital 
cameras and nondigital film material, even though using digital cameras would 
require “less of an effort”). Fourth and finally, the term “postdigital aesthetics” 
can refer to an aesthetic transfer from the nondigital to the digital that entails 
various ways in which digital aesthetic objects, medial artifacts, or, more specif-
ically, medial representations across media forms may evoke, simulate, or oth-
erwise recreate the conventionally recognizable aesthetics of nondigital media 
forms (see also, e.g., Bolter and Grusin 1999 on “remediation”; Rajewsky 2005 
on “intermedial references”; Schröter 2019; 2023 on “transmaterialization”).9 

Conceptualizing the Postdigital Aesthetics of AI-Generated Images 

So, even if the analytical focus of this chapter is on postdigital aesthetics as 
a set of (particular) “aesthetically relevant properties” (Nanay 2016, 65) that 
can be attributed to (elements of) the aesthetic form of various medial artifacts, 
most if not all of which can be further specified as medial representations,10 
(postdigital) aesthetic forms are always connected to the (postdigital) aesthetic 
practices that these medial artifacts or medial representations are based on as 
well as to the (postdigital) aesthetic experiences that they afford their various 
recipients (and which will usually entail, but arguably cannot be reduced to aes-
thetic perception). Against the background of the proposed conceptualization of 
postdigital aesthetics with its heuristic distinction between four salient domains 
of the latter that can be specified as the aesthetic intensification of the digital, the 
aesthetic transfer from the digital to the nondigital, the aesthetic intensification 
of the nondigital, and the aesthetic transfer from the nondigital to the digital, 
however, it is worth stressing in slightly more detail that the approach to the 
analysis of postdigital aesthetics presented here is primarily concerned with a 
specific kind of medial representation, namely those medial representations 
that foreground their own mediality, materiality, and aesthetic form as opposed 
to their representational content. This does not mean that medial representations 
not foregrounding their own mediality and materiality in an immediately notice-
able way have no aesthetic form or cannot instigate aesthetic perception, but 
there still seems to be a connection between the “sensing self-awareness” (Seel 
2005, 31) of aesthetic perception and the self-referentiality of medial represen-
tations that foreground their own mediality, materiality, and aesthetic form. That 
said, the distinction between the aesthetic form of medial representations and 
their representational content as well as the “self-referential” foregrounding of 
the former can be specified further in various ways.11 

As hinted at above, a particularly influential conceptualization of this kind 
of foregrounding has been developed by Bolter and Grusin (1999), who not 
only argue, following McLuhan (1964), that so-called new media remediate 
the “content” and “form” of older media in various ways, but who also pos-
tulate a “double logic of remediation” (Bolter and Grusin 1999, 31), which 
amongst other things allows us to locate concrete medial representations 
between the poles of transparent “immediacy” and opaque “hypermediacy.” 
While the term “immediacy” broadly refers to the deemphasizing of the aes-
thetic form of a medial representation compared to its representational content 
that “either […] erase[s] or […] render[s] automatic the act of representation” 
(Bolter and Grusin 1999, 33) and is often explained using the metaphor of a 
transparent window, the term “hypermediacy” refers to medial representations 
that foreground “acts of representation and mak[e] them visible,” “multipl[y] 
the signs of mediation” (Bolter and Grusin 1999, 34), and thus draw our at-
tention to their mediality, materiality, and aesthetic form. An interplay of 
transparent immediacy and opaque hypermediacy can be observed in very 
different medial representations across conventionally distinct media forms, 
but it would seem that medial representations which emphasize the “logic of 
hypermediacy” more strongly than the “logic of immediacy” are particularly 
interesting for the question of postdigital aesthetics—and, again, perhaps also 
tend to more readily instigate aesthetic as opposed to “merely” nonaesthetic, 
functional, or pragmatic processes of perception in their recipients. 

Returning to the question of postdigital aesthetics, we can further observe 
that medial representations whose aesthetic form emphasizes the logic of 
opaque hypermediacy as opposed to the logic of transparent immediacy and, 
therefore, at least tends to privilege aesthetic as opposed to “merely” nonaes-
thetic, functional, or pragmatic perception can be found in a broad range of 
conventionally distinct media forms, including (digital as well as nondigital) 
literary texts, comics, animation, photography, films, series, and games. While 
other avenues of inquiry are certainly available, then, the remainder of this 
chapter will focus on the particular kind of postdigital aesthetics afforded by 
diffusion-based AI image generators such as DALL·E, Midjourney, or Stable 
Diffusion, all of which can be prompted to create AI-generated images not only 
with a more or less specific representational content that is often described as 
the “subject” of these images but also with a more or less specific aesthetic 
form that is often described in terms of their “style.” Meyer in particular con-
vincingly argues that the resulting “logic of the prompt radically expands and 
de-hierarchizes the notion of sty