2023 | 37
- Article: AI Body Images and the Meta-Human: On the Rise of AI-generated Avatars for Mixed Realities and the Metaverse. Scorzin, Pamela C. (2023), pp. 179-194. In this paper, I discuss the impact of artificial intelligence (AI) on contemporary visual culture, mainly on the human (body) imagery and the forming of AI avatar design for social media and beyond, i.e., for mixed realities and the Metaverse. What kind of representations of humans does Artificial Intelligence generate? I use AI imagery as an umbrella term, including prompt engineering. What do algorithmic images created by contemporary AI image generators like Midjourney, DALL·E 2, or Stable Diffusion, among others, represent? What kind of reality do they depict? And to which ideologies and contemporary body concepts do they refer? Moreover, we can observe a visual paradox herein: The more realistic the AI images created by GANs and Diffusion models within AI image generators now appear, the less clear becomes their reference to reality and any truth content. However, what synthetic images created by intelligent algorithms depict is seen as something other than unreal and fictitious since what becomes visible refers to information minted from the metadata of vast amounts of circulating images (on the internet). Making the invisible visible and distributing it via digital platforms becomes the act of communicating with AI images that ‘inform’ and affect their recipients by creating real resonance. The timeline of this new photo-based imaging technology points more to the future than to the present and past. Thus, AI images as meta-images can represent a different form or level of reality in a simulated photo-realistic style that functions as effective visual rhetoric for globally networked communities of the present.
Moreover, in the age of cooperation and co-creation between man and machine within complex networks, the designing process can now start just with the command line prompt “/imagine” (Midjourney) – transforming the following text/ekphrases into an operative means of design/artistic productions. AI images are thus also operative images turning into a new technology-based visual language emerging from a large technological network. As networked images and meta-images, they can fabricate and fabulate the meta-human.
- Article: AI Generative Art as Algorithmic Remediation. Bolter, Jay David (2023), pp. 195-207. As the essays in this collection demonstrate, AI generative imagery raises compelling theoretical and historical questions for media studies. One fruitful approach is to regard these AI systems as a medium rooted in the principle of remediation, because the AI models depend on vast numbers of samples of other media (painting, drawing, photography, and textual captions) scraped from the web. This algorithmic remediation is related to, but distinct from earlier forms of remix, such as hip-hop. To generate new images from the AI models, the user types in a textual prompt. The resulting text-image pairs constitute a kind of metapicture, as defined by William J.T. Mitchell in Picture Theory (1994).
- Article: AI Image Media through the Lens of Art and Media History. Manovich, Lev (2023), pp. 34-41. I’ve been using computer tools for art and design since 1984 and have already seen a few major visual media revolutions, including the development of desktop media software and photorealistic 3D computer graphics and animation, the subsequent rise of the web, and later social media sites and advances in computational photography. The new AI ‘generative media’ revolution appears to be as significant as any of them. Indeed, it is possible that it is as significant as the invention of photography in the nineteenth century or the adoption of linear perspective in western art in the sixteenth. In what follows, I will discuss four aspects of AI image media that I believe are particularly significant or novel. To better understand these aspects, I situate this media within the context of visual media and human visual arts history, ranging from cave paintings to 3D computer graphics.
- Article: The AI Image, the Dream, and the Statistical Unconscious. Schröter, Jens (2023), pp. 112-120. As has been remarked several times in the recent past, the images generated by AI systems like DALL·E, Stable Diffusion, or Midjourney have a certain surrealist quality. In the present essay I want to analyze the dreamlike quality of (at least some) AI-generated images. This dreaminess is related to Freud’s comparison of the mechanism of condensation in dreams with Galton’s composite photography, which he reflected explicitly with regard to statistics – which are also a basis of today’s AI images. The superimposition of images results at the same time in generalized images of an uncanny sameness and in a certain blurriness. Does the fascination of (at least some) AI-generated images result in their relation to a kind of statistical unconscious?
- Article: AI in Scientific Imaging: Drawing on Astronomy and Nanotechnology to Illustrate Emerging Concerns About Generative Knowledge. Michos, Konstantinos (2023), pp. 165-178. Recent advances in AI technology have enabled an unprecedented level of control over the processing of digital images. This breakthrough has sparked discussions about many potential issues, such as fake news, propaganda, the intellectual property of images, the protection of personal data, and possible threats to human creativity. Susan Sontag (2005 [1977]) recognized the strong causal relationship involved in the creation of photographs, upon which scientific images rely to carry data (cf. CROMEY 2012). First, this essay is going to present a brief overview of the AI image generative techniques and their status within the rest of computational methodologies employed in scientific imaging. Then it will outline their implementation in two specific examples: The Black Hole image (cf. EVENT HORIZON TELESCOPE COLLABORATION 2019a-f) and medical imagery (cf., e.g., OREN et al. 2020). Finally, conclusions will be drawn regarding the epistemic validity of AI images. Considering the exponential growth of available experimental data, scientists are expected to resort to AI methods to process it quickly. An overreliance on AI lacking proper ethics will not only result in academic fraud (cf. GU et al. 2022; WANG et al. 2022) but will also expose an uninitiated public to images where a lack of sufficient explanation can shape distorted opinions about science.
- Article: Dumb Meaning: Machine Learning and Artificial Semantics. Bajohr, Hannes (2023), pp. 58-70. The advent of advanced machine learning systems has often been debated in terms of the very ‘big’ concepts: intentionality, consciousness, intelligence. But the technological development of the last few years has shown two things: that a human-equivalent AI is still far away, if it is ever possible; and that the philosophically most interesting changes occur in nuanced rather than overarching concepts. The example this contribution will explore is the concept of a limited type of meaning – I call it dumb meaning. For the longest time, computers were understood as machines computing only syntax, while their semantic abilities were seen as limited by the ‘symbol grounding problem’: Since computers operate with mere symbols without any indexical relation to the world, their understanding would forever be limited to the handling of empty signifiers, while their meaning is ‘parasitically’ dependent on a human interpreter. This was true for classic or symbolic AI. With subsymbolic AI and neural nets, however, an artificial semantics seems possible, even though it still is far away from any comprehensive understanding of meaning. I explore this limited semantics, which has been brought about by the immense increase of correlated data, by looking at two examples: the implicit knowledge of large language models and the indexical meaning of multimodal AI such as DALL·E 2. The semantics of each process may not be meaning proper, but as dumb meaning it is far more than mere syntax.
- Article: Editorial zur IMAGE 37. Plaum, Goda; Grabbe, Lars; Sachs-Hombach, Klaus (2023), pp. 2-5
- Article: Fuzzy Ingenuity: Creative Potentials and Mechanics of Fuzziness in Processes of Image Creation with AI-Based Text-to-Image Generators. Feyersinger, Erwin; Kohmann, Lukas; Pelzer, Michael (2023), pp. 135-149. This explorative paper focuses on fuzziness of meaning and visual representation in connection with text prompts, image results, and the mapping between them by discussing the question: How does the fuzziness inherent in artificial intelligence-based text-to-image generators such as DALL·E 2, Midjourney, or Stable Diffusion influence creative processes of image production – and how can we grasp its mechanics from a theoretical perspective? In addressing these questions, we explore three connected interdisciplinary approaches: (1) Text-to-image generators give new relevance to Hegel’s notion of language as ‘the imagination which creates signs’. They reinforce how language itself inevitably acts as a meaning-transforming system and extend the formative dimension of language with a technology-driven facet. (2) From the perspective of speech act theory, we discuss this explorative interaction with an algorithm as performative utterances. (3) In further examining the pragmatic dimension of this interaction, we discuss the creative potential arising from the visual feedback loops it includes. Following this thought, we show that the fuzzy variety of images which DALL·E 2 presents in response to one and the same text prompt contributes to a highly accelerated form of externalized visual thinking.
- Article: Generative AI and the Collective Imaginary: The Technology-Guided Social Imagination in AI-Imagenesis. Ervik, Andreas (2023), pp. 42-57. This paper explores generative AI images as new media through the central questions: What do AI-generated images show, how does image generation (imagenesis) occur, and how might AI influence notions of the imaginary? The questions are approached with theoretical reflections on other forms of image production. AI images are identified here as radically new, distinct from earlier forms of image production as they do not register light or brushstrokes. The images are, however, formed from the stylistic and media technological remains of other forms of image production, from the training material to the act of prompting – the process depends on a connection between images and words. AI image generators take the form of search engines in which users enter prompts to probe into the latent space with its virtual potential. Agency in AI imagenesis is shared between the program, the platform holder, and the users’ prompting. Generative AI is argued here as creating a uniquely social form of images, as the images are formed from training datasets comprised of human-created and/or tagged images as well as shared on social networks. AI image generation is further conceptualized as giving rise to a near-infinite variability, termed a ‘machinic imaginary’. Rather than comparable to an individualized human imagination, this is a social imaginary characterized by the techniques, styles, and fantasies of earlier forms of media production. AI-generative images add themselves to and become an acquisition of the reservoirs of this already existing collective media imaginary. Since the discourse on AI images is so preoccupied with what the technology might become capable of, the AI imaginary would seem to also be filled with dreams of technological progress.
- Article: Generative AI and the Next Stage of Fan Art. Lamerichs, Nicolle (2023), pp. 150-164. Generative AI is on the rise due to the recent popularity of tools such as DALL·E, Midjourney, and Stable Diffusion. While GAN technology has a longer history, the subsequent Diffusion models are now widely embraced to generate new images in diverse styles. The rise of generative images has resulted in new forms of art and content that already made an impact on different industries. In fan culture, for instance, the use of generative AI has been exploding to create new images, fan art, and memes. In this essay, I specifically address the rise of generative AI from a fan studies and media studies perspective and consider the reception of AI within fandom. Fan cultures are increasingly data-driven participatory cultures, dependent on new media platforms and software. Generative art offers many possibilities to create transformative works based on our favorite characters and stories. In communities such as on Reddit, users share their generative art as well as tips and tricks to use these tools in optimal ways. However, generative fan art has also led to discussion in fandom, especially in terms of ethics, copyright, and monetization. Fans are, for instance, concerned about their art being used as training data without their permission. In this essay, I analyze how artists and other stakeholders discuss and regulate generative AI within their communities, for instance through bans of AI-generated art at fan conventions. While AI allows for many playful interactions and inspiring outcomes, users are especially critical of generative images being turned into a business model. While AI can empower and inspire artistic practice, there are clear concerns around these tools and their potential misuse. Fandom served as a case to better understand how users grapple with the innovative potential and challenges of generative AI.
- Article: Generative Imagery as Media Form and Research Field: Introduction to a New Paradigm. Wilde, Lukas R.A. (2023), pp. 6-33. This introduction to the collection “Generative Imagery: Towards a ‘New Paradigm’ of Machine Learning-Based Image Production” discusses whether – or in what respect – generative imagery represents a new paradigm for image production; and if that constitutes even a novel media form and an emerging research field. Specifically, it asks what a humanities approach to machine learning-based image generation could look like and which questions disciplines like media studies will be tasked to ask in the future. The essay first focuses on continuities and connections rather than on alleged radical shifts in media history. It then highlights some salient differences of generative imagery – not only in contrast to photography or painting but specifically to earlier forms of computer-generated imagery. Postulating a ‘new paradigm’ will thus be based 1) on generative imagery’s emergent or stochastic features, 2) on two interrelated, but often competing entanglements of immediacy-oriented and hypermediacy-oriented forms of realisms, and 3) on a new text-image-relation built on the approximation of ‘natural’, meaning here human rather than machine code-based language. The survey closes with some reflections about the conditions under which to address this imagery as a distinct media (form), instead of ‘merely’ as a new technology. The proposal it makes is to address generative imagery as a form of mediation within evolving dispositifs, assemblages, or socio-technological configurations of image generation that reconfigure the distribution of agency and subject positions within contemporary media cultures – especially between human and non-human (technological as well as institutional) actors.
Of special importance to identify any (cultural) distinctness of generative imagery will thus be a praxeological perspective on the establishment, attribution, and negotiation of cultural ‘protocols’ (conventionalized practices and typical use cases), within already existing media forms as well as across and beyond them.
- Article: How to Read an AI Image: Toward a Media Studies Methodology for the Analysis of Synthetic Images. Salvaggio, Eryk (2023), pp. 83-99. Image-generating approaches in machine learning, such as GANs and Diffusion, are actually not generative but predictive. AI images are data patterns inscribed into pictures, and they reveal aspects of these image-text datasets and the human decisions behind them. Examining AI-generated images as ‘infographics’ informs a methodology, as described in this paper, for the analysis of these images within a media studies framework of discourse analysis. This paper proposes a methodological framework for analyzing the content of these images, applying tools from media theory to machine learning. Using two case studies, the paper applies an analytical methodology to determine how information patterns manifest through visual representations. This methodology consists of generating a series of images of interest, following Roland Barthes’ advice that “what is noted is by definition notable” (BARTHES 1977: 89). It then examines this sample of images as a non-linear sequence. The paper offers examples of certain patterns, gaps, absences, strengths, and weaknesses and what they might suggest about the underlying dataset. The methodology considers two frames of intervention for explaining these gaps and distortions: Either the model imposes a restriction (content policies), or else the training data has included or excluded certain images, through conscious or unconscious bias. The hypothesis is then extended to a more randomized sample of images. The method is illustrated by two examples. First, it is applied to images of faces produced by the StyleGAN2 model. Second, it is applied to images of humans kissing created with DALL·E 2. This allows us to compare GAN and Diffusion models, and to test whether the method might be generalizable.
The paper draws some conclusions to the hypotheses generated by the method and presents a final comparison to an actual training dataset for StyleGAN2, finding that the hypotheses were accurate.
- Article: “Midjourney Can’t Count”: Questions of Representation and Meaning for Text-to-Image Generators. Wasielewski, Amanda (2023), pp. 71-82. Text-to-image generation tools, such as DALL·E, Midjourney, and Stable Diffusion, were released to the public in 2022. In their wake, communities of artists and amateurs sprang up to share prompts and images created with the help of these tools. This essay investigates two of the common quirks or issues that arise for users of these image generation platforms: the problem of representing human hands and the attendant issue of generating the desired number of any object or appendage. First, I address the issue that image generators have with generating normative human hands and how DALL·E has tried to correct this issue by only providing generations of normative human hands, even when a prompt asks for a different configuration. Secondly, I address how this hand problem is part of a larger issue in these systems where they are unable to count or reproduce the desired number of objects in a particular image, even when explicitly prompted to do so. This essay ultimately argues that these common issues indicate a deeper conundrum for large AI models: the problem of representation and the creation of meaning.
- Article: The New Value of the Archive: AI Image Generation and the Visual Economy of ‘Style’. Meyer, Roland (2023), pp. 100-111. Text-to-image generators such as DALL·E 2, Midjourney, or Stable Diffusion promise to produce any image on command, thus transforming mere ekphrasis into an operational means of production. Yet, despite their seeming magical control over the results of image generation, prompts should not be understood as instructions to be carried out, but rather as generative search commands that direct AI models to specific regions within the stochastic spaces of possible images. In order to analyze this relationship between the prompt and the image, a productive comparison can be made with stock photography. Both stock photography databases and text-image generators rely on text descriptions of visual content, but while stock photography searches can only find what has already been produced and described, prompts are used to find what exists only as a latent possibility. This fundamentally changes the way value is ascribed to individual images. AI image generation fosters the emergence of a new networked model of visual economy, one that does not rely on closed, indexed image archives as monetizable assets, but rather conceives of the entire web as a freely available resource that can be mined at scale. Whereas in the older model each image has a precisely determinable value, what DALL·E, Midjourney, and Stable Diffusion monetize is not the individual image itself, but the patterns that emerge from the aggregation and analysis of large ensembles of images.
And maybe the most central category for accessing these models, the essay argues, has become a transformed, de-hierarchized, and inclusive notion of ‘style’: for these models, everything, individual artistic modes of expression, the visual stereotypes of commercial genres, as well as the specific look of older technical media like film or photography, becomes a recognizable and marketable ‘style’, a repeatable visual pattern extracted from the digitally mobilized images of the past.
- Article: On the Concept of History (in Foundation Models). Offert, Fabian (2023), pp. 121-134. What is the concept of history inherent in contemporary models of visual culture like CLIP and DALL·E 2? This essay argues that, counter to the corporate interests behind such models, any understanding of history facilitated by them must be heavily politicized. This, the essay contends, is a result of a significant technical dependency on traditional forms of (re-)mediation. Polemically, for CLIP and CLIP-dependent generative models, the recent past is literally black and white, and the distant past is actually made of marble. Moreover, proprietary models like DALL·E 2 are intentionally cut off from the historical record in multiple ways as they are supposed to remain politically neutral and culturally agnostic. One of the many consequences is a (visual) world in which, for instance, fascism can never return because it is, paradoxically at the same time, censored (we cannot talk about it), remediated (it is safely confined to a black-and-white media prison), and erased (from the historical record).