Safeguarding the Creativity of Authors

Storytelling Creativity Academic digression Company Values Dec 2, 2024 2:53:11 PM David Palfrey 39 min read

One crucial question for creative people right now is how generative AI is going to impact their creative activity. Creatives have a load of reasonable worries. Will large language models extend or reduce creative opportunity? How will generative AI be related to copyright or intellectual property law? Is there a danger that creatives will lose their livelihood, or keep their jobs but lose creative control? Will AI slush drive out quality content, like bad money driving out good? (A recent study found that non-experts couldn't easily distinguish AI poetry from human poetry, and on average preferred the AI poetry as more straightforward and accessible. Since most people aren't poetry experts, these preferences dominated the aggregate rating behaviour. However, amongst those whose educational exposure marked them as poetry experts, judgments went the other way: experts found the AI poetry to be too smooth and insipid.)

At Mind Mage, we want to see a future where human creativity continues to be rewarded and celebrated, and to do what we can to build towards that future. Our starting point is that creativity and technology have always been entangled with each other, and this will continue to be true. There's no way to stop or reverse the clock. But we should try to be as self-aware and transparent as possible. At every stage we can try, as best we can, to understand the consequences of the tools we are building, and build towards the future we want to see. And as software developers we should ensure that we build in collaboration with creative authors, rather than operating at an artificial distance from them.

There are all kinds of hopes and fears in this area right now, far too many to be addressed here. It's also always genuinely hard to foresee the effects of new media. Some fears that old media will be replaced are overblown, like the fears that cinema would replace stage plays. On the other hand, the talkies did entirely replace silent movies, ending employment for movie house orchestras and some silent stars. It seems unlikely that anyone in the world right now has a very good sense of where all the chips will fall. This blog post will certainly only scratch the surface. But we hope it's interesting as a glimpse into the way we're thinking about things. And if you have thoughts or comments, let us know.

Substitutes and Complements, selection and combination

To get the discussion going, I'm going to put some economic vocabulary and some linguistic vocabulary next to each other. Economists sometimes talk about 'substitute' goods and 'complementary' goods. Margarine and butter are substitutes. Bread and butter are complements. Over the last hundred years economists have suggested various mathematical criteria to tell which is which, and 'elasticity' measures to quantify how strongly two goods substitute for or complement each other. Meanwhile, in a different corner of the intellectual universe, structuralist linguists were busy making a very similar distinction. They thought that the meaning of any 'sign', such as a word or a phoneme, was basically a matter of the relation of that sign to other signs. And there were two basic ways in which signs related to each other: through 'selection' (e.g., the words 'cup' and 'mug', which could be selected instead of each other) and through 'combination' (e.g., the words 'dirty' and 'mug', which could be combined to make a phrase). If you wanted to be fancy, you could refer to this distinction between selection and combination in Greek, as an opposition between 'paradigmatic' and 'syntagmatic' sign-relations. However you referred to it, the distinction was claimed to shed light on all sorts of things. The Russian linguist Roman Jakobson (Two Aspects of Language and Two Types of Aphasic Disturbances) argued it helped classify types of aphasia after brain injury. The English literary critic David Lodge, in The Modes of Modern Writing, labelled different generations of literary fiction by whether their predominant stylistic device was metaphor (selection) or metonymy (combination).
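For readers who like to see the economic half of that comparison in concrete form, here is a minimal sketch of one common elasticity measure, the cross-price elasticity of demand, computed with the midpoint (arc) formula. The function names, the tolerance threshold, and all of the prices and quantities are hypothetical, chosen only for illustration. The sign of the result is what matters: positive suggests substitutes, negative suggests complements.

```python
def cross_price_elasticity(q_a_before, q_a_after, p_b_before, p_b_after):
    """Arc (midpoint) cross-price elasticity of demand for good A
    with respect to the price of good B."""
    # Percentage change in quantity of A, relative to the midpoint quantity.
    pct_quantity = (q_a_after - q_a_before) / ((q_a_after + q_a_before) / 2)
    # Percentage change in price of B, relative to the midpoint price.
    pct_price = (p_b_after - p_b_before) / ((p_b_after + p_b_before) / 2)
    return pct_quantity / pct_price


def classify(elasticity, tolerance=0.05):
    """Label a pair of goods by the sign of their cross-price elasticity.
    The tolerance is an arbitrary illustrative threshold."""
    if elasticity > tolerance:
        return "substitutes"
    if elasticity < -tolerance:
        return "complements"
    return "independent"


# Hypothetical scenario: the price of butter rises from 2.00 to 2.50.
margarine = cross_price_elasticity(100, 120, 2.00, 2.50)  # margarine sales rise
bread = cross_price_elasticity(100, 90, 2.00, 2.50)       # bread sales fall

print("margarine vs butter:", classify(margarine))
print("bread vs butter:", classify(bread))
```

With these made-up numbers, margarine comes out as a substitute for butter and bread as a complement, which matches the intuition in the paragraph above.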

So we have an economic anxiety: are generative AI tools going to work as complements or substitutes for human creativity? And this is simultaneously a question of how bits of language are going to co-exist: will it be combination or selection which predominates between machine and human signs?

At Mind Mage, we know what we want. We want to help ensure that generative AI is a complement, not a substitute, for human creativity. We want to build new possibilities for combination, rather than selection, between human creativity and generative AI. Of course, we know we can't ensure this happens all by ourselves. (A kind of holism was one theoretical message of both economics after Léon Walras and linguistics after Ferdinand de Saussure: how any two economic goods relate to one another depends on a bunch of other goods, and how any two linguistic signs relate to one another depends on a bunch of other signs.) But we can concentrate our own efforts on building tools which augment human creativity. And we can keep looking around to understand the changing context, and what opportunities and challenges there are.

Domains of creativity: modalities and human ecologies

[Image: DALL·E-generated pen and ink cartoon with watercolor coloring, depicting a human pyramid of diverse creatives]

Creatives come in many shapes and sizes: poets, musicians, actors, film directors, novelists, painters, orchestral conductors, architects, voice actors, sculptors, and so on. At the most abstract level, these groups share some minimal common concerns. However, each group faces a different surface of challenge and opportunity when it comes to generative AI. What are the relevant dimensions of difference here, and what does our view look like at Mind Mage?

One relevant aspect of a creative domain with technical consequences is its modality. Different modalities require different low-level generative modelling techniques. Language models need to be sequence models, like transformer models, working with probability distributions over tokens in order to generate one token after another; image models, such as diffusion models, need to generate all the pixels in an image at once. These low-level differences are one reason why high-level affordances built on those techniques can differ across modalities.
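To make the "one token after another" idea concrete, here is a toy sketch of autoregressive sampling. It is not a transformer and not anything we run at Mind Mage: the lookup table, its tokens, and its probabilities are all invented for illustration, and unlike a real language model it conditions only on the most recent token rather than the whole preceding sequence. What it does share with real sequence models is the loop: look up a probability distribution over next tokens, sample one, append it, repeat.

```python
import random

# Hypothetical toy "model": for each token, a probability distribution
# over the next token. A real language model would compute this
# distribution from the entire sequence so far, not just the last token.
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"dragon": 0.5, "mage": 0.5},
    "a": {"dragon": 0.3, "mage": 0.7},
    "dragon": {"sleeps": 0.6, "<end>": 0.4},
    "mage": {"sleeps": 0.2, "speaks": 0.8},
    "sleeps": {"<end>": 1.0},
    "speaks": {"<end>": 1.0},
}


def generate(rng, max_tokens=10):
    """Sample one token at a time until <end> or the length limit is reached."""
    tokens = ["<start>"]
    while tokens[-1] != "<end>" and len(tokens) < max_tokens:
        dist = NEXT_TOKEN_PROBS[tokens[-1]]
        # Draw the next token according to its probability.
        next_token = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
        tokens.append(next_token)
    return tokens


sample = generate(random.Random(0))
print(" ".join(sample))
```

An image diffusion model has no analogue of this left-to-right loop: it repeatedly refines every pixel of the image together, which is part of why the two modalities end up offering different affordances.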

At Mind Mage we are deliberately prioritizing the modalities of audio speech and written language. That's where we want to innovate and provide new interactive possibilities. Our aim is to help authors provide new kinds of immersive gaming experience through an audio channel. Our technical expertise and our aesthetic sensibilities are focused on conversational storytelling and what can be done with language. So, frankly, we've been happy to be relative klutzes when it comes to images. Using images is not an area where we want to center our business activity. It's possible this lack of conscious attention has led us into lazy habits. We've sometimes used gen AI images as a provisional placeholder - a kind of visual lorem ipsum - to bootstrap ourselves and move quickly in an area where we don't yet know what we really want. This is likely an area where we could do better. How should we ensure that we safeguard the priorities of creatives in a modality which is only of secondary importance to us?

However, I don't think that differences in modality are the most important differences between domains of human creativity. Even from a narrowly technical point of view, multi-modal models are receiving more and more attention. More fundamentally, the most important differences between domains of human creativity are to do with the human rather than the technological side. What I'll call the human ecology of creativity is the cluster of ways in which creatives interact with each other in different domains. We standardly think of poets, novelists and painters as exercising their creativity alone. Co-written novels are a tiny minority. Novelists may get help from editors, and painters may get help from curators, but editors and curators are imagined as playing a strictly subordinate role. This is enormously different from creation in which collaboration is constitutive. Orchestral performance involves a conductor and multiple musicians. Musical recordings may need multiple musicians and a producer. Plays may involve a director, multiple actors, and some other production roles. Films involve all these roles and much more: cinematographers, editors, etc. In these more complex ecologies of creativity we can't avoid speaking of creative collaboration. Another way to put it is that this is essentially distributed creativity.

RPGs and distributed creativity

At Mind Mage we take particular inspiration from the form of distributed creativity in table-top role-playing games. TTRPGs distribute creativity across three fundamental roles: that of the module author, who defines the shape of a narrative; that of the game master (at Mind Mage we refer to this role as that of storyteller), who conducts the playing out of that narrative; and finally that of players themselves, who exercise their creativity as role-playing characters within that narrative.

It would obviously be great if there were an inexhaustible supply of RPG storytellers available on demand, 24/7! But preparation to carry out that role takes time and effort, and so a common practical problem is that storytellers are in short supply. Storytellers are the vehicles by which authored games are delivered to the players who want to play them, and all too often there's just a shortage of vehicles. There certainly aren't enough storytellers to deliver RPGs to players at the radically increased scale which digital publishing allows. It's like that Simpsons episode where Homer asks, "Is this episode going on the air live?" He's told, "No, Homer. Very few cartoons are broadcast live. It's a terrible strain on the animators' wrists."

This is where we at Mind Mage see generative AI as providing an opportunity. The technology we are building helps a game author to reach players, by using an LLM agent to act as storyteller, or act as a player to make up the numbers. That is clearly using a machine in one place which formerly required the human exercise of creativity. But it's not thereby diminishing regard for either the creativity of module authors, or that of the players themselves. In fact, it's using technology to connect authors to players, and allow their creativities to resonate with each other.

To do this well obviously requires that we pay attention to the creative needs on both sides. On the one hand, authors need to be able to reap the rewards of their creativity - rewards which are both financial and to do with their visible status as creators. Authors also need narrative tooling which equips them with the range of expressivity and control which matters to them. Delivering a narrative through automated storytelling must preserve the narrative features, and even the tonal or stylistic features of the game experience, which the author most cares about. On the other hand, players need to be able to enjoy the game as delivered - through immersion in the game environment, in the opportunities afforded by creatively fashioning their character's responses, and in the cumulative incident of narrative progression. This is still, after all, a situation of distributed creativity. Our role is to act as transmission mechanism between author and player, so that the conversational experience in which the player creatively participates is an experience which genuinely connects them to the author's prior act of creativity. If we get that right, we believe that players will feel the connection.
