Action-centered explanatory media

Picture a board game night in the post-COVID world. You and a few friends gather around a table to try a new game.

You pick up the rulebook and start reading it aloud to everyone, but after a few minutes, everyone becomes restless. So you think, why not just start and read as you go? The first round takes a long time because you keep going back to the rulebook every ten seconds; you struggle to remember what you "should do"; and it's hard to emotionally engage in the game when you know you'll have to consult the rulebook in a moment.

Maybe we should just watch a movie instead?

I've written a lot about a problem with reading: we often quickly forget what we've read, except for the main points. Here's another problem with books, especially those aimed at helping you learn a skill: they create a distance between words and action. Books rarely involve doing what they're supposed to help you do. Most books, even those aimed at skill-building, only talk about their subject. It's like reading a book about board games instead of playing them.

If you ask people about the periods in their lives when they grew the most, you'll notice that the most favorable environments often involve action. A summer spent preparing for an upcoming competition with your team; a failed startup that taught you valuable lessons; taking on the challenge of writing a new song every day for a month; a weeks-long meditation retreat; an immersive apprenticeship; and so on. In these stories, books are sometimes a source of knowledge, but they are often secondary to great mentors, teammates, context, and motivation. The key is to take action.

Return to our board game problem, but now consider playing the game for the first time with people who have played before. This is a completely different experience from the "cold start" described above. In this case, experienced players might give you a brief introduction, without overwhelming you with things to memorize, and then you can start playing. They'll set up the board themselves or tell others to shuffle and deal cards. They might say, "Let me demonstrate first. I'll start by drawing two cards, then I can choose to move here or play this card. I'll move here, which will block John from moving into this open area. Now it's your turn. Your goal is to go to XYZ, and you might start moving from this side. Now, if you draw an action card, you can take immediate action if you want; otherwise, read what's on the card..." As the game progresses, they might continue narrating what you need to know, explaining options you might consider, and providing feedback. As long as your experienced friend has enough grace to avoid becoming an overbearing Clippy, it's a more enjoyable and effective way to learn the game.

This may always have been a good way to learn board games, but skill-building books have many practical advantages. Consider information density. If you want to learn to program a quantum computer, you need to absorb a lot of material before you can "act" on your own. Explaining it all might require dozens of hours of an experienced companion's time, which quickly becomes burdensome for most people. Explanatory text may lack personalization and interpersonal connection, but it can be more carefully crafted; it doesn't tire and is always ready; it can embed graphics and abstract symbols; it can be consumed non-linearly; it can be read faster than spoken language; and so on. Perhaps most importantly, it is a mass medium. The world's deepest experts and most astute communicators can write a book on a subject that millions of people can obtain at almost zero marginal cost.

So, how do we create a mass medium that has the advantages of books and is situated in practice? How do we create an explanatory mass medium that feels more like playing a board game with an experienced friend than playing a board game while fiddling with a rulebook?

The role of dynamic mediums
One reason it's hard to create a book that is situated in action is that books are static and fixed. As a reader, you have to carry the words into an environment where you can "act." Even then, there is rarely an opportunity for interaction between action and words. Authors can suggest how to reflect on exercises and generate your own feedback, but these are scripts you have to execute in your own mind. Videos haven't fundamentally changed this situation. But computers, and the dynamic mediums they enable, can exhibit behavior and respond.

We are often urged to use this capability to integrate simulation environments. Perhaps biology textbooks could embed a simulated petri dish in which you can "do" certain kinds of cell biology experiments, reducing the distance between words and action.

But at least in the cases where it's possible, I'm more interested in what happens when the computational environment is a real (not simulated, not "educational") working environment. Nonlinear video editing interfaces aren't "toys" for editing movies; they are how professional filmmakers actually edit movies. Mathematica isn't a "toy" for manipulating symbolic expressions; it's how certain mathematical work is best done. So a dynamic "book" about video editing wouldn't need to include a "toy" environment, a simulated petri dish. Instead, it would situate itself in the same environment used to edit the best movies made on Earth.

But what does it mean for an explanatory medium to be "placed" in such a real environment? How does explanatory content interact with environmental content?

In the past decade, authors and programmers have written dozens of interactive articles that might point toward an answer (see Communicating with Interactive Articles). Personally, I find the work in this field very inspiring. But I don't know of any articles that fully match the aims we're discussing here. These articles might be interactive, involving some actual doing, but those actions take place in a sandbox specially constructed for the purpose, not in an actual environment for deploying the skill being built. They are integrated with simulated petri dishes, not actual lab benches.

For example, an article on a programming topic might be a long text interspersed with a small interactive code editor that you can use to explore a concept. This is certainly an improvement over a typical paper! But relative to our aims, the "doing" here is a secondary activity. These editors don't feel much like actual programming environments; you have to overcome significant barriers to apply anything you do there to an actual program. It's a bit like reading a board game instruction manual with interactive pictures that depict a simplified part of the game. Or, to be less charitable: it's a bit like a carefully crafted pop-up book. You still aren't really in action.

If these interactive articles are typically structured as text that contains interactive elements, integrating a real environment might mean inverting that structure: an interactive environment that contains text. Taking programming as an example: could we move the entire experience into the IDE of your choice, while still somehow presenting explanatory text rather than a text file with embedded source code listings? Could we move the YouTube courses about 3D modeling with Blender into Blender itself?

Video games excel at this. Sometimes tutorials appear in non-interactive cutscenes, which are quite different from regular gameplay, but better examples present guidance and narration as seamless elements within the interactive environment, without taking the camera or control away from the player. The result is a rich sense of immersion in the game environment - a stark contrast to the instruction manual for a board game.

Video games also improve on another problem with interactive articles: the separation of text and dynamic presentation. These articles narrow the distance between text and interactive elements, for example, by linking numbers in the text to parameters that can be directly manipulated in graphics. But in most cases, the two are still physically separate, not visually integrated. The reader's eyes jump back and forth between text and interactive elements, loading working memory with additional objects. This is not just a problem with dynamic elements. The same problem exists with static graphics in traditional text. But the solutions described in Edward Tufte's books are rarely applied to the dynamic realm, perhaps because the authoring tools are more complex and isolated. Almost everyone is still "separated by production mode."

Video games can use audio to layer instructions onto what the player sees, but even with just text, it's possible to place the text alongside the relevant parts of the action. This arrangement lets the game avoid the split between instructions and interaction that we experience when reading text. Because the narrative communication is integrated into a dynamic environment, it can behave and respond like other elements of the environment. In great games, the storytelling feels like an ongoing response to the player's actions. This breaks down the distance from "doing" that we feel when reading text.

Entering the Figma document
(If you're not familiar, Figma is, roughly, a collaborative tool for visually designing software interfaces.)

For the past two years, I've been collecting notes on this topic. At some point, I wanted to build prototypes around these ideas.

Figma changed the way copying and pasting works in software interfaces. Figma made a document explaining this change to users. I know this sounds very mundane. Hear me out. I encourage you to take a look for yourself before continuing: click the "Copy" button to start, then zoom in on the frame in the top left corner. You can explore this document with a free account in your web browser.

The document initially reads like a hypertext slideshow. It intuitively demonstrates the behavior of the new clipboard feature. It's a document about using Figma, created and used within Figma. In the midst of explanatory text, the document hands control over to you.

This is where we break down the wall between authored material and real "doing." You are invited to manipulate the objects you create. These objects are not special; they are the same "kind" of objects you could create on your own elsewhere. You can copy and paste them into a new document. As you manipulate these objects, you're using the same tools used to create the document. More importantly: you're using the same tools you use in your own creative work in this space.

This could have been a blog post with interactive "demonstration" areas interspersed between paragraphs. But instead, you're interacting with this document in the full Figma environment. In the design realm, this is a workbench, not a simulated petri dish. Apart from the text, the "doing" here is indistinguishable from the doing in your own creative work. While reading this document, I found myself curious about how the paste behavior would work if the group structure were different, so I just set it up using the tools I already knew and answered the question for myself. Later that day, I found myself, without even really thinking, using the new paste behavior in my own layout design.

It's important to note that there is a lot of text in this document. That's part of what makes it so interesting. "Worksheets" aren't so unusual - there are many Figma documents that give you an exercise and some background.

In contrast, Figma's copy/paste document is unusual because it contains about a thousand words. There is a lot of explanatory material here. That suggests the medium has room for expansive, in-depth documents. Additionally, in the copy/paste document, the interplay between authored material and action is much more fine-grained. This creates shorter, more precise feedback loops, closer to the experience of a video game tutorial or playing a board game with an experienced friend.

This is a document about using Figma. As a well-crafted instruction manual, it's useful. But we can imagine a more significant change: a "textbook" about interface design written in Figma. In this introductory book, you're not just reading about how to design. You're actually doing design, and doing it in the same environment used by professionals. Because there is no artificial boundary between explanation and action, I believe such a book can come closer to the feeling people have when playing a well-designed video game tutorial.

Interpretation of the Figma document concept
Of course, the "meta-document" medium of Figma itself can go further.

The explanatory text in the copy/paste document doesn't behave or respond. It's not a truly dynamic medium for the user. The fine-grained communication between explanation and action makes it easier to help readers generate their own feedback, but this medium can go further by responding to the actions readers take. Earth, a Primer demonstrates a simple way to "check" when you complete a suggestion.

We can also imagine computational elements specific to certain topics. A chapter on color theory could use a representation with linked colors to visualize complementary colors and assistive colors in response to your choice of a "main" color. A chapter on grid systems could help you intuitively understand how different choices of font hierarchy proportions affect the baseline rhythm in your design. A section on accessibility could embed a contrast table into your design canvas. And so on.
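To make the color and accessibility examples a little more concrete, here is a minimal sketch of the arithmetic such embedded elements might run. It is an illustration under stated assumptions, not anything Figma provides: the function names are hypothetical, colors are assumed to be 8-bit sRGB triples, the "complement" is assumed to be a 180-degree hue rotation, and contrast follows the WCAG 2.x relative-luminance definition with its 4.5:1 AA threshold for normal-size text.

```python
import colorsys


def complement(rgb):
    """Complementary color, assumed here to be a 180-degree hue rotation."""
    r, g, b = (c / 255 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    rotated = colorsys.hls_to_rgb((h + 0.5) % 1.0, l, s)
    return tuple(round(c * 255) for c in rotated)


def _linear(channel):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """WCAG relative luminance of an sRGB color, from 0.0 to 1.0."""
    r, g, b = (_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio between two colors, from 1.0 up to 21.0."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background):
    """True if the pair meets the WCAG AA threshold for normal-size text."""
    return contrast_ratio(foreground, background) >= 4.5
```

A responsive chapter might call `complement` whenever the reader picks a "main" color, and run `passes_aa` over each text and background pair on the reader's own canvas, so that the feedback appears next to the work rather than in a separate checker.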

Another interesting direction, also well-suited to Figma's multiplayer nature, is to integrate collaborative learning opportunities into the text. A standard move in collaborative learning is to pose a problem that can be solved in several different ways, then show several contrasting student solutions so that students can learn from each other's ideas. A Figma design "textbook" could include both individual and "shared" boards, along with potential asynchronous coordination features to facilitate idea exchange among students. This approach could even include a facilitator and incorporate tools like Desmos to support their work separately.

More broadly, the interactive explanatory medium I'm proposing is not limited to "educational" scenarios. It could also be quite useful in the course of working on meaningful projects. For example, if we're designing for a new operating system and I've invented some new controls, I can present them to the team not just in a static Figma document but in one that invites "doing," helping teammates understand how to use the controls in their own designs. Such documents could also serve as timely professional references: if I'm designing for an extended color gamut display for the first time, something like this would come in handy. The "entry points" in these documents could allow you to "bring" your own design project into the explanatory document, so you can understand concepts in the context of actual work.

Expanding to other environments
Figma is certainly not the only environment that can support this kind of format. Where else can we instantiate something similar? What qualities must a system have to realize such a document? I can at least summarize a few aspects.

The document must provide authors with some way to communicate explanatory content, which may be lengthy. Figma has text elements; programming environments have comments. In other environments, such as audio production tools, we may have to add this capability to the document model. Alternatively, explanatory content can be delivered through audio (or video) channels, but the environment must then ensure that readers can interact with objects in the document without interrupting the playback of the author's material.

Authors must be able to establish relationships between paragraphs of their content and corresponding elements in the reader's interactive environment. In Figma, we can place text directly on top of each element you manipulate. In Finale (a music composition environment), text can be interleaved between scores or positioned above specific phrases. In a Roam-based medium, the hierarchical structure of outlines can be used to connect author text and reader text.

Authors need linear and structured ways of explaining; readers need visible means of navigation. In Figma, boards are arranged in a sequence that can be easily navigated through scrolling or keyboard shortcuts. Hyperlinks and "table of contents" boards support navigation. The hierarchical list of layers provides a persistent directory. In programming environments, a sequence of tabs might approximate Figma's boards. Alternatively, we can improve these capabilities with special features, as in Natto's tutorials.

In what other environments can we easily imagine creating such documents?

A HyperCard document about writing a well-crafted choose-your-own-adventure game.
A game development tutorial presented in the form of Unreal Engine documentation.
A harmonic analysis seminar presented in the form of an extremely complex Finale score.
An adaptation of "How to Take Smart Notes" as a Roam graph.

Here, I've limited myself to documents that can be created within existing environments. But instead of limiting ourselves to existing affordances, perhaps one day we will design systems like Figma that better support this kind of document.