The Art of Interactive Design
My rather fragmented reading notes for Chris Crawford’s book The Art of Interactive Design.
- Start with the Verbs
- “This is the first and foremost rule of good interactive design, and the word rule is truer than guideline in this case.”
- Don’t Start with the Technology
- Be on Guard against the Egotistical Tendency to Speak
- “Most designers are egotists who would rather inundate the user with their brilliant expressions than actually let the user do something.”
- Keep It Fast
- “All software requires some combination of three fundamental resources: memory, execution time, and the sweat of the programmers. You can always reduce one of these three by increasing the allocation of the other two.”
- Organize Verbs into Manageable Groups
- Prioritize Verbs by Frequency of Use
- Be Square
- Spatially (visual arrangements) and temporally (e.g., the division of a game into levels)
- Design the Loop as a Whole
- Don’t Set Up False Expectations
- Say What You Mean
- Speak with One Voice in Five Tones
- Primary Data Windows
- Progress Reports
- “I Screwed Up”
- “I Can’t Handle That”
- “I Need More Information”
- Don’t Describe the Problem—Offer the Solution
- Offer to substitute a font for one that wasn’t found
Any human face we put on our software will surely be a fake, a mask that could all too easily slip off, revealing the true nature of the software in all its ugliness. We won’t get away with saying, “Pay no attention to the machine behind the curtain!” The customers will see through our pretension, and that can only hurt our relationship with them.
My solution may seem insanely contradictory: pseudo-anthropomorphization. We present our users with “characters” possessing some small degree of humanity, but the characters are represented by imagery that clearly communicates the limited nature of that personality.
[Apple’s video assistant help] approach backfires badly when the speaking part of the agent is much better developed than the listening and thinking parts.
If you’re going to create an agent, then by God, create an agent—not a multimedia user manual. Endow it with ears and a brain! If you can give it only tricycle ears and a tricycle brain, then don’t give it a Formula One face and turbocharged talking.
If your agent’s listening and thinking are lousy, then use a stick man squeaking in a cartoonish voice. Better yet, emphasize his stupidity by using text instead of voice synthesis, and write the text in third-grade English. Microsoft was on the right track using a paper clip as an agent; the highly cartoonish nature of the image suggests the low level of intelligence in the agent. Unfortunately, the level of intelligence that they gave Mr. Clippie is even lower than what we would expect of a talking paperclip, so the end result is still displeasing to many users.
- Use First and Second Person and Active Voice
- Be Just as Courteous as You Would Be in Public
- Use Normal English, Not Your Own Terminology
- Don’t Feign Infallibility
- “Software is always quick to tell you what you did wrong, but it never seems to admit the possibility that it did anything wrong.”
- ‘The “I Screwed Up” and “I Can’t Handle That” standard messages presented in Chapter 8 are ideal for this task.’ In the real world, a person who always blames everybody else and never accepts responsibility for his own mistakes is quickly ostracized.
- (Apply Impro’s status analysis to software messages? —D.)
The serious point for interactivity designers is that we must never underestimate the intensity of commitment that our users hold for the more established habits of their computing lives. Don’t ever—ever!—mess with the dynamics of the user–mouse interaction. It doesn’t matter if your method is superior; you’re up against instinct, not reason. Don’t push your luck.
One of the deepest and most fundamental polarities in the universe is that of entities versus operations.
- To express this polarity in terms most appropriate to interactivity design, I use the concept of process intensity versus data intensity. Process intensity is the degree to which a program emphasizes processes instead of data. All programs use a mix of process and data. Process is reflected in algorithms, equations, and (to a lesser degree) branches. Data is reflected in data tables, images, sounds, and text. A process-intensive program spends most of its time crunching numbers; a data-intensive program spends most of its time moving bits around.
- The same is true with games: the higher the crunch-per-bit ratio, the more “computery” the game is and the more likely the game will be entertaining.
- The crunch-per-bit criterion also works well in the negative sense as an exposer of bad software ideas.
Trees generate a geometric explosion in the number of nodes required.
The general solution to this problem is a linkmesh, a tree with loopback links and state variables. […] A linkmesh is a tree with two crucial additions: reverse flow and state variables.
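A rough sketch of both points, not taken from the book: `tree_nodes` counts how fast a pure tree grows, and `run` walks a tiny linkmesh in which one link loops back and a state variable remembers what happened. The branching factor, node names, choice names, and the `trust` variable are all assumptions made up for illustration.

```python
def tree_nodes(branching: int, depth: int) -> int:
    """Count the nodes in a full tree: 1 + b + b^2 + ... + b^depth."""
    return sum(branching ** level for level in range(depth + 1))

# Hypothetical linkmesh: two working nodes plus an end node.
# The "answer" link loops back to "greet" instead of spawning a new subtree.
LINKS = {
    "greet": {"ask": "question", "leave": "end"},
    "question": {"answer": "greet", "leave": "end"},
}

def run(choices):
    """Follow the user's choices through the mesh, tracking state as we go."""
    node = "greet"
    state = {"trust": 0}  # state variable: records history the structure no longer encodes
    for choice in choices:
        if node == "end":
            break
        if choice == "answer":
            state["trust"] += 1
        node = LINKS[node][choice]
    return node, state

full_tree = tree_nodes(3, 10)
final_node, final_state = run(["ask", "answer", "ask", "answer", "leave"])
```

With three choices per node and ten levels, the tree needs 88,573 nodes, while the mesh reuses the same two nodes and lets `trust` carry the difference between the first pass and the second.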
Programs should recognize well-defined patterns of input (and how the current particular case is different) and interrupt the user to offer to complete the recognized task. The user should only need to supply the distinguishing data. How intrusive this agent is should be adjustable.
Play is fundamental to interactivity. It is the original educational technology, dating back millions of years. Despite our protestations of deadly seriousness, play pervades much of our culture. The most important rule emerging from an understanding of play is the requirement of safety. Play has two major ingredients: agon (competitiveness) and paidia (frolic).
The first term refers to play as a competitive activity, a deadly serious pursuit within constraining rules; the second emphasizes play as a joyful activity.
Our brains treat interactivity and play as the same thing. “Play is what happens in a serious application.” The user learning and deciding whether to commit to your program is intrinsically playful.
Adult play is surely deeper and more subtle, takes more effort, and covers a wider variety of topics than child play. Consider these examples:
- The owner of a small business putting nice creative touches on her company newsletter
- The corporate drone dedicating long hours to the creation of a multimedia extravaganza for a meeting presentation
- The market researcher fiddling about with the customer database to find odd combinations of customer types
- The news reporter carrying out background research on the web, getting carried away following fascinating but marginally relevant threads
When barristers and judges wear wigs in Britain, they are playing. Costume is a sign of play. This doesn’t mean childishness; rather, they are isolating themselves from the real world and creating another self-contained world of justice. “[A] court of law shares much with a basketball court or a tennis court.”
Play requires safety for the players: physical, social, financial. Privacy provides safety from social dangers when trying a new skill. One of the secrets behind the success of the personal computer is that people can use it without the embarrassment of a teaching session.
The trick, then, to motivating your users in the most productive direction is to engage their playfulness. Your highest priority is to encourage a sense of playful experimentation. […] Your familiarity with the design blinds you to the imagined dangers the users fear[.]
“The Macintosh is just a toy.” […] They readily acknowledge its playful character and hold this playfulness against it.
A playful design philosophy will certainly indulge in occasional cuteness, although cuteness is neither a necessary nor a sufficient condition for playfulness.
Don’t Chastise Your User
The best approach is to render error conditions conceptually impossible rather than merely technically impossible. Here’s a simple example: You decide to permit your user to assign up to eight names to some item in your design. The worst implementation, as I have already noted, is to chastise the user upon input of the ninth name. Slightly better is to dim the control that adds another name when eight have already been entered—but this leaves the user wondering why the control is dimmed. Better still is to put the names into a display box that can hold only eight names, along with a provision to enter a new name by clicking in the empty space. Gee, if there’s no empty space left, there must not be any possibility of entering more names.
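One way to picture the eight-name box as a data model, offered as a sketch rather than anything the book specifies: the box always has exactly eight slots, and the interface draws only the empty ones as clickable. `NameBox`, `empty_slots`, and `enter_name` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional

CAPACITY = 8  # the limit used in the example above

@dataclass
class NameBox:
    """A display box with exactly CAPACITY slots; there is no separate 'add' control."""
    slots: List[Optional[str]] = field(default_factory=lambda: [None] * CAPACITY)

    def empty_slots(self) -> List[int]:
        """Indices the interface would draw as clickable empty space."""
        return [i for i, name in enumerate(self.slots) if name is None]

    def enter_name(self, index: int, name: str) -> None:
        """Called when the user clicks an empty slot and types a name."""
        self.slots[index] = name

box = NameBox()
for n in range(CAPACITY):
    box.enter_name(box.empty_slots()[0], f"name {n + 1}")
```

Once the loop finishes, `empty_slots()` returns an empty list, so there is nothing left to click and no ninth name to reject.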
Everything Must Be Completely Undo-able
All Experiments Must Yield Clear Results
The user needs to know whether the experimental behavior accomplished anything, so you must acknowledge the experiment.
This feature is simple to implement; when you process the user’s input, you attempt to recognize every input as meaningful. If you come across an input that makes no sense, issue a simple response.
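A minimal sketch of that idea, assuming a text-command interface: every input gets a reply, and unrecognized input gets a short acknowledgment instead of silence. The `respond` function, the handler table, and the wording of the fallback are illustrative assumptions.

```python
from typing import Callable, Dict

def respond(text: str, handlers: Dict[str, Callable[[], str]]) -> str:
    """Return a reply for every input, including input that makes no sense."""
    command = text.strip().lower()
    handler = handlers.get(command)
    if handler is None:
        # The experiment did nothing, and the user is told so plainly.
        return f"I don't understand '{command}'."
    return handler()

# Hypothetical command table for the example.
HANDLERS = {"save": lambda: "Saved."}

known = respond("  SAVE ", HANDLERS)
unknown = respond("frobnicate", HANDLERS)
```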
The Dark Side of Play
“Interaction can take place only where there is a perceived discrepancy of volition.” As a child throws a ball, the child figures out, through experimentation, the laws of physics that determine the ball’s behavior. The child stops seeing the ball as an agent with free will and starts seeing it as an inanimate object subject to laws of nature. The ball is abandoned, and the child seeks out new agents to play with.
This explains the old and continuing failure to design successful cooperative games. Play thrives on the noncooperative elements, and withers where there is no discrepancy of volition.
This “blood and iron” philosophy of play may strike some as cynical, but I see no pessimism about human nature in it. The ugliness arises from two derivative phenomena.
“The absence of discrepancy of volition destroys interaction.” does not logically lead to this conclusion: “Greater discrepancy of volition yields greater interaction.”
Greater discrepancy of volition yields more intense interaction, but this isn’t higher quality. Greater amplitude does not make music better.
But my main point here concerns a much more subtle error, one that pervades our civilization. It is the justification of agon (competitive play) through paidia (joyful play).
The history of civilization shows a trend towards the substitution of agonistic play for conflict.
And here is where we encounter the dark side of play. Somehow, agon gets mixed up with paidia. Lawyers think that they are playing. They’re not in it for the richness of the interaction; they’re in it for the joy of victory.
As systems grow bigger and more complex, they evolve more abstract structures to cope with the increasing complexity. Many designers attempt to improve an existing design by adding greater complexity, which yields “humongous heap” design.
A more productive alternative would be to concentrate on the level of abstraction of the design rather than the amount of complexity. Greater complexity emerges automatically from higher abstraction; therefore, abstraction should drive the design process, not complexity.
I would seek some more abstract way of conceiving a document. Perhaps I would think of it as a communication rather than a document. Perhaps instead of visualizing words on a page, I might think in terms of organization of ideas—this would yield a word processor based on an outliner.
Indirection substitutes a convenient indirector for the real thing. That indirector can represent the referent, substitute for it, or point to it.
The crucial component of any indirection scheme is the construct, a scheme for getting to the referent from the indirector. The construct for dollar bills used to be quite simple: I could walk into any bank and demand that they replace my dollars with gold or silver.
A construct “carries” an indirector across some “distance” to a destination, where the referent is somehow reconstructed.
Factor the output into combinable components.
Some Possible Approaches to Language Design
- Inverse parsers
- Trash bin
- Facial feature extraction
The brain is not an open-minded observer of reality: it insists on jamming everything it sees into one of its preconceived patterns, because pattern recognition is the fundamental technique by which neural circuitry works. […] We never see the world as it truly is; instead, we see it solely in terms of the patterns that we expect and are able to recognize. Every image we perceive is ruthlessly broken up into a set of features, and then those features are matched against previously established pattern templates until a match is found.
- List the features that constitute the feature set of the thing you wish to describe with a metaphor.
- List all objects whose feature sets contain many features in common with your first list.
- Compare each candidate’s feature set with the metaphoree’s feature set, noting features that match and features that don’t match.
- Isolate the mismatches.
- [U]se some of the matched features and ignore or negate each of the mismatches.
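The steps above map naturally onto set operations. The following is only a sketch: the feature lists for the thing being described and for the two candidate metaphors are invented for the example, and `compare` and `rank` are hypothetical helper names.

```python
def compare(target: set, candidate: set):
    """Return (matching features, mismatched features) for one candidate metaphor."""
    matches = target & candidate
    mismatches = target ^ candidate  # features found on one side only
    return matches, mismatches

def rank(target: set, candidates: dict) -> list:
    """Order candidate names by how many features they share with the target."""
    return sorted(candidates, key=lambda name: len(target & candidates[name]), reverse=True)

# Invented feature sets, for illustration only.
deleting_files = {"holds discarded items", "items recoverable", "emptied deliberately"}
candidates = {
    "shredder": {"destroys items", "noisy"},
    "trash bin": {"holds discarded items", "items recoverable",
                  "emptied deliberately", "smells"},
}

best = rank(deleting_files, candidates)[0]
matches, mismatches = compare(deleting_files, candidates[best])
```

Here the mismatch set for the best candidate contains a single feature, which is the one the design would then ignore or negate.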
If the metaphor was competently designed in the first place, it may have little room for extension.
[Another] danger arises from the temptation to extrapolate rather than extend a metaphor. A beautiful example of this problem is provided by the attempts to create three-dimensional desktops.
Personal anticipation must be figured out from interaction with the user.
A truly idiotic solution to this problem is the “startup wizard” who quizzes the user when the program is first launched, asking for all sorts of detailed information that the software uses to custom tailor the software. The problem with this approach is that, all too often, personal preferences can’t be developed until after you’ve had the opportunity to use the software. By that time, the startup wizard is gone, and there’s no way to figure out how to get him back.
The biggest hurdle we must overcome with anticipation is our pantywaist fear of relying on less-than-ironclad means of anticipating the user’s desires. (Example: always showing the print dialog.)
Levels of Anticipation
- Followup questions
Your biggest problem in using anticipation in software design is your exposure to blame. If your software insists on pig-headedly covering its ass at every turn, nobody can ever blame you for screwing up.
The History of Interactivity
Growth of mass media, therefore the fall of interactivity.
Control versus Interactivity
[H]ave we learned from the many failed attempts at interactive storytelling? What is the precise nature of these failures? I would argue that all past efforts in this direction have not been interactive storytelling, but rather “interactivized stories” or “storyized games.” At a structural level, they are not storytelling; they are stories. The difference here is profound: it is the difference between the process of storytelling and the result of storytelling (a story). Storytelling is not the same thing as a story: storytelling is an activity, a process, while stories are collections of facts, data. You can’t interact with data—you can interact only with processes.
A number of researchers have approached interactive storytelling as a problem of simulating characters. I used this approach in 1987 with Siboot and concluded that it led nowhere. Its problem is its failure to focus on the verbs.
The heart of my technology is the storytelling engine. The engine’s basic task is to execute verbs. Each verb can lead to another verb, and so on, generating a long sequence of events: a story. The verbs are created and specified by the author. Each verb can generate a number of options for other actors. Which options are available and the rules by which those actors choose among the options are again specified by the author. The human protagonist is given control of one actor and makes the choices for that actor.
A storybuilder first creates actors, each with a name and a number of personality traits, then the stages of the drama with rules of access, then props. Their main task is the creation of a large set of verbs. A good story requires at least a thousand. Each verb must have a set of roles, slots for actors.
For example, if Puncher punches Punchee, then one role might be Punchee’s Best Friend, who presumably would come to the Punchee’s defense. Another role might be Punchee’s Girlfriend, whom we would expect to scream, rush to console Punchee, and possibly hurl a few epithets at Puncher. There could also be Intervening Bystander, who might step in if the fight goes too far. We could even have Timid Bartender, who might want to duck behind the bar or call the cops. Whenever a verb is executed, each of the witnessing actors consults each of the roles, asking whether they fit that role. If the actor does fit the role, then he executes it, deciding which of the role’s responses to choose.
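The role-matching loop described above could be sketched roughly as follows. This is not Crawford's engine: the class names, the `fits` predicate, the single fixed response per role, and the `confront` verb are all simplifying assumptions (in the description, a role offers several responses and the actor chooses among them).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Actor:
    name: str
    friends: List[str] = field(default_factory=list)

@dataclass
class Event:
    verb: str
    subject: str  # who acts (Puncher)
    object: str   # who is acted upon (Punchee)

@dataclass
class Role:
    name: str
    fits: Callable[[Actor, Event], bool]  # does this witness fit the role?
    response: str                         # simplified: one reaction verb per role

@dataclass
class Verb:
    name: str
    roles: List[Role] = field(default_factory=list)

def execute(event: Event, verbs: Dict[str, Verb], witnesses: List[Actor]) -> List[Event]:
    """Each witness consults each role of the executed verb and reacts if it fits."""
    reactions = []
    for actor in witnesses:
        for role in verbs[event.verb].roles:
            if role.fits(actor, event):
                # The reaction is aimed at whoever performed the original verb.
                reactions.append(Event(role.response, actor.name, event.subject))
    return reactions

best_friend = Role("Punchee's Best Friend", lambda a, e: e.object in a.friends, "confront")
verbs = {"punch": Verb("punch", [best_friend]), "confront": Verb("confront")}
witnesses = [Actor("Mary", friends=["Tom"]), Actor("Bartender")]

reactions = execute(Event("punch", "Fred", "Tom"), verbs, witnesses)
```

Each reaction is itself an event, so feeding it back into `execute` is what would let one verb lead to another and build up a sequence.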
Circuits for direct stimulus-response relationships are easy. Circuits that process a sequence of bits are more difficult. Similarly, evolution: from direct responses, to sequential thinking, to subjunctive thinking.
Seven Lessons to Remember
- Your software engages your user in a conversation. Your design task is to maximize the utility of that conversation.
- Think about that conversation in linguistic terms.
- What are the verbs? What does the user DO?
- Speak less, listen more.
- Thinking is the delivered content of all software.
- Your software should do whatever a reasonable person in its situation would do.
- Dactylodeiktous means “all fingers pointing at.”