True Interactivity Requires Artificial Intelligence

--

Follow-up to A Purpose-Driven User Experience

Computers are not truly interactive, and that is by design. They were never meant to be.

Our first machines were conceived as empowering tools, to be fully mastered and controlled. As our tools became more complex with the advent of computers, full control became elusive and using such machines became unwieldy. To solve this issue, designers everywhere used the same trick: constrain the user’s behavior when they interact with a machine. Think about it. Do you feel empowered when you use an ATM? Instead of the user being in control, they must follow an unchanging script dictated by the machine. Their only option is to complete the script or to abort it.
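To make the constraint concrete, here is a minimal sketch, with invented step names, of what such a script amounts to in code: a fixed sequence of states that the user can only complete or abort.

```python
# A machine-dictated script: a fixed sequence of steps the user must walk
# through in order. At every step the only real choices are "comply" or "abort".
ATM_SCRIPT = ["insert_card", "enter_pin", "choose_amount", "take_cash"]

def run_script(script, answer):
    """Walk the user through the script; anything but compliance aborts everything."""
    for step in script:
        if answer(step) != "ok":          # no negotiation, no adaptation
            print(f"Transaction aborted at step: {step}")
            return False
    print("Script completed.")
    return True

# The user's goals, intentions and plans never enter the picture:
# the machine only sees compliant or non-compliant answers.
run_script(ATM_SCRIPT, answer=lambda step: "ok")
```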

Interacting with computers feels like nothing else in the world, and certainly not like interacting with other humans. Computers make us subservient. They want us compliant and predictable. They scold us when we don't follow their arbitrary and often hidden rules.

However, after fifty years of perpetuating such behaviors, computers are now powerful enough to become true collaborative partners with humans and radically transform the way we think about interactivity. Artificial Intelligence and new types of interface are the keys to this transformation.

What is Interactivity?

In The Design of Everyday Things, Donald A. Norman explains the process by which an actor can act upon the world (or any interactive system). It takes the form of a cyclical process made of eight steps:

  • The actor forms a goal (e.g. I’d like to eat some Chinese food)
  • They form an intention in order to reach that goal (e.g. among the many ways I could choose to satisfy my craving for Chinese food, I decide to go to a Chinese restaurant)
  • They make a plan (i.e. a sequence of actions) in order to fulfill this intention (e.g. I’ll pick a nearby Chinese restaurant on Yelp, make sure I have enough money, use my car to get there, etc.)
  • They execute the next action in this plan (e.g. I drive to the restaurant)
  • < here, the world / interactive system reacts to the action or changes on its own > (e.g. traffic, holding a table if I booked one, etc.)
  • The actor perceives the new state of the system or acquires new information through their senses (e.g. when I arrive, I can see that the restaurant is actually closed)
  • They interpret the state of the system and update their model of the system (e.g. I make a mental note that this restaurant is closed on the weekend and that I can’t depend on Yelp to provide me with this information)
  • They evaluate the outcome and the need for a new goal, i.e. they test the ongoing validity of the current goal (e.g. do I still want to try to get Chinese food? Would I be satisfied with a nearby open restaurant instead?)

This cycle repeats with each change of state in the system or the actor. Some steps can be ignored along the way — e.g. I don’t have to reassess my desire for Chinese food with each turn I make in my car — and the cycle can start with any step — e.g. I can get the intention of purchasing something from seeing an ad — but that’s the general principle.
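Written as code, the cycle might look like the rough sketch below; the actor, the world and every method called on them are hypothetical placeholders, one per step of the loop, not anything taken from Norman's book.

```python
# Norman's action cycle written as an explicit loop. Every method called here is
# a placeholder for a cognitive step performed by the actor; `world.react` stands
# for whatever the interactive system (or the world) does in response.

def interaction_loop(actor, world):
    goal = actor.form_goal()                            # e.g. "eat Chinese food"
    while goal is not None:
        intention = actor.form_intention(goal)          # e.g. "go to a restaurant"
        plan = actor.make_plan(intention)               # e.g. Yelp, money, car...
        for action in plan:
            change = world.react(actor.execute(action))     # the system reacts
            perception = actor.perceive(change)             # senses
            actor.update_model(perception)                  # interpretation
            goal = actor.evaluate_outcome(goal)             # keep, revise, or drop the goal
            if goal is None:
                break                                       # the actor is satisfied (or gave up)
```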

In interactions between two humans, each person is both an actor and an interactive system for the other. What is different from an interaction with a computer is that humans constantly model each other and adjust their behaviors according to their understanding of what the other is trying to accomplish, as well as of how and why they want to do it. That’s how you can have collaboration, negotiation and relationships.

Computers, on the other hand, make no attempt to model users and don’t adjust their behavior. They expect users to manage their goals, intentions and plans themselves and only communicate with the system through actions. Users must follow the computer’s scripts to the best of their understanding until they get what they wanted out of it. So, instead of having a rich loop of interaction made of 14 steps (seven for each actor) that could describe everything that matters to the two actors, we get a much-reduced loop stripped of the steps related to meaning and emotion. Computers turn us into automata.

Ideal interactive system vs. Today’s reality

Content creation programs like Photoshop or Word are a little more humane, as they provide their users with very granular actions (changing a pixel or typing a character), letting them plan their actions to a greater extent, at the cost of vastly increased complexity: just look at the command menus and tool palettes in such programs if you need convincing.

We can do much better!

The Interactive System as an Actor

What would it mean to truly interact with a system, to have it be a fully realized actor?

Powered by artificial intelligence that would let it model the user, such a system could bring a higher level of purpose to each step of the interaction loop (a rough skeleton in code follows the list):

  • Senses: the system would observe the user’s actions and extract meaning from them.
  • Model: the system would understand who the user is, what they know, what they want and be able to guess their mental state.
  • Outcome: the system would track the user’s progress toward their goal. Do they need help?
  • Goal: the primary goal of an AI-powered interactive system is to help its users to the best of its abilities so that they can accomplish their own goals.
  • Intention: the higher-level AI, the decision-making part of the system. The system would know how to inform and guide the user, or adapt the interaction protocols, so as to make reaching their goal easier.
  • Plan: the lower-level AI: optimization, planning, computation, execution. The system would know how to best use its own features toward achieving the user’s goal.
  • Action: the system would clearly and purposefully communicate internal changes to the user so as to help them change their model of the system when needed.
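Here is a rough skeleton, in code, of what such a system might look like; every class, method and threshold below is an assumption made for illustration, not a reference to an existing framework.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class UserModel:
    """What the system believes about the user (every field is illustrative)."""
    inferred_goal: Optional[str] = None
    observed_actions: Set[str] = field(default_factory=set)
    progress: float = 0.0            # rough estimate of progress toward the goal

class CollaborativeSystem:
    """An interactive system that is itself an actor in the loop."""

    def __init__(self) -> None:
        self.user_model = UserModel()

    def sense(self, user_action: str) -> None:
        # Senses: observe the user's action and extract meaning from it.
        self.user_model.observed_actions.add(user_action)

    def update_model(self) -> None:
        # Model: guess who the user is, what they know and what they want.
        self.user_model.inferred_goal = "goal inferred from observed actions"

    def needs_help(self) -> bool:
        # Outcome: track progress toward the user's goal. Do they need help?
        return self.user_model.progress < 0.5

    def decide(self) -> str:
        # Intention (higher-level AI): choose how to inform, guide, or adapt.
        return "offer_guidance" if self.needs_help() else "stay_out_of_the_way"

    def act(self, decision: str) -> str:
        # Plan + Action (lower-level AI): use the system's own features toward
        # the user's goal, then communicate the result clearly and purposefully.
        if decision == "offer_guidance":
            return f"It looks like you are trying to {self.user_model.inferred_goal}. Want a hand?"
        return ""
```

The point of the sketch is the division of labor: the user model lives inside the system, and every step of the loop has a counterpart on the system's side rather than only on the user's.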

Verbs are the Wrong Metaphor for Interactivity

Copy, paste, save, share, close… These performative verbs are our only tools for translating our plans, intentions and goals into terms that a computer can understand, and they’re pretty bad at it, since we must do most of the work ourselves in our heads. True interactivity demands a new vocabulary, one with which the user can discuss the topics related to their purpose and the process of interacting itself. Each step of the interaction loop covers its own set of topics. The more we move back in the interaction loop and away from actions, the more meaningful and useful our communication with the AI actor can be (a rough sketch in code follows the list):

  • Plan: the user can explain their methodology, the rules they follow and talk about their next actions.
  • Intention: the user can explain what they want to do and why.
  • Goal: the user can talk about what they want to get out of the interaction and why it’s important to them.
  • Outcome: the user can express what would make the interaction a success and even negotiate with the system if a desired state cannot be reached and compromises must be made.
  • Model: the user can talk about how they think the system works or what state it is in. This is very useful for resolving misunderstandings on both sides.
  • Senses: the user can reflect on the interaction protocol itself and give feedback to the system about its outputs (e.g. “I don’t understand what this status means”, “what’s the difference between these two commands?”).
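One way to picture this richer vocabulary is as a set of message types tied to the loop steps rather than a list of verbs. The names and example messages below are invented for illustration:

```python
from dataclasses import dataclass
from enum import Enum, auto

class LoopStep(Enum):
    PLAN = auto()       # "Here is how I usually go about this."
    INTENTION = auto()  # "I want to do X because Y."
    GOAL = auto()       # "What really matters to me is Z."
    OUTCOME = auto()    # "This would count as success; that compromise is acceptable."
    MODEL = auto()      # "I think the system works like this; am I wrong?"
    SENSES = auto()     # "I don't understand what this status means."

@dataclass
class UserMessage:
    step: LoopStep
    text: str

# Instead of bare verbs like "save" or "copy", the user can send messages about
# purpose and process, which the system can then reason about:
conversation = [
    UserMessage(LoopStep.GOAL, "I need this report ready before Monday's meeting."),
    UserMessage(LoopStep.MODEL, "I assumed exporting would also save my edits."),
    UserMessage(LoopStep.SENSES, "What's the difference between these two commands?"),
]
```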

For such a vocabulary to have the necessary expressiveness, the system must understand the concepts it evokes. This means that it must have an internal representation of what it can do and be able to reason on this knowledge in order to optimize its work. Furthermore, it should understand the user well enough to translate their goals, intentions and so on into terms that make sense in its domain of expertise. In short, a system should be an expert at collaborating on its appointed task.
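As a toy illustration, and under the assumption that "knowing what it can do" means storing its capabilities together with the user-level purposes they serve, such a representation could be as simple as the sketch below; the capability names and purposes are made up for the example.

```python
# A toy internal representation of what a system can do: each capability is
# annotated with the user-level purposes it serves. Matching a stated goal
# against these annotations is the (drastically simplified) reasoning step.

CAPABILITIES = {
    "export_pdf":    {"share a document", "archive work"},
    "track_changes": {"collaborate with others", "review edits"},
    "summarize":     {"get an overview", "save reading time"},
}

def capabilities_for_goal(goal: str) -> list:
    """Return the features whose declared purposes match the user's stated goal."""
    return [name for name, purposes in CAPABILITIES.items() if goal in purposes]

print(capabilities_for_goal("share a document"))   # -> ['export_pdf']
```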

Interactivity for Augmentation

The combination of representing domain knowledge, reasoning on it and establishing collaboration protocols is the purview of the branch of artificial intelligence dedicated to augmenting the human intellect, instead of opaquely automating processes. Applying this kind of AI to interactivity would transform our relationship with computers, turning them from rigid authorities into powerful and understanding helpers. Putting understanding at the heart of the interactive process lets us define this new type of relationship:

Interactivity: The process by which two actors build models of one another in order to influence each other’s behavior toward their respective goals.

And remember that the primary goal of an AI-powered interactive system is to help its users to the best of its abilities so that they can accomplish their own goals. AI can turn the interaction loop into a virtuous circle of efficiency and empowerment.

Until now, computers have treated us like machines because the cost of interaction failure was too high for their designers — it was easier to constrain users than to make sure that a “smart” program didn’t crash. Now that we have the technical means of treating the users as humans, of understanding them and of augmenting their capacities, it’s the cost of not doing so that is too high.

Stéphane Bura is available for consulting on projects at the intersection of AI and UX. He writes about how technology can help us build better communities at Augmented Communities.
