I can see into the immediate future. It will consist of millions of people standing around looking at their phones. On their phone screens will be information regarding what they're looking at, and based on what verbs are provided to them, they will either be smiling in delight and pressing the appropriate button overlaid on reality, or cursing in great frustration, wishing doom upon the designers who tried to cram too much functionality into the next great augmented reality application.
While reading a recent article by Luke Wroblewski about the benefits of creating first-person interfaces, I had a realization that one of the most intriguing challenges we're going to have as interactive designers is dealing with verbs when designing first-person interfaces.
Usually when we think of verbs, we're trying to determine what labels best fit a button in an interactive design. Should the call to action say "Buy" or "Check Out" or "Submit"?
At a higher level, there's another layer of verbs within the system: the defined set of actions that you can initiate when dealing with any specific object that can be pressed. We've all had the experience of right-clicking an element in a program like Photoshop and popping up a menu that affords us a range of possible actions. Each of those actions is a verb—cut, copy, paste, and so forth. These are the verbs that can be applied to the point in the application my cursor is hovering upon (the object).
In using desktop software and more highly evolved web applications, there's another layer of complexity: the verbs that are revealed after a specified interaction with a group of objects, such as a selection. In these cases, objects need to be grouped or highlighted to provide another layer of verbs. Files and other desktop elements present different options based on their selection state.
Now, leave the controlled world of the desktop and enter the real world as our interface. Suddenly, all the physical objects that surround us are treated like objects on a drag-and-drop desktop, but the affordances we're allowed in manipulating those objects are practically infinite, as is the number of potential verbs that may be utilized as a user interacts with them.
So, we provide tightly defined limits for what the user can accomplish. Yelp's Monocle is functional because it provides only one layer of information that a user can act upon. In fact, most augmented reality apps as they're presented today treat reality as a substrate that allows a highly limited set of verbs to be applied to it.
The real challenges for our profession will begin, however, when we try to overlay and manage in real time more than one limited set of verbs for how elements are grouped in reality. As layer upon layer of verbs are added to your feature set, the level of complexity for how it's managed in the interface goes up by an order of magnitude—both for the user (cognitively) and for the system (in data presentation and context management). We will be crushed by the laws of physics.
Layar is a great example of this problem in action. The user is given the ability to dial or filter what information is provided to them in the interface... but if you have everything shown to you at once, it risks becoming a firehose.
I dread opening an application designed to help me accomplish real-world tasks and discover useful data while being brutalized with dozens of features, expressed through multiple sets of verbs. After this brief period of novelty, users will clamor for convergence, as we can only hold so many dozens of apps on each of our devices. Users will begin to expect greater context from verbs provided in augmented reality, with the system providing back an extra layer of intelligence about what needs to be presented based on your point of view. And what scares me most is the branding that will likely occur through an inevitable augmented reality land grab. Open up your Facebook app to gain access to the status of those people who are close to you, but be prepared for the commercials pointing out what Facebook thinks that you'll enjoy based on your interests and your network. I can already imagine the virtual communities we'll begin creating, with content layered on top of our physical world, and they're likely to be commercialized just like Second Life and the Internet as we know it today.
Those who live in a world of Web pages rarely have to deal with object-verb issues. But as Web-based content begins to be reused as part of augmenting our current reality, solving these kinds of verb-object issues on a regular basis will become one of the primary design challenges we'll face. I could imagine that augmented reality becomes a second cousin to duplicated reality, where we can create our own rules for how things are organized, and that is the space where we expend our design energy, more so than trying to continually overlap the real world...
