I didn’t realize it at first. While driving, I pinged Siri and asked her to get me directions to my destination. I would have typically gone the Google Maps route, but navigating an iPhone 6 – at times requiring two hands – while driving is a skill I’m not willing to test out in the wild. So Apple Maps it was.
I received a phone call while my maps session was ongoing. As I was talking, I heard a series of two-toned dings. One started high and ended low. The other started low and ended high. Almost instinctively, I knew the high-low combination was telling me to turn left while the low-high combination wanted me to turn right. I don’t know exactly how I knew that, but I imagine it is something Apple spent plenty of money researching, on top of some sound music theory. It likely emulates the action of your turn signal – you push the lever up (low-to-high) to signal right and push it down (high-to-low) to signal left. That may be part of why these tones felt instinctual.
Compare that to the default behavior of Google Maps, where the narration speaks explicit turn instructions even while someone on the other end of the phone is talking. I’ve been in more than a few situations where that has caused quite a bit of confusion. It is also dangerous, especially while driving. Apple’s approach pares the Maps experience down based on context, delivering need-to-know information in a manner that is as seamless as it can be. So there we have it: perhaps the one feature where Apple Maps outdid Google Maps. But that’s a topic for another day.
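The contextual behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Apple Maps is actually built: the names `Turn`, `Cue`, `TONE_PATTERNS`, and `choose_cue` are all made up for this example, and the only thing taken from the experience above is the tone mapping (high-low for left, low-high for right).

```python
from dataclasses import dataclass
from enum import Enum


class Turn(Enum):
    LEFT = "left"
    RIGHT = "right"


# Tone pairs as I heard them: high-low means left, low-high means right.
# The dictionary itself is a hypothetical structure for this sketch.
TONE_PATTERNS = {
    Turn.LEFT: ("high", "low"),
    Turn.RIGHT: ("low", "high"),
}


@dataclass
class Cue:
    kind: str        # "tones" or "speech"
    payload: object  # a tone pair, or the sentence to speak


def choose_cue(turn: Turn, on_call: bool) -> Cue:
    """Pick the least intrusive cue for the current context (assumed logic)."""
    if on_call:
        # Don't talk over the phone call; play the short two-tone ding.
        return Cue("tones", TONE_PATTERNS[turn])
    # Nobody else is talking, so a spoken instruction is fine.
    return Cue("speech", f"Turn {turn.value}")
```

Calling `choose_cue(Turn.LEFT, on_call=True)` returns the tone cue, while the same turn with `on_call=False` falls back to speech. The point is that the context, not the user, decides which channel carries the information.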
Before Siri, voice recognition felt like a concept from science fiction. The technology actually preceded Siri by decades, but it was rudimentary at best and unusable at worst. Voice recognition suites of the nineties were clunky and expensive. You needed to spend hours training them on the subtle nuances of your voice. Post training, you would be lucky if the application’s success rate was 50%. And good luck with homophones and contractions versus plurals. The software transcribed what it heard – and that’s it. There was no special grammatical logic to infer intent.
With cloud computing and access to high-speed Internet, voice recognition became a household standard. And while Siri and Cortana still make errors, the technology has come a long way. Instead of relying solely on the resources of a single machine or device, the voice transcript is sent to the cloud and run through massive computational engines that try to discover intent and give their best shot at getting it right. While not perfect, it’s still incredibly useful – and will only get better. Each time you dictate to Siri, you’re teaching Siri your nuances but also providing another sample of data that voice engines can use for similar instances. We still train these voice recognition systems; it’s just much more subtle and even invisible to the casual user.
Touch v. Feel in UX
Everyone is familiar with touch devices. Nearly all of us have one and use one on a daily basis. We won’t discuss touch UX because most are familiar with the concept. But instead of touch, think of UX in terms of feel.
We are already used to vibrating patterns when using day-to-day applications. Muted phones can still deliver (mostly) non-audible alerts to users, which has significant implications for how deaf users go about using their devices productively. Vibrations are used throughout the touch device ecosystem, delivering information when audio / visual cues aren’t enough. Games routinely use vibrating patterns to deliver information to the player. Nintendo debuted one of the first devices to use this type of force feedback with the introduction of the Rumble Pak accessory for the Nintendo 64. Playing Star Fox was never the same after that. Instead of playing a video game, you felt a video game.
Apple has secured a patent for pressure-sensitive touch screens, which bring another layer of sensory input and tactile feedback. Touch devices of today can understand a tap, a swipe, and countless other gestures. Adding pressure to the list multiplies each of these styles of input. A standard tap and a heavy tap would produce two different values. A standard swipe left and a heavy swipe left would perform two different functions. For example, swiping a message in the Mail app could send it to the archive, while a heavy swipe would delete the message permanently.
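To make the multiplication concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the gesture names, the two pressure levels, the 0.5 force threshold, and the action names are hypothetical, and none of it reflects how Apple’s patent or the Mail app actually works.

```python
from itertools import product

# Hypothetical gesture and pressure vocabularies for this sketch.
GESTURES = ("tap", "swipe_left", "swipe_right")
PRESSURES = ("standard", "heavy")

# Every gesture paired with every pressure level: 3 gestures x 2 levels.
INPUTS = list(product(GESTURES, PRESSURES))

# Made-up mapping mirroring the Mail example: same swipe, different outcome.
ACTIONS = {
    ("swipe_left", "standard"): "archive_message",
    ("swipe_left", "heavy"): "delete_message_permanently",
}


def classify_pressure(force: float, threshold: float = 0.5) -> str:
    """Bucket a normalized force reading (0.0 to 1.0) into a pressure level."""
    return "heavy" if force >= threshold else "standard"


def resolve_action(gesture: str, force: float) -> str:
    """Look up the action for a gesture at the measured force."""
    return ACTIONS.get((gesture, classify_pressure(force)), "no_action")
```

With three gestures and two pressure levels, `INPUTS` holds six distinct inputs. A light `resolve_action("swipe_left", 0.2)` archives the message, while a firm `resolve_action("swipe_left", 0.9)` deletes it. Add a third pressure level and the same three gestures give you nine.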
Most touch screens look and feel roughly the same. If some Disney researchers have their say, the flat glass of touch screens could give tactile feedback to the user, mimicking textures by using electrovibration to generate electrostatic forces that create friction against your finger as it moves. Basically, your device would trick your finger into feeling something that isn’t actually there. While the jury is still out on the research, it’s an interesting concept – and borderline magical.
You don’t have to dig too deep to find real-world applications for this technology. If it were advanced enough, it could prove handy for those who are visually impaired or blind. Braille letters could be felt on the screen, allowing the blind to use their touch devices in the same way that the rest of us do. The application goes much further and could provide a unique approach for the gaming industry by allowing sensory input at a deeper level. It could also be the answer to the touchscreen keyboard that everyone learned to live with, by giving the sensation of actual keys, although something tells me we’re past the point of no return on this one.
Non-Sensory (Virtual) Elements in UX
Finally, we come to the user experience that isn’t defined by the core senses – interacting with thin air to create input or feedback from a system. In 2002, the motion picture Minority Report gave us a glimpse into a future that really wasn’t that far off. No, I’m not speaking of the concept of pre-crime, but rather the then-impressive and awe-inspiring electronic systems that John Anderton (Tom Cruise) would interact with.
Fast forward to 2015, and this technology seems much less impressive. The first version of Nintendo’s Wii console introduced this concept on a rudimentary level. Nonetheless, my first experience with a Wii was one of shock. Somehow, the Wii knew where I was pointing the controller. I could interact with a game using real-world motions like swinging a golf club or a tennis racket.
Microsoft’s Kinect took this concept to a whole new level and, essentially, brought Minority Report to living rooms everywhere. Multiple advanced cameras and a ton of software interpretation allow the Kinect to dissect and discern meaning from the tiniest of human movements, such as a heartbeat.
A user experiences an application through the basic senses, using them to both give and receive feedback. Sight leads the way, but the experience is not limited to visual elements. A user interacts with an application in an array of ways – vibrations, spoken words, touch, dings, tings, and pings, and virtualized movements, to name a few. When developing scope for a digital project, it’s important to take all of these elements into consideration to allow choices in how the user would want to interact with the application – as well as how the application interacts with them.