Voice tool

How can conversational interfaces improve the user experience? This is not an easy question, because it really depends on how you look at it. The conversational experience that Joaquin Phoenix has with his operating system in the movie “Her” is still far away. We are currently in the functional domain, trying to figure out where the functional boundaries lie. The emotional nuances are being explored, but this is not yet the time to fall in love with your digital assistant. We are still in the “interface” stage, where talking is focused on getting things done. The “interface” can be considered a facilitator for human-computer interaction. It’s a tool.

Her

Keyboard
If we look back at the evolution of input mechanisms, we could compare where we are right now with traditional keyboard input back in the day: the interaction had a very instructional nature. The computer displayed its current state on a screen and made the occasional sound, but that was pretty much it. It wasn’t really intuitive, the feedback was basic and slow, and brand presence was still underdeveloped.

The first Apple mouse
We are on the brink of the next phase, where the possibilities of applying voice in an effective way will soon see the light of day. Just like mouse input emerged when the capabilities of the machine grew, the voice interface will become more mature. Voice interfaces will no longer be just about telling the computer what to do. New features will require new ways to interact with them; the technology will drive the need.

Touch the screen
Feedback gradually became more important after the touch screen was introduced. You were now directly, physically interacting with the computer, and this spread so quickly that we now expect every screen to be a touch screen. The same will happen with voice as an input method: talking to the computer will become the main way to handle the tool.

So when you look at it from an interface perspective, the voice element is just another new way to interact with your tool (your computer). You could define voice as the next form of human-computer interaction. But it’s more than that! The computer is far more than just a tool; it’s the most important device you have. Your whole life, your business, your shopping, your finances and your social life are managed by this machine. And not just your life, but everyone’s, and that gigantic pile of data makes voice more than just a new way to interact with your tool. Combined with all this data and intelligence, the tool comes alive and has the potential to truly become your assistant. That’s why all the major companies invest in it.

Cortana, Google Assistant, Alexa and Siri all have their own little nuances, but they are all focused on understanding their data and finding out how to make it valuable and fit it into our lives in a constructive way.

Cortana, Google Assistant, Alexa and Siri
When you look at surveys about actual usage, you might get somewhat disappointed. All this potential is currently used to find out if it’s going to rain tomorrow… We also still need to figure out how to fit this into our lives. It’s OK to talk to a computer when you are in a private space, but it’s still kinda strange when you see a person talking to an imaginary friend.

The challenge for designers is to find ways to make voice natural and smart. Currently we are still following scripts, and the system jams if something unexpected happens. A conversation is all about an exchange of words in mutual understanding and respect, and in a business context we want our service to be handled just like a human would handle it, only faster. That actually sounds like we are trying to recreate a digital human…
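To make the “following scripts” point concrete, here is a minimal sketch in Python of how a rigid, scripted voice flow tends to work. Everything in it (the appointment-booking skill, the intents, the prompts) is hypothetical, and real assistants use trained language-understanding models rather than keyword matching; the sketch only illustrates why such a system jams on unexpected input.

```python
# Minimal sketch of a scripted voice dialog for a hypothetical
# appointment-booking skill. Keyword matching stands in for a real
# NLU model; it only illustrates the rigid-script problem.

SCRIPT = {
    "greet":    "Hi! Do you want to book an appointment?",
    "ask_day":  "Which day works for you?",
    "confirm":  "Booked. Anything else?",
    "fallback": "Sorry, I didn't get that.",  # the "jam": no real recovery
}

def match_intent(utterance: str) -> str:
    """Anything outside the expected vocabulary falls through to 'unknown'."""
    text = utterance.lower()
    if any(w in text for w in ("yes", "book", "appointment")):
        return "affirm"
    if any(w in text for w in ("monday", "tuesday", "friday")):
        return "give_day"
    return "unknown"

def handle(state: str, utterance: str) -> tuple[str, str]:
    """Advance the script one step: returns (next_state, system_reply)."""
    intent = match_intent(utterance)
    if state == "greet" and intent == "affirm":
        return "ask_day", SCRIPT["ask_day"]
    if state == "ask_day" and intent == "give_day":
        return "confirm", SCRIPT["confirm"]
    # Unexpected input: the script has no branch for it, so the
    # conversation stalls instead of adapting like a human would.
    return state, SCRIPT["fallback"]

if __name__ == "__main__":
    state = "greet"
    print(SCRIPT["greet"])
    for utterance in ("yes please", "hmm, what times are free?", "friday"):
        state, reply = handle(state, utterance)
        print(f"user: {utterance!r} -> {reply}")
```

A human receptionist would simply answer the “what times are free?” question; the script can only repeat its fallback. Closing that gap is exactly what systems like Google Duplex are after.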

Google Duplex has come a long way; their AI is doing a really good job at error handling and dealing with erratic scenarios. They showed a demo that almost reaches that next level. This is a very hard thing to do, but their AI seems to handle it pretty well. You could almost say that they finally passed the Turing test: you cannot tell whether the other voice is human. Sure, the example is very much focused on making appointments in two scenarios, but I feel that this is the right direction for AI and voice to develop in. If we can provide a specific direction and teach the system to adapt within a certain context, we are indeed creating a better user experience.