A Promise Made

It has been almost 5 years since Google first showed off its voice assistant technology, Google Duplex. Dressed casually in a bomber jacket, Sundar Pichai waltzed onto the stage and enthralled the audience with recordings of actual phone calls made by the technology, including one to book a haircut appointment at a salon. The interaction was mind-blowing. Not only was the appointment set up, but the nuances of human conversation were reproduced so well that the salon representative had no idea that she was not speaking to a real person. Pichai went on to say that the technology brought together natural language understanding, deep learning, text-to-speech, and more. The stage was set beautifully for the possibilities of Artificial Intelligence (AI) or, more specifically in this context, conversational AI.

Cut to the present day, where the tech giants Amazon and Google are the Federer and Nadal of voice assistants. Both are pushing their products aggressively, and a lot of us are already hooked on them. From waking up in the morning to John Lennon asking us to ‘imagine’ living for today, to getting an ‘affirmation’ from George Benson as we hit the hay at night, voice assistants have taken over our lives.

They now offer multiple voice profiles, so you can train the device to recognize your specific voice and personalize its responses to some degree. They can answer general knowledge questions, play music (wait, I have said this already), help place orders on e-commerce sites, call out directions, and a whole lot more. They can control many of your smart home devices, and new features are being added every week.

Admittedly, the breadth of progress has been great. However, the seamless conversation that we heard played out 5 years back is yet to become a complete reality. Conversations are rarely free of glitches: a correction or change of direction mid-sentence seldom leads to the desired outcome. Often the context is missed completely, and the lack of memory across exchanges does not make for a very engaging conversation either. So, what’s with that?

Brace for the jump

When I was a little kid, I heard the story of the foolish farmer who chose a million grains of seed over a single seed on the first square of a chessboard, doubled on every subsequent square across all 64 squares. The doubling offer would have come to more than 18 quintillion seeds. Truth is, exponential acceleration is not so easy to fathom. Most people in the 1980s would not have believed that an average computer in the 2020s would fit into the palm of our hands and outperform the supercomputers of that era. Today, we may find it just as difficult to believe the prediction, made by some futurists, that in another couple of decades the same average computer could match the power of all human brains put together.
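For anyone who wants to check the farmer's arithmetic, here is a minimal sketch in Python. The function names are my own, chosen for illustration, and it assumes the usual telling of the story: one seed on the first square, doubled on each of the remaining 63.

```python
def doubling_total(squares: int = 64) -> int:
    """Total seeds when square 1 holds 1 seed and each later square doubles it.

    The sum 1 + 2 + 4 + ... + 2**(squares - 1) equals 2**squares - 1.
    """
    return 2 ** squares - 1


def first_square_exceeding(target: int) -> int:
    """Return the square at which the running total first exceeds `target`."""
    total, seeds_on_square, square = 0, 1, 0
    while total <= target:
        square += 1
        total += seeds_on_square
        seeds_on_square *= 2
    return square


print(f"Seeds on the full board: {doubling_total():,}")
print(f"Doubling beats a million seeds by square {first_square_exceeding(1_000_000)}")
```

The second function is the interesting one: the doubling offer overtakes the flat million by square 20, less than a third of the way across the board, and everything after that is pure surplus.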

The bottom line is that technology moves very quickly, and what seems to be a pipe dream for conversational AI today will be a reality in no time. Nearly every technology in its infancy has been scoffed at until an inflection point was reached, and then there was no stopping it. One of the first chess computers to challenge a world champion was IBM’s Deep Thought, and the legendary Garry Kasparov made short work of it in 1989 and then defeated its successor Deep Blue in 1996. But in 1997, Deep Blue won the rematch. Since then, the technology has improved by leaps and bounds. Modern chess engines have estimated Elo ratings above 3,400, far beyond the best human players. The pattern is a familiar one: deception and hype first, then disruption and eventual democratization.

And then there is the power of convergence. As technologies in different areas accelerate exponentially, they start to overlap with each other, and this convergence greatly increases the potential for disruption. The first boost in AI was provided by the convergence of enormous data sets (think unstructured data thanks to social media and the internet) and incredibly powerful and cheap graphics processors. Next, deep neural networks came into their own, and suddenly learning from unstructured data at scale became possible. And the outcomes are there for all to see: from AI-powered drones ‘watching’ over us, to the Alexas of the world ‘listening’ to us, to the ChatGPTs ‘writing’ entire books for us, and finally to AIs such as Google’s AlphaGo ‘integrating’ knowledge to beat us at Go, a game so complex that it is called the Chess for superheroes. The time of AI has well and truly arrived.

Fast forward

Conversational AI might be taking its time to reach the level of sophistication that Google Duplex promised 5 years back, but it is very likely to become ubiquitous in the next few years. Today Alexa might be ‘touch and go’ when it comes to executing purchase decisions on voice command, but with the technology improving at an accelerating rate, it is bound to make online commerce frictionless for us, maybe even invisible, in the time to come.

Imagine a future where you ask your AI-fuelled digital assistant to buy a product. In the blink of an eye, the AI weighs the claims made by every possible brand of that product and makes the purchase. Now throw the Internet of Things (IoT) into the mix. If every part of your home is ‘smart’, then your assistant can potentially monitor your consumption of items and order supplies before you even realize that a purchase needs to be made. Now here comes the funny part… if your machines are talking to each other and making purchase decisions on your behalf, what will happen to good old advertising?

As we look towards a future where conversational AI will become ubiquitous, we can’t help but think of Jarvis from Iron Man. “Sir, I have indeed been uploaded, and I am ready to assist you in every way possible.” A promise made by Jarvis to Tony Stark, and a promise that the technology giants of today are working to fulfil. So, let’s brace for the jump, embrace the power of convergence, and get ready for a future where machines will not only talk to us but also make purchase decisions on our behalf. As for advertising, well, it might just have to find a new way to reach us.