Human brains are singularly special in the animal kingdom, writes Andrew Quixley, Data Science and AI Sales Lead, IBM South Africa. We are the curious, communicative collaborators who rose from a simple foraging existence on the savannahs to build structures of incredible complexity.
Sure, other animals are curious, communicative and collaborative too. An octopus will investigate and solve problems; elephants use infrasound to communicate over vast distances; termites collaborate to build structures that are millions of times larger than any individual; but none of these feats comes close to the scale of human complexity. And the key to this complexity is our ability to generate language.
Language by the numbers
There are 7 099 living languages on Earth. If we focus on just English, its 26 letters can be combined in different ways to create the approximately 171 000 words of modern English. The average person’s vocabulary falls in a range of 20 000-35 000 words, and everyday use requires far fewer.
When we use our words to communicate, something truly remarkable happens. We decide consciously what it is we intend to convey to another person and then select the words that will serve the purpose; we sequence those words and we apply some structure to fit with the rules of grammar and syntax. What comes out of our mouths are sentences.
So how hard should it be to finish a sentence? If we consider it from the arithmetic alone, the next word has to be one of the 171 000 words. If we add the rules of English, that number will be chopped down considerably. But if we extend beyond the problem of what the next word is, and start looking several words out, the number of permutations, and with it the complexity, grows exponentially.
It’s not hard to see that, based on numbers alone, it’s very tricky to finish sentences using guesswork – or to originate sentences, which is the aim of a text generator. By the time a sentence grows in length to 16 words, the number of possible 16-word combinations exceeds the estimated number of atoms in the observable universe.
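The arithmetic behind that claim can be checked with a short sketch. It assumes the figure of 171 000 words quoted earlier, treats every position in the sentence as a free choice (ignoring grammar entirely), and uses 10^80 as a commonly quoted order-of-magnitude estimate for the atoms in the observable universe. The function names and constants are illustrative assumptions, not part of any published method.

```python
# Rough check: how long must a word sequence be before the number of
# possible sequences exceeds the estimated atoms in the observable universe?

VOCABULARY_SIZE = 171_000      # approximate count of modern English words
ATOMS_IN_UNIVERSE = 10 ** 80   # assumed order-of-magnitude estimate


def sequences_of_length(n: int, vocabulary: int = VOCABULARY_SIZE) -> int:
    """Count ordered word sequences of length n, with repetition allowed."""
    return vocabulary ** n


def shortest_length_exceeding(limit: int, vocabulary: int = VOCABULARY_SIZE) -> int:
    """Return the smallest n for which vocabulary ** n is greater than limit."""
    n = 1
    while sequences_of_length(n, vocabulary) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    n = shortest_length_exceeding(ATOMS_IN_UNIVERSE)
    print(f"Sequences first exceed 10^80 at {n} words")
    print(f"171 000 ** 16 has {len(str(sequences_of_length(16)))} digits")
```

Under these assumptions the loop stops at 16 words. Grammar rules out most of those sequences, so the count is an upper bound on blind guesswork rather than an estimate of meaningful sentences.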
Can machines talk?
Machine-origination of text does happen in natural language generation engines, but the applications are generally limited to turning business intelligence dashboards into narrative summaries, or weather data into forecasts. The language is constrained within tightly prescribed boundaries – which is not what happens when two friends get together over a cup of coffee.
Similarly, almost every public appearance of a talking robot so far has been a pre-defined conversation in which the words were scripted by a human. When we hear such a robot speak, we are not “knowing the mind”, the thoughts or the consciousness of the robot; we are hearing the words written by its human handlers.
IBM Project Debater
An earlier IBM grand challenge was to develop a machine that could win the Jeopardy! game show, which its artificial intelligence, Watson, accomplished in 2011. Since then IBM Research has been working on a new challenge – to create a machine that could debate with a human under standard competitive debating conditions.
The “machine” is called Project Debater and its abilities were revealed at IBM Think in February 2019, when it took on Harish Natarajan, a world champion debater. In this format, each competitor is assigned a motion and a side, for or against, to argue. The audience votes for or against the motion before the debate, to establish a baseline. The competitors have a few minutes to prepare, then take turns to deliver their opening arguments. They then prepare rebuttals and again take turns to deliver them. At the end, the audience is polled once more and the competitor who did most to change the initial opinions is deemed the winner.
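The scoring rule described above can be sketched in a few lines. This is only an illustration of "who shifted opinion most": the `Poll` class, the `swing_winner` function and the poll figures are hypothetical, and the real event may have tallied votes differently.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    """Share of the audience on each side of the motion, in percent."""
    in_favour: float
    against: float


def swing_winner(before: Poll, after: Poll) -> str:
    """Name the side that gained more support between the two polls."""
    gain_for = after.in_favour - before.in_favour
    gain_against = after.against - before.against
    if gain_for > gain_against:
        return "for"
    if gain_against > gain_for:
        return "against"
    return "tie"


if __name__ == "__main__":
    # Hypothetical poll figures, chosen only to illustrate the rule.
    before = Poll(in_favour=60.0, against=30.0)
    after = Poll(in_favour=55.0, against=40.0)
    print(swing_winner(before, after))
```

With these made-up figures the side arguing against the motion wins, even though most of the audience still supports the motion. The rule rewards the change in opinion, not the final majority.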
To accomplish the task, Project Debater has to be able to understand natural language inputs (written and spoken), recognise the meaning of the argument it needs to make, research through millions of documents to identify statements of fact that support the argument, parse or paraphrase these, sequence them logically and then add back the necessary building blocks of language (such as conjunctions) to create coherent sentences that obey the laws of the language.
Then Project Debater must repeat the process with the added complexity of trying to rebut the argument of the human competitor.
Where to next?
It’s an immensely impressive feat of AI engineering to get to this point. Yet even this debate is nowhere near as complex as a free-form conversation with a human. Getting to that level will be one of the key milestones on the path to artificial general intelligence.
However human-level natural language generation comes to be applied in the world, it’s sure to raise the bar in terms of the human-machine interface, making any so-equipped machine seem far more human, and correspondingly more pleasing and engaging to deal with.
Machines conversing lucidly with humans is what fiction writers and movie directors have envisioned for decades. Until that day arrives, perhaps it’s enough to recognise how immensely complicated it is to make conversation with another human, to enjoy the ability and to take every opportunity to do so.