Natural Language Processing – how does it work?

Since 2016, chatbots have become increasingly common on apps and websites, whether commercial or service-oriented, and even on intranets. They come with increasingly advanced features: personalised responses, context awareness, emotion detection, animated avatars, etc. These features help make the conversation as rounded and natural as possible and aim to deliver the best possible user experience. Yet however much work goes into the bot’s personality, design and knowledge, user satisfaction depends primarily on the bot’s capacity to understand and process natural language.

The very essence of natural language processing is to understand the user’s intent in order to provide them with the most appropriate response to their request. In this article, we will explain how this technology works.

What is an NLP algorithm?

A chatbot built on NLP (Natural Language Processing) software has three components:

  • A visual interface (dialogue box or chatbox) where the user can interact directly with the chatbot
  • A natural language processing (NLP) engine that uses an algorithm to understand the user’s request (otherwise known as an intent)
  • An admin console (in dydu’s case) that lets you manage all the responses provided to the user and monitor the chatbot on a daily basis via performance indicators

An algorithm is a set of operating rules whose application solves a problem. In general, computers are skilled at performing repetitive tasks without ever getting bored. An algorithm can therefore be expressed as calculations and computer methods, which the computer then repeats as often as necessary. The NLP algorithm is a computer programme that has been taught to identify a user’s intent through a series of predefined examples.

How does it work?

To understand the user’s intent, the algorithm analyses the sentence’s linguistic structure, by dividing it up into words or compound words.

To understand the meaning of a sentence, each word in it must be interpreted. However, homonyms and polysemous words make this interpretation particularly complicated.

The NLP algorithm then associates each word with a set of meanings, and each meaning is assigned a “penalty” according to its probability of occurrence. The weight of a word therefore depends on how frequently it is used in the given language.
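
As a rough illustration of this idea, the sketch below attaches a penalty to each meaning of a word. It is not dydu’s implementation: the SENSES table, its probabilities, the function names and the choice of penalty = -log(probability) are all assumptions made for the example.

```python
import math

# Hypothetical sense inventory; the probabilities are invented for illustration.
SENSES = {
    "bank": {"financial institution": 0.8, "river edge": 0.2},
}

def sense_penalties(word):
    """Return {meaning: penalty} for a word.

    Assumption: penalty = -log(probability), so rarer meanings cost more.
    """
    return {meaning: -math.log(p) for meaning, p in SENSES.get(word, {}).items()}

def most_likely_sense(word):
    """Return the meaning with the lowest penalty, or None for unknown words."""
    penalties = sense_penalties(word)
    return min(penalties, key=penalties.get) if penalties else None
```

With this table, the frequent meaning of “bank” receives the lowest penalty and is preferred unless the rest of the sentence points elsewhere.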

To interpret a user’s intent, there are several necessary steps:

  • Correct spelling

Users frequently make spelling mistakes, particularly on the internet. The algorithm suggests several possible corrections, and each correction is associated with a score.
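
A minimal sketch of scored corrections, using the similarity ratio from Python’s standard difflib module. The vocabulary, the min_score threshold and the function name are hypothetical, and a production engine would likely use a more specialised method.

```python
from difflib import SequenceMatcher

# Hypothetical vocabulary of known words.
VOCABULARY = ["password", "account", "invoice", "delivery"]

def suggest_corrections(word, vocabulary=VOCABULARY, min_score=0.6):
    """Return (candidate, score) pairs, best first.

    The score is difflib's similarity ratio between 0.0 and 1.0;
    candidates scoring below min_score are dropped.
    """
    word = word.lower()
    scored = [(c, SequenceMatcher(None, word, c).ratio()) for c in vocabulary]
    kept = [(c, s) for c, s in scored if s >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

For instance, suggest_corrections("pasword") ranks “password” first, while a word with no plausible match yields an empty list.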

  • Identify lemmas

For each word, the algorithm searches for the different possible lemmas. A lemma is the base form of a word without any inflectional endings, such as an infinitive verb or a singular noun. Common abbreviations can also be linked to their lemmas, for example: “asap → as soon as possible”.
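
The lookup can be pictured as follows. This is a sketch under stated assumptions: the LEMMAS and ABBREVIATIONS tables are tiny hand-written examples, whereas a real engine would rely on a full lexicon.

```python
# Hypothetical lemma table: one surface form may have several possible lemmas.
LEMMAS = {
    "orders": ["order"],
    "ordered": ["order"],
    "left": ["leave", "left"],  # past tense of "leave", or the direction
}

# Abbreviation links, as in the "asap" example above.
ABBREVIATIONS = {"asap": "as soon as possible"}

def candidate_lemmas(word):
    """Return the possible lemmas for a word (the word itself if unknown)."""
    word = word.lower()
    if word in ABBREVIATIONS:
        return ABBREVIATIONS[word].split()
    return LEMMAS.get(word, [word])
```

An ambiguous form such as “left” keeps both candidates, leaving the later scoring steps to decide between them.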

  • Identify synonyms and hypernyms

A hypernym is a generalisation of meaning. Hypernyms are mainly used to define a set of products or terms specific to the assistant’s business logic. For example, “dog” and “cat” are not synonyms but “animal” is a hypernym of these two terms.

Synonyms of lemmas as well as hypernyms are identified and associated with the corresponding words. This reduces the number of utterances needed and improves the algorithm’s understanding.
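
To show why this reduces the number of utterances needed, here is a small sketch in which each lemma is expanded with its synonym and hypernym. The two tables and the expand function are hypothetical examples, not part of any real knowledge base.

```python
# Hypothetical tables mapping a lemma to a canonical synonym or a hypernym.
SYNONYMS = {"purchase": "buy", "acquire": "buy"}
HYPERNYMS = {"dog": "animal", "cat": "animal"}

def expand(lemma):
    """Return the lemma together with its synonym and hypernym, if any."""
    terms = {lemma}
    if lemma in SYNONYMS:
        terms.add(SYNONYMS[lemma])
    if lemma in HYPERNYMS:
        terms.add(HYPERNYMS[lemma])
    return terms
```

Because expand("dog") and expand("cat") both contain “animal”, a single reference phrase written with “animal” can match questions about either.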

  • Calculate the distance between words (distance measuring)

The algorithm directly calculates the distance between the user’s question and the reference phrases that already exist in the knowledge base. The closer the user’s question is to a reference (same words or lemmas, whether in the same order or not), the higher the score.
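
One simple way to picture such a score is word overlap. The sketch below uses Jaccard similarity (shared words divided by all distinct words), which ignores word order. This is a stand-in chosen for illustration, not the measure dydu uses, and the function names are assumptions.

```python
def similarity(question, reference):
    """Jaccard similarity between the word sets of two sentences (0.0 to 1.0)."""
    q = set(question.lower().split())
    r = set(reference.lower().split())
    if not q and not r:
        return 1.0  # two empty sentences are treated as identical
    return len(q & r) / len(q | r)

def best_reference(question, references):
    """Return the reference phrase closest to the question, or None if there are none."""
    return max(references, key=lambda ref: similarity(question, ref), default=None)
```

In practice the comparison would run on corrected, lemmatised and expanded words rather than on raw text, so that “orders” and “ordered” count as the same term.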

  • Obtain a final score

A user’s question is matched to a reference in the base using what is called a matching score. In dydu’s solution, this score ranges from 0 to 1024. A score of 0 means that the two sentences have nothing in common; 1024 means that they are identical.

The chatbot can provide three possible responses, depending on the matching score:

  • between 800 and 1024, the bot considers that the user’s intent has been understood and immediately provides the response associated with the reference question in the base
  • between 400 and 800, the bot suggests one or more rewordings to confirm the user’s initial question has been correctly understood
  • between 0 and 400, the bot informs the user that the request has not been understood and asks for them to reword it. The bot can also suggest moving over to another communication channel (after several unsuccessful attempts to understand)
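
The three bands above can be sketched as a small routing function. The action names are hypothetical, and because the text lists 400 and 800 in two bands each, the sketch assumes that a boundary value belongs to the higher band.

```python
def choose_action(score):
    """Map a matching score (0 to 1024) to one of three bot behaviours.

    Assumption: the boundary values 400 and 800 fall into the higher band.
    """
    if not 0 <= score <= 1024:
        raise ValueError("score must be between 0 and 1024")
    if score >= 800:
        return "answer"              # intent understood: reply directly
    if score >= 400:
        return "suggest_rewordings"  # ask the user to confirm a rewording
    return "ask_to_reword"           # not understood: ask the user to rephrase
```

A fallback to another communication channel after several unsuccessful attempts would sit outside this function, for example by counting consecutive "ask_to_reword" results.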

What advantages does this technology offer?

dydu’s technology relies on distance calculation, but there are other methods for understanding natural language, such as syntactic analysis and keyword matching:

  • Syntactic analysis focuses on sentence structure and is dependent on the language in which the utterance is expressed (for example, subject-verb-complement in French). This method gives a good understanding of the sentence and avoids errors even when the nuance is subtle, but it relies heavily on grammar, which makes it difficult to apply to everyday language.
  • The keyword matching method works in the same way as a search engine. Although the method is simple to implement, it struggles to distinguish between two sentences with very close meanings.

The advantage of the distance calculation method is that it allows a precise understanding of a sentence and does not require the user’s utterances to be grammatically correct. It does, however, require a learning period for the bot, based on the first user questions.

If the bot a user is talking to is based on a powerful algorithm and a well-constructed base of intents, the user will obtain the answers they are looking for, even if they make mistakes, digress, talk about several subjects at once, or do not provide all the necessary information.

If you would like to implement a chatbot or vocal conversational robot (voicebot or callbot), it is essential to choose a software vendor with powerful NLP technology that can also connect to other matching engines to further improve the overall understanding of user intents.

Are you interested in implementing a conversational robot for your clients or employees? Do not hesitate to ask for a demo of the dydu solution.

You can also test our solution free of charge for 14 days.