When you are building a chatbot, or working on any other NLP application, many challenges come up: extracting text from images, converting speech to text or text to speech, translating between languages, and so on.
Let's take a sneak peek at each of these tasks and see how easily they can be done.
OCR-Optical Character Recognition
We often need to extract text from pictures, and the easyocr library does this job easily for you.
This library supports a wide range of languages, including English, Hindi, French, Chinese, Kannada, Malayalam, and many more.
It uses three main components: a ResNet-based feature extractor (a transfer-learning model), sequence labeling with an LSTM, and CTC decoding. …
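As a quick sketch, reading text from an image with easyocr looks like this (the image path below is a placeholder, and the first call downloads the model weights):

```python
# pip install easyocr
import easyocr

# Load the detection + recognition models for English and Hindi.
reader = easyocr.Reader(["en", "hi"])

# readtext returns a list of (bounding_box, text, confidence) tuples.
results = reader.readtext("sample_image.png")  # placeholder path

for box, text, confidence in results:
    print(f"{text!r} (confidence {confidence:.2f})")
```

The language codes follow easyocr's conventions; passing several at once lets one reader handle mixed-language images.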
There is an old saying: “Accuracy builds credibility.” - Jim Rohn
However, accuracy in machine learning may mean a totally different thing and we may have to use different methods to validate a model.
When we develop a classification model, we need to measure how good its predictions are. To evaluate it, we first need to define what we mean by a good prediction.
There are metrics that measure how accurately the model predicts each class, and that help us improve it.
Let us look at a confusion matrix, which counts the rightly predicted happy and sad examples, as well as the wrongly predicted happy and sad ones. …
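To make the happy/sad example concrete, here is a small sketch in plain Python (the labels and predictions are made up purely for illustration):

```python
# Hypothetical outputs of a happy/sad classifier (illustrative only).
y_true = ["happy", "happy", "sad", "sad", "happy", "sad"]
y_pred = ["happy", "sad",   "sad", "sad", "happy", "happy"]

# Build the 2x2 confusion matrix by counting (actual, predicted) pairs.
labels = ["happy", "sad"]
matrix = {(a, p): 0 for a in labels for p in labels}
for a, p in zip(y_true, y_pred):
    matrix[(a, p)] += 1

tp = matrix[("happy", "happy")]  # rightly predicted happy
tn = matrix[("sad", "sad")]      # rightly predicted sad
fp = matrix[("sad", "happy")]    # sad wrongly predicted as happy
fn = matrix[("happy", "sad")]    # happy wrongly predicted as sad

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(accuracy, precision, recall)
```

Precision and recall read straight off the matrix, which is exactly why the confusion matrix is the starting point for evaluating a classifier.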
In Natural Language Processing (NLP), it is important to make sense not only of the words themselves but of their context too.
N-grams are one way to capture that context, so as to better understand the meaning of the words written or spoken.
For example, “I need to book a ticket to Australia.” versus “I want to read a book of Shakespeare.”: in the first sentence the words around “book” mark it as a verb, in the second as a noun.
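A short sketch of extracting n-grams in plain Python shows how the bigram around “book” differs between the two sentences:

```python
def ngrams(tokens, n):
    """Return the list of consecutive n-token windows."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

s1 = "I need to book a ticket to Australia".lower().split()
s2 = "I want to read a book of Shakespeare".lower().split()

# In s1 "book" follows "to" (verb use); in s2 it follows "a" (noun use).
print(("to", "book") in ngrams(s1, 2))  # True
print(("a", "book") in ngrams(s2, 2))   # True
```

A model that sees the bigram, not just the word, can tell these two uses of “book” apart.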
If you have a stack of websites or a stack of PDF files and you want answers to your questions from them, it looks like a hell of a task.
What if we could do it in a few lines of code using BERT?
BERT is a pre-trained transformer-based model. Here we will be using bert-squad1.1, a model pre-trained on SQuAD (the Stanford Question Answering Dataset), a standard benchmark for question-answering models. It consists of more than 100,000 questions based on Wikipedia snippets, each annotated with the corresponding answer text span, i.e. …
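As a sketch of the idea, a SQuAD-fine-tuned BERT model can answer questions over a passage through the Hugging Face transformers pipeline. The checkpoint name below is one publicly available SQuAD model and is an assumption for illustration, not necessarily the bert-squad1.1 build mentioned above:

```python
from transformers import pipeline

# Any SQuAD-fine-tuned checkpoint works here; this one is an assumption.
qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

context = ("BERT is a pre-trained transformer-based model. SQuAD consists of "
           "more than 100,000 questions based on Wikipedia snippets.")

result = qa(question="How many questions does SQuAD contain?",
            context=context)
print(result["answer"])  # the predicted text span from the context
```

The model does extractive QA: it does not generate an answer but selects the span of the context most likely to contain it, which is exactly what the SQuAD annotation trains it to do.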
In the real world, response time for a chatbot matters a lot. Be it the travel industry, banking, or healthcare, if you really want to help your customers, the response time should be short, comparable to talking to a customer-care representative.
Besides response time, it is also important to understand the main purpose of the chatbot: different industries cannot use the same chatbot, because they serve different purposes and reply from different corpora.
While transformers are good at producing a suitable reply, they may take time to respond. Where speed is the concern, various other methodologies can be applied, including rule-based systems that return an appropriate reply for the question asked. …
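For the rule-based route, a minimal sketch in plain Python could look like this (the patterns and replies are made-up examples, not from any real product):

```python
import re

# A tiny rule-based responder: fast and predictable, at the cost of coverage.
RULES = [
    (re.compile(r"\b(hi|hello)\b", re.I),
     "Hello! How can I help you today?"),
    (re.compile(r"\brefund\b", re.I),
     "I can help with refunds. Please share your order ID."),
    (re.compile(r"\bbook(ing)?\b", re.I),
     "Sure, where would you like to travel?"),
]

def reply(message: str) -> str:
    """Return the first matching canned reply, or a fallback."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return "Sorry, could you rephrase that?"

print(reply("Hi there!"))
```

Matching a handful of regexes takes microseconds, which is why rule-based systems still win where latency matters more than open-ended coverage.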
Language processing has come a long way: starting from Bag of Words, to Recurrent Neural Networks, to Long Short-Term Memory networks, it has improved by overcoming the problems each approach had. Bag of Words is a kind of sparse representation: if we have a vocabulary of 10 million words, each word is represented by a vector that is mostly zeros, with a single one at the word's index. RNNs were good at handling a sequence of words, but they suffered from the vanishing and exploding gradient problems: they were very good at keeping sequence information, but not over long spans. …
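A tiny Bag-of-Words sketch makes the sparsity visible (toy documents, purely for illustration):

```python
docs = ["the cat sat on the mat", "the dog sat"]

# Vocabulary: one index per unique word across all documents.
vocab = sorted({word for doc in docs for word in doc.split()})

def bow_vector(doc):
    """Count each vocabulary word's occurrences; most entries stay 0."""
    words = doc.split()
    return [words.count(word) for word in vocab]

print(vocab)                    # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(bow_vector("the cat sat"))  # [1, 0, 0, 0, 1, 1]
```

With six words the zeros look harmless; with a 10-million-word vocabulary, the same scheme makes almost every entry zero, which is the sparsity problem described above.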
In my last blog we discussed the shortcomings of RNNs: the vanishing gradient problem, which stops them from learning longer sequences and leaves them with only short-term memory.
LSTMs and GRUs were designed as a solution to this short-term memory. Now let's see how they work.
They have a mechanism of gates that manages the flow of information.
These gates decide which parts of a sequence to keep and which to throw away, storing the relevant information and forgetting what is not required.
Almost all state-of-the-art RNN results are achieved with LSTMs or GRUs.
These models are used in speech recognition, speech synthesis, and text generation, and even to generate relevant captions for an image or video. …
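The gating mechanism above can be sketched as a single LSTM time step in NumPy (a minimal sketch with toy sizes, not a full trainable implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step: gates control what is kept, added, and exposed."""
    z = x @ W + h_prev @ U + b                    # all four gate pre-activations
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # candidate cell update
    c = f * c_prev + i * g                        # forget old info, store new
    h = o * np.tanh(c)                            # hidden state exposed outward
    return h, c

# Toy sizes: input dim 3, hidden dim 4 (four gates -> 16 pre-activations).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))
U = rng.normal(size=(4, 16))
b = np.zeros(16)

h, c = lstm_step(np.ones(3), np.zeros(4), np.zeros(4), W, U, b)
print(h.shape, c.shape)  # (4,) (4,)
```

The forget gate `f` multiplying `c_prev` is the "decide what to throw away" step; the input gate `i` times the candidate `g` is the "decide what to store" step.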
There is a quote that says: “A sequence works in a way, a collection never can.” - George Murray
In my last blog about NLP I covered Bag of Words, tokenization, TF-IDF, and Word2Vec; all of these share a problem: they don't store any information about word order.
By word order I mean the sequence of the words, i.e. which word was spoken before or after another.
This information is important to keep in language processing if we are to interpret text the right way.
For example, if I say “You are beautiful but not intelligent” and we are not able to keep the word order, the sentence may mean something quite different as a mere collection of words. …
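The point is easy to demonstrate: as bags of words, the sentence and its rearrangement are indistinguishable, even though they say opposite things.

```python
from collections import Counter

s1 = "you are beautiful but not intelligent"
s2 = "you are intelligent but not beautiful"

# Identical word counts...
print(Counter(s1.split()) == Counter(s2.split()))  # True
# ...yet the order, and hence the meaning, differs.
print(s1 == s2)  # False
```

Any representation built purely from word counts maps both sentences to the same vector, which is exactly the information loss described above.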
To say things briefly and meaningfully is the best strategy for communication, and in language processing achieving this is not an easy task.
We have already learnt about Word2Vec, Bag of Words, lemmatization, and stemming in my last blog on NLP.
Here we will try to understand what word embeddings are, and we will also implement them in Python using Keras.
When we have a large vocabulary, say 10,000 words, one-hot encoding each word results in a very high-dimensional, sparse dataset, i.e. lots of zeros and only a few indices set to one. …
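Before the Keras version, a NumPy sketch shows what an embedding layer actually does: it is a learned lookup table that replaces the huge sparse one-hot vector with a small dense one (toy sizes and a random table, for illustration only):

```python
import numpy as np

vocab_size = 10      # stand-in for a 10,000-word vocabulary
embedding_dim = 4    # each word becomes a short dense vector

# One-hot: a sparse vector with a single 1 at the word's index.
word_index = 3
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1.0

# An embedding layer is a vocab_size x embedding_dim table of learned weights.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))

# Multiplying the one-hot vector by the table selects the word's dense row,
# which is why frameworks implement embeddings as a plain index lookup.
dense = one_hot @ embedding_matrix
print(dense.shape)  # (4,)
```

In Keras the same idea is `Embedding(input_dim=vocab_size, output_dim=embedding_dim)`, with the table entries learned during training instead of drawn at random.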
Weights are responsible for the connections between units in a neural network; they can be initialized randomly and are then updated during backpropagation to reduce the loss.
A few important things to keep in mind before initializing weights:
1) Weights should be small, but not too small, or we get the vanishing gradient problem (gradients shrink towards 0), and the network takes forever to converge to the global minimum.
2) Weights can't be too large either, or we get the exploding gradient problem (the model's weights blow up towards infinity), which leaves a huge space to search for the global minimum, so convergence becomes slow. …
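A quick numerical sketch (with toy layer sizes) shows why the scale matters: unit-variance weights blow up the activations as the layer gets wide, while Xavier/Glorot-style scaling keeps them near unit magnitude:

```python
import numpy as np

rng = np.random.default_rng(42)
fan_in, fan_out = 256, 128
x = rng.normal(size=(1, fan_in))  # a unit-variance input

# Naive large weights: output std grows like sqrt(fan_in) (exploding regime).
w_large = rng.normal(0.0, 1.0, size=(fan_in, fan_out))

# Xavier/Glorot-style scaling: variance 1/fan_in keeps the signal stable.
w_xavier = rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(fan_in, fan_out))

print(float(np.std(x @ w_large)))   # on the order of sqrt(256) = 16
print(float(np.std(x @ w_xavier)))  # stays near 1
```

Stacking many layers multiplies these factors, so without variance scaling the activations (and their gradients) either explode or vanish exponentially with depth.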