Text summarization in NLP means condensing a long text into a limited number of words while still conveying its most important message.
There are many strategies for shortening a large text while keeping the most important information. One of them is to calculate word frequencies and then normalize them by dividing by the maximum frequency.
After that, we score the sentences by these frequencies and pick the highest-scoring sentences to convey the message.
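The steps above can be sketched in plain Python (a minimal version: real pipelines would remove stop words before counting):

```python
import re
from collections import Counter

def summarize(text, num_sentences=2):
    """Pick the sentences with the highest normalized word-frequency scores."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    words = re.findall(r'[a-z]+', text.lower())
    freq = Counter(words)
    # Normalize word frequencies by the maximum frequency
    max_freq = max(freq.values())
    norm = {w: f / max_freq for w, f in freq.items()}
    # Score each sentence by the sum of its words' normalized frequencies
    scores = {s: sum(norm.get(w, 0) for w in re.findall(r'[a-z]+', s.lower()))
              for s in sentences}
    top = sorted(sentences, key=scores.get, reverse=True)[:num_sentences]
    # Return the chosen sentences in their original order
    return ' '.join(s for s in sentences if s in top)
```

Scoring whole sentences by the normalized frequencies of their words is exactly the strategy described above; the summary keeps the sentences that carry the most frequently mentioned content.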
Why do we need automatic summarization?
Time optimization, as it takes less time to get the gist of the text in…
“With public sentiment, nothing can fail; without it, nothing can succeed.” - Abraham Lincoln
Sentiment analysis is a field in which we use natural language processing to analyze text and identify the emotion expressed toward a service, product, or entity. Sentiment analysis, or opinion mining, is used to determine whether a piece of text is positive, negative, or neutral.
Types of Sentiment Analysis
Subjective or objective classification:
Objective text or speech relates to facts, while subjective text or speech expresses feelings; the two are opposites.
Objective: “The weather is sunny.”
Subjective: “I love sunny weather!”
In this type of sentiment analysis, we…
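A toy subjective-vs-objective check can be sketched with a small opinion-word lexicon (the word list below is a hypothetical sample; real systems use large annotated lexicons or trained classifiers):

```python
# Hypothetical mini-lexicon of opinion-bearing words
OPINION_WORDS = {"love", "hate", "great", "awful", "beautiful", "boring"}

def is_subjective(sentence):
    """Flag a sentence as subjective if it contains any opinion word."""
    tokens = {w.strip('.,!?').lower() for w in sentence.split()}
    return bool(tokens & OPINION_WORDS)
```

With this sketch, “I love sunny weather!” is flagged as subjective because of “love”, while “The weather is sunny.” contains no opinion word and is treated as objective.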
Sentiment analysis is a field that has a lot of scope, with applications in recommendation systems.
Be it movie reviews, the stock market, products, or groups, sentiment plays a huge role in analyzing the trend and future of a product or service.
Different fields or domains may have totally different rules for analyzing sentiment. For example, in reviews the polarity of the words decides the sentiment, while for the stock market a different set of rules, such as a day's hike or dip in a share price, decides the sentiment of the news.
Here, I will take an example of the…
In one of my last blogs, I tried to explain text generation through a simple Maximum Likelihood model.
In this blog, I will try to explain how we can do the same through a Bidirectional LSTM model.
In one of my blogs on RNNs, we talked about all types of RNNs, but they had a shortcoming, i.e., they depend on context only from the past.
Bidirectional LSTMs can be trained on two sides of the input sequence instead of one: first from left to right on the input sequence, and second in the reversed order of the input sequence…
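The core idea, reading the sequence in both directions and pairing the two views at each position, can be illustrated with a toy scan in plain Python (the "state" here is just the list of tokens seen so far, a stand-in for a real LSTM hidden state):

```python
def bidirectional_context(tokens):
    """For each position, pair the left-to-right context with the right-to-left one."""
    forward = []   # states from reading left to right
    backward = []  # states from reading right to left
    left = []
    for t in tokens:
        left = left + [t]
        forward.append(left)
    right = []
    for t in reversed(tokens):
        right = [t] + right
        backward.append(right)
    backward.reverse()  # align backward states with the original positions
    return list(zip(forward, backward))
```

At each position the model then sees both everything before the word and everything after it, which is exactly what a plain (unidirectional) RNN lacks.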
When you are creating a chatbot, or in any other NLP application, many challenges come up, like extracting text from images, converting speech to text or text to speech, language translation, etc.
Let's have a sneak-peek into each of such tasks and how these can be easily done.
OCR-Optical Character Recognition
We often need to extract text from pictures; the easyocr library does this job easily for you.
This library supports various languages, including English, Hindi, French, Chinese, Kannada, Malayalam, and more.
It uses 3 main components: a ResNet, a transfer learning model…
There is an old saying: “Accuracy builds credibility” (Jim Rohn).
However, accuracy in machine learning may mean something totally different, and we may have to use different methods to validate a model.
When we develop a classification model, we need to measure how good it is at predicting. For its evaluation, we need to know what we mean by good predictions.
There are metrics that measure and evaluate a model on how accurately it predicts the actual class, and that also help improve it.
Let us look at a confusion matrix, which records the numbers of correctly predicted happy, sad, and…
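For a three-class case (happy, sad, neutral are assumed labels and the counts below are made up for illustration), accuracy, per-class precision, and per-class recall all fall straight out of the confusion matrix:

```python
# Rows = actual class, columns = predicted class (hypothetical counts)
labels = ["happy", "sad", "neutral"]
cm = [
    [50,  5,  5],   # actual happy
    [ 4, 40,  6],   # actual sad
    [ 6,  4, 30],   # actual neutral
]

total = sum(sum(row) for row in cm)
# Accuracy: correct predictions (the diagonal) over all predictions
accuracy = sum(cm[i][i] for i in range(3)) / total

def precision(i):
    """Of everything predicted as class i, how much was actually class i?"""
    predicted_i = sum(cm[r][i] for r in range(3))
    return cm[i][i] / predicted_i

def recall(i):
    """Of everything actually in class i, how much did we catch?"""
    return cm[i][i] / sum(cm[i])

for i, name in enumerate(labels):
    print(f"{name}: precision={precision(i):.2f}, recall={recall(i):.2f}")
print(f"accuracy={accuracy:.2f}")
```

With these counts the model gets 120 of 150 examples right (accuracy 0.80), and, for instance, catches 40 of the 50 actually-sad examples (recall 0.80 for "sad").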
In Natural Language Processing (NLP), it is important to make sense not only of words but of context too.
N-grams are one way to understand language in terms of context, and thus better understand the meaning of words written or spoken.
For example, “I need to book a ticket to Australia.” versus “I want to read a book of Shakespeare.”:
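The two “book” sentences above can be told apart by looking at their bigrams (word pairs); a minimal n-gram extractor:

```python
def ngrams(text, n=2):
    """Return the list of n-grams (as tuples) from a whitespace-tokenized text."""
    tokens = [w.strip('.,!?').lower() for w in text.split()]
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

In the first sentence the bigram ('book', 'a') marks “book” as a verb taking an object, while in the second the bigram ('a', 'book') marks it as a noun; a plain bag of words could never separate the two.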
If you have a stack of websites or a stack of PDF files and you want answers to your questions, it looks like a hell of a task.
What if we could do it in a few lines of code using BERT?
BERT is a pre-trained transformer-based model. Here we will be using bert-squad1.1, a model pre-trained on SQuAD (the Stanford Question Answering Dataset), which is a typical benchmark for question-answering models. It consists of more than 100,000 questions based on Wikipedia snippets, each annotated with the corresponding text span, i.e. …
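One common way to run a SQuAD-fine-tuned BERT is the Hugging Face transformers question-answering pipeline; this is a sketch, and the checkpoint name below is an assumption rather than the exact bert-squad1.1 model used in the post:

```python
from transformers import pipeline

def answer(question, context):
    """Answer a question from a context passage with a SQuAD-fine-tuned BERT."""
    # Checkpoint name is an assumption; any BERT fine-tuned on SQuAD works here
    qa = pipeline("question-answering",
                  model="bert-large-uncased-whole-word-masking-finetuned-squad")
    # The pipeline returns the answer span plus its score and character offsets
    return qa(question=question, context=context)["answer"]

# Usage:
# context = "SQuAD consists of more than 100,000 questions based on Wikipedia."
# print(answer("How many questions does SQuAD contain?", context))
```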
In the real world, response time for a chatbot matters a lot. Be it the travel industry, banks, or doctors, if you really want to help your customers, the response time should be low, similar to what it is while talking to a customer care representative.
Besides time, it is also important to understand the main motive of the chatbot; every industry cannot use the same chatbot, as they have different purposes and different corpora to reply from.
While transformers are good at producing a suitable reply, they may take time to respond. On the other…
Language processing has come a long way, starting from Bag of Words, through Recurrent Neural Networks, to Long Short-Term Memories, improving as the problems of each were overcome. Bag of Words is a kind of sparse representation: if we have a vocabulary of 10 million words, each word is represented by a sparse vector that is mostly zeroes, with a single one at the word's index. RNNs were good at handling a sequence of words, but they suffered from the vanishing and exploding gradient problems; it was very good at keeping…
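The sparse one-hot representation described above can be sketched in plain Python; in practice you store only the index of the single 1 rather than materializing millions of zeros:

```python
def one_hot_index(word, vocab):
    """Return the index of the single 1 in the word's one-hot vector."""
    return vocab.index(word)

def one_hot_dense(word, vocab):
    """Materialize the full, mostly-zero vector; wasteful for large vocabularies."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec
```

For a toy vocabulary of three words the dense form is already 2/3 zeros; at 10 million words, all but one entry is zero, which is why dense one-hot vectors do not scale.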
Data Science Professional | Technical Blogger | Artificial Intelligence | NLP | Chatbots and more