Question: What is a human language, in fact?
Answer: Human language is a signal: not just any environmental signal, but a deliberate communication signal that humans use to communicate with each other. Hence the task of understanding a human language can also be thought of as decoding that signal.
A human language is therefore a discrete / symbolic / categorical signalling system. It is discrete in that words are not continuous. It is symbolic in that words are symbols that stand for some meaning. It is categorical in that a set of words can represent one category, e.g. “good” and “excellent” are two different discrete words that represent the same meaning, or the same category.

Question: What complicates learning a language? What is the biggest challenge in teaching a machine to understand language?
Answer: A language has a vocabulary of millions of words, which makes it very sparse. This sparsity is a big challenge in helping a machine understand language.
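
To make the sparsity concrete, here is a minimal sketch of a one-hot encoding, assuming a tiny made-up vocabulary of five words. The function name `one_hot` and the vocabulary are illustrative assumptions, not from any particular library. Each word becomes a vector with a single 1 and zeros everywhere else, so with a vocabulary of millions of words almost every entry is zero.

```python
def one_hot(word, vocab):
    """Return a one-hot list for `word` over the ordered vocabulary `vocab`."""
    vector = [0] * len(vocab)
    vector[vocab.index(word)] = 1  # exactly one position is switched on
    return vector

# Toy vocabulary (an assumption for illustration; real vocabularies are far larger).
vocab = ["good", "excellent", "bad", "movie", "stock"]

vec = one_hot("good", vocab)
sparsity = vec.count(0) / len(vec)  # fraction of entries that are zero
print(vec, sparsity)
```

The fraction of zeros only grows as the vocabulary grows, which is what makes this representation so sparse.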

Question: Where is NLP used?
Answer: It can be used across multiple domains. Some of the most prominent examples include
1. Spell check
2. Finding company names in articles to map the network of companies
3. Price extraction from websites
4. Sentiment analysis
5. Machine translation
6. Spoken dialog systems, e.g. Alexa
7. Complex question answering

Question: Where is it used in industry?
Answer: It is used across
1. Translation
2. Sentiment analysis for marketing or finance / trading
3. Speech recognition

Question: Does it always work well?
Answer: No, it doesn’t. One example with a sentiment analysis system is “When Anne Hathaway starred in a movie that had really good reviews, the stock of Berkshire Hathaway went up”. Presumably the system picked up positive sentiment around the name “Hathaway” and attributed it to the wrong entity. It’s a simple example of when NLP doesn’t work and why it is hard.

Question: Why is NLP hard?
Answer: Humans are not fast communicators, unlike 3G or 4G links. Because we are slow communicators, we have to convey a lot in very few sentences. The listener then fills in the missing pieces using common knowledge, context, etc. This is the biggest reason why NLP is hard.
An example from a Times headline:
“The Pope’s baby steps on gays”.
You would not want to interpret it as “The (Pope’s baby) steps on gays”,
but as “The Pope’s (baby steps) on gays”.
Complex and situational representation: Text has a complex and situational representation. For example:
1. The same pronoun may refer to a different noun, depending on the verb used in the sentence, e.g.
“Jane hit June and then she [fell / ran]”
When the verb is “fell”, the pronoun “she” refers to “June”,
while when the verb is “ran”, the pronoun “she” refers to “Jane”.
2. Text ambiguity: The sentence “I made her duck” may mean different things depending on context. It may mean
a. I cooked her duck.
b. I made her duck behind the desk.
c. I made her a wooden duck.

Question: What was the old-school way of doing NLP?
Answer: The old-school way of doing NLP was with human-designed feature extraction and representation. For example, to identify nouns, we would extract features such as “First letter is capital” or “Preceding character is a full stop”.
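
As a rough sketch of what such hand-written features look like in code, the snippet below computes the two features mentioned above for each token. The function name `noun_features` and the pre-tokenised sentence are assumptions made for illustration, and the check works on the preceding token instead of the preceding character.

```python
def noun_features(tokens, i):
    """Hand-written features for the token at position i (illustrative only)."""
    token = tokens[i]
    previous = tokens[i - 1] if i > 0 else None
    return {
        "first_letter_capital": token[:1].isupper(),
        "preceded_by_full_stop": previous == ".",
    }

# A pre-tokenised toy input (an assumption; real systems need a tokeniser first).
tokens = ["I", "met", "Jane", ".", "She", "left", "."]
for i, tok in enumerate(tokens):
    print(tok, noun_features(tokens, i))
```

A capitalised token that does not follow a full stop, such as “Jane” here, is a candidate proper noun, while “She” is capitalised only because it starts a sentence. Every such rule has to be thought up and coded by a human.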


traditional ML.JPG

Other approaches, such as using a taxonomy like WordNet with hypernyms, were tried earlier. WordNet is basically a large graph that captures relationships such as “is-a” and “has”. With WordNet, every noun is defined through such relationships, e.g. a “panda” is a carnivore, placental, mammal, vertebrate, chordate, etc. For example:

old NLP meaning capture.JPG
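
The “is-a” idea can be sketched with a tiny hand-built graph. The dictionary below is a toy stand-in that reuses the panda chain from the text; it is not real WordNet data, and `HYPERNYM` and `hypernym_chain` are hypothetical names chosen for this example.

```python
# Hand-built toy "is-a" graph; the entries are illustrative, not WordNet data.
HYPERNYM = {
    "panda": "carnivore",
    "carnivore": "placental",
    "placental": "mammal",
    "mammal": "vertebrate",
    "vertebrate": "chordate",
}

def hypernym_chain(word):
    """Follow "is-a" links upwards and return every ancestor of `word`."""
    chain = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        chain.append(word)
    return chain

print(hypernym_chain("panda"))
print(hypernym_chain("selfie"))  # a word nobody has added yet has no ancestors
```

Note that a word missing from the graph simply returns an empty chain: the graph only knows what a human has entered into it.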

Question: What are its drawbacks?
Answer: Discrete representation is hard. For example, “good” has many synonyms, such as adept, expert, proficient and ninja. Furthermore, what works in English with WordNet will not work in other languages. In addition:
– It requires human labor to create wordnets.
– Language evolves, and we have new words every year.

Old-school feature extraction is also not generic enough to apply to other languages, or to cope when the rules change. For example, noun identification based on capital letters will not work for Chinese, which has no capital letters.

Question: What is the problem with discrete representation?
Answer: The same word can mean different things in different contexts. A discrete representation does not capture this at all, since each word is handled as a discrete, atomic symbol without any context.
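
A small sketch shows the problem, assuming the simplest possible discrete representation: one integer ID per word. The helper `build_word_ids` and the two sentences are made up for illustration. The word “duck” receives the same ID whether it is the bird or the action.

```python
def build_word_ids(sentences):
    """Assign each distinct word one integer ID, in order of first appearance."""
    ids = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            ids.setdefault(word, len(ids))
    return ids

# "duck" is a noun in the first sentence and a verb in the second.
sentences = ["I cooked her duck", "I made her duck behind the desk"]
word_ids = build_word_ids(sentences)
encoded = [[word_ids[w] for w in s.lower().split()] for s in sentences]

print(word_ids["duck"])
print(encoded)
```

Both occurrences of “duck” map to the same number, so nothing downstream can tell the two meanings apart from the representation alone.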

To find out how “context” is captured in NLP, see Why NLP with Deep Learning.

Question: So what can we do about it? Can we develop a model that is applicable across all languages?
Answer: Yes, we can. If we use a deep learning model that performs the feature extraction by itself, in place of the human-designed features, then we have a model that can be applied across languages.
The fundamental idea underlying NLP with deep learning (word vectorisation, or word2vec) is that “similar words occur in almost the same environments”, e.g.
“oculist and eye-doctor ... occur in almost the same environment”
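
The idea can be illustrated with plain co-occurrence counts. This is a minimal sketch under stated assumptions: a three-sentence made-up corpus, a context window of two words on each side, and cosine similarity as the comparison. It is not how word2vec is trained; it only shows that words used in the same environments end up with similar context vectors. The names `context_counts` and `cosine` are hypothetical helpers.

```python
from collections import Counter
import math

def context_counts(sentences, target, window=2):
    """Count the words appearing within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            if word == target:
                lo, hi = max(0, i - window), min(len(words), i + window + 1)
                # Add every neighbour in the window except the target itself.
                counts.update(w for j, w in enumerate(words[lo:hi], start=lo) if j != i)
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy corpus (an assumption for illustration).
corpus = [
    "the oculist examined my eyes today",
    "the eye-doctor examined my eyes today",
    "the stock price went up today",
]
oculist = context_counts(corpus, "oculist")
eye_doctor = context_counts(corpus, "eye-doctor")
stock = context_counts(corpus, "stock")

print(cosine(oculist, eye_doctor), cosine(oculist, stock))
```

In this toy corpus “oculist” and “eye-doctor” share all of their neighbours, so their similarity is much higher than that between “oculist” and “stock”, which share only “the”.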

Question: I have been hearing a lot about deep learning. How is it better than traditional ML models?
Answer: Earlier, humans developed features by hand to solve a certain set of problems. This was done at Google as well, until as recently as 2015. A group would design a feature, show with some experiments that it improved Google search results, and the code would then be added to the algorithm. This was advertised as machine learning; looked at another way, however, the machine was learning nothing. It turns out that all the machine was doing was numeric optimization, while the humans were learning a lot about the problem domain.
Machine learning in the old days was about 90% humans developing the features, while the machine did the remaining 10%: numeric optimization.

Figure 1 traditional ml approaches.JPG

See NLP with Deep Learning at