Our Python Deep Learning Projects team has been doing good work, and our (hypothetical) business use case has expanded! In the last project, we were asked to classify handwritten digits accurately in order to assemble the phone numbers used to notify a restaurant chain's patrons that their tables were ready. What we learned afterward was that the notification the restaurant sent out was friendly and well received. The restaurant was actually getting texts back!
The notification text was: "We're excited that you're here and your table is ready! See the greeter, and we'll seat you and your party now. :) "
The response texts were varied and usually short, but the greeter and the restaurant management took notice and started thinking that they could use this simple system to gather feedback on the dining experience. That feedback would provide much-needed business intelligence on how the food tasted, how the service was delivered, and the overall quality of the experience.
In this chapter, we introduce the foundational knowledge about deep learning for computational linguistics.
We present the role of dense vector representations of words in various computational linguistic tasks, and show how to construct them from an unlabelled monolingual corpus.
Then we'll present the role of language models in tasks such as text classification, and show how to build them from that same unlabelled corpus using convolutional neural networks (CNNs), exploring the CNN architecture for language modeling along the way.
There are several ways to produce word embeddings, such as one-hot encoding, GloVe, and word2vec, and each of them has its own pros and cons. Our current favorite is word2vec, because it has proven to be one of the most efficient approaches for learning high-quality word features.
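To make the contrast concrete, here is a minimal sketch of one-hot encoding versus a dense representation. The tiny vocabulary and the three-dimensional "embedding" values are hypothetical, chosen purely for illustration; real word2vec vectors are learned from data and have hundreds of dimensions:

```python
# A minimal sketch contrasting sparse one-hot vectors with dense
# embeddings. The vocabulary and embedding values are hypothetical.

vocab = ["food", "meal", "service", "greeter"]

def one_hot(word, vocab):
    """Return a sparse one-hot vector: all zeros except a single 1."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

# One-hot vectors are as long as the vocabulary, and every pair of
# distinct words is equally far apart -- no notion of similarity.
print(one_hot("food", vocab))  # [1, 0, 0, 0]
print(one_hot("meal", vocab))  # [0, 1, 0, 0]

# A dense embedding (the kind word2vec learns) packs each word into a
# short real-valued vector, where related words end up close together.
dense = {
    "food":    [0.9, 0.1, 0.3],
    "meal":    [0.8, 0.2, 0.4],   # close to "food"
    "greeter": [0.1, 0.9, 0.7],   # far from both
}
```

Notice that the one-hot vector grows with the vocabulary, while the dense vector's size is a fixed design choice, which is part of why dense representations scale so much better.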
If you have ever worked on a use case in which the input data is text, you know it can be a really messy affair: you have to teach a computer about the irregularities of human language, its many ambiguities, and the hierarchical, sparse nature of its grammar. This is the kind of problem that word vectors help with: they reduce ambiguity by mapping words with similar meanings to similar representations.
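"Similar representations" can be measured directly with cosine similarity. The following sketch uses the same hypothetical three-dimensional vectors as before; the point is only that related concepts score higher than unrelated ones:

```python
import math

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, near 0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-d embeddings; real word2vec vectors have 100+ dimensions.
food    = [0.9, 0.1, 0.3]
meal    = [0.8, 0.2, 0.4]
greeter = [0.1, 0.9, 0.7]

print(cosine(food, meal))     # high: related concepts
print(cosine(food, greeter))  # lower: unrelated concepts
```

With one-hot vectors, the cosine similarity between any two distinct words is exactly zero, which is precisely the limitation dense embeddings overcome.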
In this chapter, we will learn how to build word2vec models and analyze the characteristics they learn from the provided corpus. We will also learn how to build a language model that uses a CNN with trained word vectors.