Title Page
Copyright and Credits
    Hands-On Python Natural Language Processing
About Packt
    Why subscribe?
Contributors
    About the authors
    About the reviewers
    Packt is searching for authors like you
Preface
    Who this book is for
    What this book covers
    To get the most out of this book
        Download the example code files
        Download the color images
        Conventions used
    Get in touch
        Reviews
Section 1: Introduction
Understanding the Basics of NLP
    Programming languages versus natural languages
    Understanding NLP
    Why should I learn NLP?
    Current applications of NLP
        Chatbots
        Sentiment analysis
        Machine translation
        Named-entity recognition
    Future applications of NLP
    Summary
NLP Using Python
    Technical requirements
    Understanding Python with NLP
        Python's utility in NLP
    Important Python libraries
        NLTK
            NLTK corpora
            Text processing
            Part of speech tagging
        TextBlob
            Sentiment analysis
            Machine translation
            Part of speech tagging
        VADER
    Web scraping libraries and methodology
    Overview of Jupyter Notebook
    Summary
Section 2: Natural Language Representation and Mathematics
Building Your NLP Vocabulary
    Technical requirements
    Lexicons
    Phonemes, graphemes, and morphemes
    Tokenization
        Issues with tokenization
        Different types of tokenizers
            Regular expressions
            Regular expressions-based tokenizers
            Treebank tokenizer
            TweetTokenizer
    Understanding word normalization
        Stemming
            Over-stemming and under-stemming
        Lemmatization
            WordNet lemmatizer
            spaCy lemmatizer
        Stopword removal
        Case folding
        N-grams
        Taking care of HTML tags
        How does all this fit into my NLP pipeline?
    Summary
Transforming Text into Data Structures
    Technical requirements
    Understanding vectors and matrices
        Vectors
        Matrices
    Exploring the Bag-of-Words architecture
        Understanding a basic CountVectorizer
        Out-of-the-box features offered by CountVectorizer
            Prebuilt dictionary and support for n-grams
            max_features
            min_df and max_df thresholds
        Limitations of the BoW representation
    TF-IDF vectors
        Building a basic TF-IDF vectorizer
        N-grams and maximum features in the TF-IDF vectorizer
        Limitations of the TF-IDF vectorizer's representation
    Distance/similarity calculation between document vectors
        Cosine similarity
            Solving cosine math
            Cosine similarity on vectors developed using CountVectorizer
            Cosine similarity on vectors developed using the TfIdfVectorizer tool
    One-hot vectorization
    Building a basic chatbot
    Summary
Word Embeddings and Distance Measurements for Text
    Technical requirements
    Understanding word embeddings
    Demystifying Word2vec
        Supervised and unsupervised learning
        Word2vec – supervised or unsupervised?
    Pretrained Word2vec
        Exploring the pretrained Word2vec model using gensim
    The Word2vec architecture
        The Skip-gram method
            How do you define target and context words?
        Exploring the components of a Skip-gram model
            Input vector
            Embedding matrix
            Context matrix
            Output vector
            Softmax
            Loss calculation and backpropagation
            Inference
        The CBOW method
        Computational limitations of the methods discussed and how to overcome them
            Subsampling
            Negative sampling
                How to select negative samples
    Training a Word2vec model
        Building a basic Word2vec model
        Modifying the min_count parameter
        Playing with the vector size
        Other important configurable parameters
    Limitations of Word2vec
    Applications of the Word2vec model
    Word mover's distance
    Summary
Exploring Sentence-, Document-, and Character-Level Embeddings
    Technical requirements
    Venturing into Doc2Vec
        Building a Doc2Vec model
            Changing vector size and min_count
            The dm parameter for switching between modeling approaches
            The dm_concat parameter
            The dm_mean parameter
            Window size
            Learning rate
    Exploring fastText
        Building a fastText model
        Building a spelling corrector/word suggestion module using fastText
        fastText and document distances
    Understanding Sent2Vec and the Universal Sentence Encoder
        Sent2Vec
        The Universal Sentence Encoder
    Summary
Section 3: NLP and Learning
Identifying Patterns in Text Using Machine Learning
    Technical requirements
    Introduction to ML
    Data preprocessing
        NaN values
        Label encoding and one-hot encoding
        Data standardization
            Min-max standardization
            Z-score standardization
    The Naive Bayes algorithm
        Building a sentiment analyzer using the Naive Bayes algorithm
    The SVM algorithm
        Building a sentiment analyzer using SVM
    Productionizing a trained sentiment analyzer
    Summary
From Human Neurons to Artificial Neurons for Understanding Text
    Technical requirements
    Exploring the biology behind neural networks
        Neurons
        Activation functions
            Sigmoid
            Tanh activation
            Rectified linear unit
        Layers in an ANN
    How does a neural network learn?
        How does the network get better at making predictions?
    Understanding regularization
        Dropout
    Let's talk Keras
    Building a question classifier using neural networks
    Summary
Applying Convolutions to Text
    Technical requirements
    What is a CNN?
        Understanding convolutions
        Let's pad our data
        Understanding strides in a CNN
        What is pooling?
        The fully connected layer
    Detecting sarcasm in text using CNNs
        Loading the libraries and the dataset
        Performing basic data analysis and preprocessing our data
        Loading the Word2Vec model and vectorizing our data
        Splitting our dataset into train and test sets
        Building the model
        Evaluating and saving our model
    Summary
Capturing Temporal Relationships in Text
    Technical requirements
    Baby steps toward understanding RNNs
        Forward propagation in an RNN
        Backpropagation through time in an RNN
    Vanishing and exploding gradients
    Architectural forms of RNNs
        Different flavors of RNN
        Carrying relationships both ways using bidirectional RNNs
        Going deep with RNNs
    Giving memory to our networks – LSTMs
        Understanding an LSTM cell
            Forget gate
            Input gate
            Output gate
        Backpropagation through time in LSTMs
    Building a text generator using LSTMs
    Exploring memory-based variants of the RNN architecture
        GRUs
        Stacked LSTMs
    Summary
State of the Art in NLP
    Technical requirements
    Seq2Seq modeling
        Encoders
        Decoders
        The training phase
        The inference phase
    Translating between languages using Seq2Seq modeling
    Let's pay some attention
    Transformers
        Understanding the architecture of Transformers
            Encoders
            Decoders
            Self-attention
                How does self-attention work mathematically?
                A small note on masked self-attention
            Feedforward neural networks
            Residuals and layer normalization
            Positional embeddings
            How the decoder works
            The linear layer and the softmax function
        Transformer model summary
    BERT
        The BERT architecture
        The BERT model input and output
        How did the BERT pre-training happen?
            The masked language model
            Next-sentence prediction
        BERT fine-tuning
    Summary
Other Books You May Enjoy
    Leave a review - let other readers know what you think