When working on NLP applications, we need to construct meaningful numeric features from the text data. There are many techniques we can use to construct these features, such as count vectorization, binary vectorization, term frequency-inverse document frequency (tf-idf), and word embeddings. The following code block demonstrates how to build a tf-idf feature matrix using the keras library in R, given a tokenizer that has already been fitted and a character vector of documents, input:
# tokenizer: a fitted text tokenizer; input: a character vector of documents
tfidf_matrix <- texts_to_matrix(tokenizer, input, mode = "tfidf")
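Besides "tfidf", the mode argument also accepts "binary" (whether a word is present in the document), "count" (raw word counts), and "freq" (word counts divided by the document's length).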
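To show where the tokenizer and input arguments might come from, here is a minimal end-to-end sketch. The toy corpus docs and the vocabulary cap of 20 are made up for illustration, and the sketch assumes a version of the keras package that exports text_tokenizer() and fit_text_tokenizer(), with a working TensorFlow backend installed.

```r
library(keras)

# Hypothetical toy corpus, used only for illustration
docs <- c("the cat sat on the mat",
          "the dog sat on the log",
          "cats and dogs")

# Create a tokenizer; num_words caps the vocabulary size (assumed value)
tokenizer <- text_tokenizer(num_words = 20)

# Learn the vocabulary from the corpus (updates the tokenizer in place)
fit_text_tokenizer(tokenizer, docs)

# Build feature matrices: one row per document, one column per word index
tfidf_matrix <- texts_to_matrix(tokenizer, docs, mode = "tfidf")
count_matrix <- texts_to_matrix(tokenizer, docs, mode = "count")

# With num_words set, the number of columns equals num_words
dim(tfidf_matrix)
```

The same fitted tokenizer can be reused to transform new documents, which keeps the column layout consistent between training and test data.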