How to do it…

Let's perform the following steps to create a word cloud:

  1. Perform a literature search through PubMed and retrieve the abstract texts.
  2. Perform pre-processing using the tm library.
  3. Create a term document matrix.
  4. Create a word cloud.

The preceding steps are implemented as follows:

        # RISmed provides the NCBI EUtils interface used below;
        # pubmed.mineR offers additional PubMed text-mining helpers
        library(pubmed.mineR)
        library(RISmed)

        # Step 1: search PubMed and retrieve the abstract texts
        keyword <- "Deep Learning"
        search_query <- EUtilsSummary(keyword, retmax = 50)
        summary(search_query)
        extractedResult <- EUtilsGet(search_query)
        pmid <- PMID(extractedResult)
        years <- YearPubmed(extractedResult)
        Jtitle <- Title(extractedResult)
        articleTitle <- ArticleTitle(extractedResult)
        abstracts <- AbstractText(extractedResult)

        # Step 2: pre-process the abstracts with the tm library
        library(tm)
        AbstractCorpus <- Corpus(VectorSource(abstracts))
        AbstractCorpus <- tm_map(AbstractCorpus, content_transformer(tolower))
        AbstractCorpus <- tm_map(AbstractCorpus, removePunctuation)
        AbstractCorpus <- tm_map(AbstractCorpus, removeNumbers)
        Stopwords <- c(stopwords('english'))
        AbstractCorpus <- tm_map(AbstractCorpus, removeWords, Stopwords)
        AbstractCorpus <- tm_map(AbstractCorpus, stemDocument)

        # Step 3: build the term-document matrix
        trmDocMat <- TermDocumentMatrix(AbstractCorpus,
                                        control = list(minWordLength = 1))
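
Before moving on, you can optionally sanity-check the preprocessing by listing the terms that occur frequently in the matrix. This check is not part of the recipe's original steps; `findFreqTerms()` is a standard `tm` helper, and the threshold of 10 is an arbitrary illustrative choice:

        # Optional check: list terms that appear at least 10 times in the corpus
        findFreqTerms(trmDocMat, lowfreq = 10)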
  5. Now you have the term-document matrix. The next step is to create the word cloud visualization, for which you first need the frequency of each term across the whole corpus. You can obtain these frequencies by taking the row sums of the term-document matrix:
        # Term frequencies across the whole corpus (row sums), sorted in decreasing order
        tdmMatrix <- as.matrix(trmDocMat)
        freq <- sort(rowSums(tdmMatrix), decreasing = TRUE)
        tdmDat <- data.frame(word = names(freq), freq = freq)
        rownames(tdmDat) <- NULL
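
As a quick check, you can look at the top rows of `tdmDat` to see the most frequent (stemmed) terms; showing ten rows here is just an illustrative choice:

        # Peek at the ten most frequent terms and their counts
        head(tdmDat, 10)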
  6. Finally, create the word cloud using the following code:

        library(wordcloud)
        # Plot terms that occur at least 10 times; rotate 15% of the words vertically
        wordcloud(tdmDat$word, tdmDat$freq, rot.per = .15, min.freq = 10)
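
If you prefer a coloured cloud, `wordcloud()` also accepts a vector of colours. The following variation, which uses an RColorBrewer palette and places the most frequent words in the centre, is an optional sketch rather than part of the original recipe:

        # Optional variation: coloured word cloud, most frequent words in the centre
        library(RColorBrewer)
        wordcloud(tdmDat$word, tdmDat$freq, rot.per = .15, min.freq = 10,
                  random.order = FALSE, colors = brewer.pal(8, "Dark2"))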