Step 4 - Set the NLP optimizer

To get better results from the LDA model, we need to set the optimizer, which implements the inference algorithm for LDA and performs the actual computation, maintaining the internal data structures (for example, a graph or matrix) and the algorithm's parameters.

Here we use the EMLDAOptimizer. Alternatively, you can use the OnlineLDAOptimizer. The EMLDAOptimizer stores a data + parameter graph, plus algorithm parameters; its underlying implementation uses expectation-maximization (EM).

First, let's instantiate the optimizer based on the chosen algorithm. For the online variant, we add (1.0 / actualCorpusSize) to a very low base mini-batch fraction (that is, 0.05) via setMiniBatchFraction() so that training converges even on a tiny dataset like ours:

val optimizer = params.algorithm.toLowerCase match {
  case "em" =>
    new EMLDAOptimizer
  // Add (1.0 / actualCorpusSize) to MiniBatchFraction to be more robust on tiny datasets.
  case "online" =>
    new OnlineLDAOptimizer().setMiniBatchFraction(0.05 + 1.0 / actualCorpusSize)
  case _ =>
    throw new IllegalArgumentException(
      s"Only em, online are supported but got ${params.algorithm}.")
}

Now, set the optimizer using the setOptimizer() method from the LDA API as follows:

lda.setOptimizer(optimizer)
  .setK(params.k)
  .setMaxIterations(params.maxIterations)
  .setDocConcentration(params.docConcentration)
  .setTopicConcentration(params.topicConcentration)
  .setCheckpointInterval(params.checkpointInterval)
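Once the optimizer and hyperparameters are set, training is done by calling run() on the term-count corpus. The following is a minimal sketch; it assumes that `corpus` (an `RDD[(Long, Vector)]` of document IDs and term-count vectors) and the `vocabArray` lookup were prepared in the earlier steps:

```scala
// A sketch only: `corpus` and `vocabArray` are assumed from the preprocessing steps.
val ldaModel = lda.run(corpus)

// Inspect the top-weighted terms in each discovered topic.
ldaModel.describeTopics(maxTermsPerTopic = 5).zipWithIndex.foreach {
  case ((termIndices, termWeights), topicId) =>
    println(s"Topic $topicId:")
    termIndices.zip(termWeights).foreach { case (termIndex, weight) =>
      println(s"  ${vocabArray(termIndex)}\t$weight")
    }
}
```

Note that run() returns an EMLDAOptimizer-trained DistributedLDAModel or an OnlineLDAOptimizer-trained LocalLDAModel depending on which optimizer was set above.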