Step 7 - Topic modelling 

Print the topics, showing the 10 top-weighted terms for each topic. Also, include the accumulated weight of the printed terms, as follows:

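var sum = 0.0  // declared outside the loop, so it accumulates across topics
println(s"${params.k} topics:")
topics.zipWithIndex.foreach {
  case (topic, i) =>
    println(s"TOPIC $i")
    println("------------------------------")
    topic.foreach {
      case (term, weight) =>
        // Strip any whitespace from the term before printing it
        val cleaned = term.replaceAll("\\s", "")
        println(s"$cleaned\t$weight")
        sum = sum + weight
    }
    println("----------------------------")
    // Running total of all term weights printed so far
    println("weight: " + sum)
    println()
}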
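
The construction of topics is not shown in this step. As a rough sketch of the shape the loop expects, assume topics is an Array[Array[(String, Double)]] built by pairing term indices with weights, which is the form returned by describeTopics in Spark MLlib. The plain-Scala example below uses a made-up vocabulary and made-up weights, and the names TopicShape, topicIndices, and perTopicWeight are illustrative only. It also computes a separate total for each topic:

```scala
object TopicShape {
  // Hypothetical vocabulary: position = term index
  val vocabArray: Array[String] = Array("captain", "fogg", "nemo", "vessel")

  // Hypothetical data in the (term indices, term weights) shape, one pair per topic
  val topicIndices: Array[(Array[Int], Array[Double])] = Array(
    (Array(0, 1), Array(0.5, 0.25)),
    (Array(2, 3), Array(0.125, 0.0625))
  )

  // Resolve each term index to its string, keeping the weight alongside it
  val topics: Array[Array[(String, Double)]] = topicIndices.map {
    case (terms, weights) =>
      terms.zip(weights).map { case (termIndex, weight) => (vocabArray(termIndex), weight) }
  }

  // One total per topic, computed independently of the other topics
  val perTopicWeight: Array[Double] = topics.map(_.map(_._2).sum)

  def main(args: Array[String]): Unit =
    topics.zip(perTopicWeight).zipWithIndex.foreach { case ((topic, total), i) =>
      println(s"TOPIC $i")
      topic.foreach { case (term, weight) => println(s"$term\t$weight") }
      println("weight: " + total)
    }
}
```

Summing inside the map keeps each topic's total separate, so the totals can be compared directly between topics.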

Now let's look at the topics produced by our LDA model:

 5 topics:
TOPIC 0
------------------------------
come 0.0070183359426213635
make 0.006893251344696077
look 0.006629265338364568
know 0.006592594912464674
take 0.006074234442310174
little 0.005876330712306203
think 0.005153843469004155
time 0.0050685675513282525
hand 0.004524837827665401
well 0.004224698942533204
----------------------------
weight: 0.05805596048329406
TOPIC 1
------------------------------
thus 0.008447268016707914
ring 0.00750959344769264
fate 0.006802070476284118
trojan 0.006310545607626158
bear 0.006244268350438889
heav 0.005479939900136969
thro 0.005185211621694439
shore 0.004618008184651363
fight 0.004161178536600401
turnus 0.003899151842042464
----------------------------
weight: 0.11671319646716942
TOPIC 2
------------------------------
aladdin 7.077183389325728E-4
sultan 6.774311890861097E-4
magician 6.127791175835228E-4
genie 6.06094509479989E-4
vizier 6.051618911188781E-4
princess 5.654756758514474E-4
fatima 4.050749957608771E-4
flatland 3.47788388834721E-4
want 3.4263963705536023E-4
spaceland 3.371784715458026E-4
----------------------------
weight: 0.1219205386824187
TOPIC 3
------------------------------
aladdin 7.325869707607238E-4
sultan 7.012354862373387E-4
magician 6.343184784726607E-4
genie 6.273921840260785E-4
vizier 6.264266945018852E-4
princess 5.849046214967484E-4
fatima 4.193089052802858E-4
flatland 3.601371993827707E-4
want 3.5398019331108816E-4
spaceland 3.491505202713831E-4
----------------------------
weight: 0.12730997993615964
TOPIC 4
------------------------------
captain 0.02931475169407467
fogg 0.02743105575940755
nautilus 0.022748371008515483
passepartout 0.01802140608022664
nemo 0.016678258146358142
conseil 0.012129894049747918
phileas 0.010441664411654412
canadian 0.006217638883315841
vessel 0.00618937301246955
land 0.00615311666365297
----------------------------
weight: 0.28263550964558276

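From the preceding output, we can see that the fifth topic (TOPIC 4) carries the most weight. Note that the printed weight is a running total, because sum is never reset between topics: the final value of 0.28263550964558276 covers all five topics, and the top 10 terms of TOPIC 4 contribute about 0.155 of it. This topic is dominated by terms such as captain, fogg, nemo, vessel, and land.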
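
Because sum in the printing loop accumulates across topics, each topic's own weight is the difference between consecutive printed totals. A minimal sketch in plain Scala, with the five totals copied from the output above; the names TopicWeights, runningTotals, and perTopic are illustrative and not part of the original code:

```scala
object TopicWeights {
  // Values copied from the "weight:" lines of the output, TOPIC 0 to TOPIC 4
  val runningTotals: Vector[Double] = Vector(
    0.05805596048329406,
    0.11671319646716942,
    0.1219205386824187,
    0.12730997993615964,
    0.28263550964558276
  )

  // Pair each total with the previous one (0.0 before the first topic) and subtract
  val perTopic: Vector[Double] =
    runningTotals.zip(0.0 +: runningTotals).map { case (curr, prev) => curr - prev }

  // Index of the topic whose top terms carry the most weight
  val heaviestTopic: Int = perTopic.indexOf(perTopic.max)

  def main(args: Array[String]): Unit = {
    perTopic.zipWithIndex.foreach { case (w, i) => println(f"TOPIC $i: $w%.4f") }
    println(s"Heaviest topic: TOPIC $heaviestTopic")
  }
}
```

The differences come to roughly 0.058, 0.059, 0.005, 0.005, and 0.155, so TOPIC 4 remains the heaviest topic once the accumulation is removed.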
