Graph for Scala

For this project, I will create a src/main/scala/InfluenceDiagram.scala file. For demo purpose, I will just recreate the graph from Chapter 2, Data Pipelines and Modeling:

import scalax.collection.Graph
import scalax.collection.edge._
import scalax.collection.GraphPredef._
import scalax.collection.GraphEdge._

import scalax.collection.edge.Implicits._

object InfluenceDiagram extends App {
  var g = Graph[String, LDiEdge](("'Weather'"~+>"'Weather Forecast'")("Forecast"), ("'Weather Forecast'"~+>"'Vacation Activity'")("Decision"), ("'Vacation Activity'"~+>"'Satisfaction'")("Deterministic"), ("'Weather'"~+>"'Satisfaction'")("Deterministic"))
  println(g.mkString(";"))
  println(g.isDirected)
  println(g.isAcyclic)
}

The ~+> operator is used to create a directed labeled edge between two nodes defined in scalax/collection/edge/Implicits.scala, which, in our case, are of the String type. The list of other edge types and operators is provided in the following table:

The following table shows graph edges from scalax.collection.edge.Implicits (from http://www.scala-graph.org/guides/core-initializing.html)

Edge Class

Shortcut/Operator

Description

Hyperedges

HyperEdge

~

hyperedge

WHyperEdge

~%

weighted hyperedge

WkHyperEdge

~%#

key-weighted hyperedge

LHyperEdge

~+

labeled hyperedge

LkHyperEdge

~+#

key-labeled hyperedge

WLHyperEdge

~%+

weighted labeled hyperedge

WkLHyperEdge

~%#+

key-weighted labeled hyperedge

WLkHyperEdge

~%+#

weighted key-labeled hyperedge

WkLkHyperEdge

~%#+#

key-weighted key-labeled hyperedge

Directed hyperedges

DiHyperEdge

~>

directed hyperedge

WDiHyperEdge

~%>

weighted directed hyperedge

WkDiHyperEdge

~%#>

key-weighted directed hyperedge

LDiHyperEdge

~+>

labeled directed hyperedge

LkDiHyperEdge

~+#>

key-labeled directed hyperedge

WLDiHyperEdge

~%+>

weighted labeled directed hyperedge

WkLDiHyperEdge

~%#+>

key-weighted labeled directed hyperedge

WLkDiHyperEdge

~%+#>

weighted key-labeled directed hyperedge

WkLkDiHyperEdge

~%#+#>

key-weighted key-labeled directed hyperedge

Undirected edges

UnDiEdge

~

undirected edge

WUnDiEdge

~%

weighted undirected edge

WkUnDiEdge

~%#

key-weighted undirected edge

LUnDiEdge

~+

labeled undirected edge

LkUnDiEdge

~+#

key-labeled undirected edge

WLUnDiEdge

~%+

weighted labeled undirected edge

WkLUnDiEdge

~%#+

key-weighted labeled undirected edge

WLkUnDiEdge

~%+#

weighted key-labeled undirected edge

WkLkUnDiEdge

~%#+#

key-weighted key-labeled undirected edge

Directed edges

DiEdge

~>

directed edge

WDiEdge

~%>

weighted directed edge

WkDiEdge

~%#>

key-weighted directed edge

LDiEdge

~+>

labeled directed edge

LkDiEdge

~+#>

key-labeled directed edge

WLDiEdge

~%+>

weighted labeled directed edge

WkLDiEdge

~%#+>

key-weighted labeled directed edge

WLkDiEdge

~%+#>

weighted key-labeled directed edge

WkLkDiEdge

~%#+#>

key-weighted key-labeled directed edge

You saw the power of graph for Scala: the edges can be weighted and we may potentially construct a multigraph (key-labeled edges allow multiple edges for a pair of source and destination nodes).

If you run SBT on the preceding project with the Scala file in the src/main/scala directory, the output will be as follows:

[akozlov@Alexanders-MacBook-Pro chapter07(master)]$ sbt
[info] Loading project definition from /Users/akozlov/Src/Book/ml-in-scala/chapter07/project
[info] Set current project to Working with Graph Algorithms (in build file:/Users/akozlov/Src/Book/ml-in-scala/chapter07/)
> run
[warn] Multiple main classes detected.  Run 'show discoveredMainClasses' to see the list

Multiple main classes detected, select one to run:

 [1] org.akozlov.chapter07.ConstranedDAG
 [2] org.akozlov.chapter07.EnronEmail
 [3] org.akozlov.chapter07.InfluenceDiagram
 [4] org.akozlov.chapter07.InfluenceDiagramToJson

Enter number: 3

[info] Running org.akozlov.chapter07.InfluenceDiagram 
'Weather';'Vacation Activity';'Satisfaction';'Weather Forecast';'Weather'~>'Weather Forecast' 'Forecast;'Weather'~>'Satisfaction' 'Deterministic;'Vacation Activity'~>'Satisfaction' 'Deterministic;'Weather Forecast'~>'Vacation Activity' 'Decision
Directed: true
Acyclic: true
'Weather';'Vacation Activity';'Satisfaction';'Recommend to a Friend';'Weather Forecast';'Weather'~>'Weather Forecast' 'Forecast;'Weather'~>'Satisfaction' 'Deterministic;'Vacation Activity'~>'Satisfaction' 'Deterministic;'Satisfaction'~>'Recommend to a Friend' 'Probabilistic;'Weather Forecast'~>'Vacation Activity' 'Decision
Directed: true
Acyclic: true

If continuous compilation is enabled, the main method will be run as soon as SBT detects that the file has changed (in the case of multiple classes having the main method, SBT will ask you which one to run, which is not great for interactivity; so you might want to limit the number of executable classes).

I will cover different output formats in a short while, but let's first see how to perform simple operations on the graph.

Adding nodes and edges

First, we already know that the graph is directed and acyclic, which is a required property for all decision diagrams so that we know we did not make a mistake. Let's say that I want to make the graph more complex and add a node that will indicate the likelihood of me recommending a vacation in Portland, Oregon to another person. The only thing I need to add is the following line:

g += ("'Satisfaction'" ~+> "'Recommend to a Friend'")("Probabilistic")

If you have continuous compilation/run enabled, you will immediately see the changes after pressing the Save File button:

'Weather';'Vacation Activity';'Satisfaction';'Recommend to a Friend';'Weather Forecast';'Weather'~>'Weather Forecast' 'Forecast;'Weather'~>'Satisfaction' 'Deterministic;'Vacation Activity'~>'Satisfaction' 'Deterministic;'Satisfaction'~>'Recommend to a Friend' 'Probabilistic;'Weather Forecast'~>'Vacation Activity' 'Decision
Directed: true
Acyclic: true

Now, if we want to know the parents of the newly introduced node, we can simply run the following code:

println((g get "'Recommend to a Friend'").incoming)

Set('Satisfaction'~>'Recommend to a Friend' 'Probabilistic)

This will give us a set of parents for a specific node—and thus drive the decision making process. If we add a cycle, the acyclic method will automatically detect it:

g += ("'Satisfaction'" ~+> "'Weather'")("Cyclic")
println(g.mkString(";")) println("Directed: " + g.isDirected)
println("Acyclic: " + g.isAcyclic)

'Weather';'Vacation Activity';'Satisfaction';'Recommend to a Friend';'Weather Forecast';'Weather'~>'Weather Forecast' 'Forecast;'Weather'~>'Satisfaction' 'Deterministic;'Vacation Activity'~>'Satisfaction' 'Deterministic;'Satisfaction'~>'Recommend to a Friend' 'Probabilistic;'Satisfaction'~>'Weather' 'Cyclic;'Weather Forecast'~>'Vacation Activity' 'Decision
Directed: true
Acyclic: false

Note that you can create the graphs completely programmatically:

 var n, m = 0; val f = Graph.fill(45){ m = if (m < 9) m + 1 else { n = if (n < 8) n + 1 else 8; n + 1 }; m ~ n }

  println(f.nodes)
  println(f.edges)
  println(f)

  println("Directed: " + f.isDirected)
  println("Acyclic: " + f.isAcyclic)

NodeSet(0, 9, 1, 5, 2, 6, 3, 7, 4, 8)
EdgeSet(9~0, 9~1, 9~2, 9~3, 9~4, 9~5, 9~6, 9~7, 9~8, 1~0, 5~0, 5~1, 5~2, 5~3, 5~4, 2~0, 2~1, 6~0, 6~1, 6~2, 6~3, 6~4, 6~5, 3~0, 3~1, 3~2, 7~0, 7~1, 7~2, 7~3, 7~4, 7~5, 7~6, 4~0, 4~1, 4~2, 4~3, 8~0, 8~1, 8~2, 8~3, 8~4, 8~5, 8~6, 8~7)
Graph(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1~0, 2~0, 2~1, 3~0, 3~1, 3~2, 4~0, 4~1, 4~2, 4~3, 5~0, 5~1, 5~2, 5~3, 5~4, 6~0, 6~1, 6~2, 6~3, 6~4, 6~5, 7~0, 7~1, 7~2, 7~3, 7~4, 7~5, 7~6, 8~0, 8~1, 8~2, 8~3, 8~4, 8~5, 8~6, 8~7, 9~0, 9~1, 9~2, 9~3, 9~4, 9~5, 9~6, 9~7, 9~8)
Directed: false
Acyclic: false

Here, the element computation provided as the second parameter to the fill method is repeated 45 times (the first parameter). The graph connects every node to all of its predecessors, which is also known as a clique in the graph theory.

Graph constraints

Graph for Scala enables us to set constraints that cannot be violated by any future graph update. This comes in handy when we want to preserve some detail in the graph structure. For example, a Directed Acyclic Graph (DAG) should not contain cycles. Two constraints are currently implemented as a part of the scalax.collection.constrained.constraints package—connected and acyclic, as follows:

package org.akozlov.chapter07

import scalax.collection.GraphPredef._, scalax.collection.GraphEdge._
import scalax.collection.constrained.{Config, ConstraintCompanion, Graph => DAG}
import scalax.collection.constrained.constraints.{Connected, Acyclic}

object AcyclicWithSideEffect extends ConstraintCompanion[Acyclic] {
  def apply [N, E[X] <: EdgeLikeIn[X]] (self: DAG[N,E]) =
    new Acyclic[N,E] (self) {
      override def onAdditionRefused(refusedNodes: Iterable[N],
        refusedEdges: Iterable[E[N]],
        graph:        DAG[N,E]) = {
          println("Addition refused: " + "nodes = " + refusedNodes + ", edges = " + refusedEdges)
          true
        }
    }
}

object ConnectedWithSideEffect extends ConstraintCompanion[Connected] {
  def apply [N, E[X] <: EdgeLikeIn[X]] (self: DAG[N,E]) =
    new Connected[N,E] (self) {
      override def onSubtractionRefused(refusedNodes: Iterable[DAG[N,E]#NodeT],
        refusedEdges: Iterable[DAG[N,E]#EdgeT],
        graph:        DAG[N,E]) = {
          println("Subtraction refused: " + "nodes = " + refusedNodes + ", edges = " + refusedEdges)
        true
      }
    }
}

class CycleException(msg: String) extends IllegalArgumentException(msg)
object ConstranedDAG extends App {
  implicit val conf: Config = ConnectedWithSideEffect && AcyclicWithSideEffect
  val g = DAG(1~>2, 1~>3, 2~>3, 3~>4) // Graph()
  println(g ++ List(1~>4, 3~>1))
  println(g - 2~>3)
  println(g - 2)
  println((g + 4~>5) - 3)
}

Here is the command to run the program that tries to add or remove nodes that violate the constraints:

[akozlov@Alexanders-MacBook-Pro chapter07(master)]$ sbt "run-main org.akozlov.chapter07.ConstranedDAG"
[info] Loading project definition from /Users/akozlov/Src/Book/ml-in-scala/chapter07/project
[info] Set current project to Working with Graph Algorithms (in build file:/Users/akozlov/Src/Book/ml-in-scala/chapter07/)
[info] Running org.akozlov.chapter07.ConstranedDAG 
Addition refused: nodes = List(), edges = List(1~>4, 3~>1)
Graph(1, 2, 3, 4, 1~>2, 1~>3, 2~>3, 3~>4)
Subtraction refused: nodes = Set(), edges = Set(2~>3)
Graph(1, 2, 3, 4, 1~>2, 1~>3, 2~>3, 3~>4)
Graph(1, 3, 4, 1~>3, 3~>4)
Subtraction refused: nodes = Set(3), edges = Set()
Graph(1, 2, 3, 4, 5, 1~>2, 1~>3, 2~>3, 3~>4, 4~>5)
[success] Total time: 1 s, completed May 1, 2016 1:53:42 PM 

Adding or subtracting nodes that violate one of the constraints is rejected. The programmer can also specify a side effect if an attempt to add or subtract a node that violates the condition is made.

JSON

Graph for Scala supports importing/exporting graphs to JSON, as follows:

object InfluenceDiagramToJson extends App {

  val g = Graph[String,LDiEdge](("'Weather'" ~+> "'Weather Forecast'")("Forecast"), ("'Weather Forecast'" ~+> "'Vacation Activity'")("Decision"), ("'Vacation Activity'" ~+> "'Satisfaction'")("Deterministic"), ("'Weather'" ~+> "'Satisfaction'")("Deterministic"), ("'Satisfaction'" ~+> "'Recommend to a Friend'")("Probabilistic"))

  import scalax.collection.io.json.descriptor.predefined.{LDi}
  import scalax.collection.io.json.descriptor.StringNodeDescriptor
  import scalax.collection.io.json._

  val descriptor = new Descriptor[String](
    defaultNodeDescriptor = StringNodeDescriptor,
    defaultEdgeDescriptor = LDi.descriptor[String,String]("Edge")
  )

  val n = g.toJson(descriptor)
  println(n)
  import net.liftweb.json._
  println(Printer.pretty(JsonAST.render(JsonParser.parse(n))))
}

To produce a JSON representation for a sample graph, run:

[kozlov@Alexanders-MacBook-Pro chapter07(master)]$ sbt "run-main org.akozlov.chapter07.InfluenceDiagramToJson"
[info] Loading project definition from /Users/akozlov/Src/Book/ml-in-scala/chapter07/project
[info] Set current project to Working with Graph Algorithms (in build file:/Users/akozlov/Src/Book/ml-in-scala/chapter07/)
[info] Running org.akozlov.chapter07.InfluenceDiagramToJson 
{
  "nodes":[["'Recommend to a Friend'"],["'Satisfaction'"],["'Vacation Activity'"],["'Weather Forecast'"],["'Weather'"]],
  "edges":[{
    "n1":"'Weather'",
    "n2":"'Weather Forecast'",
    "label":"Forecast"
  },{
    "n1":"'Vacation Activity'",
    "n2":"'Satisfaction'",
    "label":"Deterministic"
  },{
    "n1":"'Weather'",
    "n2":"'Satisfaction'",
    "label":"Deterministic"
  },{
    "n1":"'Weather Forecast'",
    "n2":"'Vacation Activity'",
    "label":"Decision"
  },{
    "n1":"'Satisfaction'",
    "n2":"'Recommend to a Friend'",
    "label":"Probabilistic"
  }]
}
[success] Total time: 1 s, completed May 1, 2016 1:55:30 PM

For more complex structures, one might need to write custom descriptors, serializers, and deserializers (refer to http://www.scala-graph.org/api/json/api/#scalax.collection.io.json.package).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset