Symbolic programming in MXNet

Let us declare some symbols:

> a <- mx.symbol.Variable("a")
> a
C++ object <0x11dea3e00> of class 'MXSymbol' <0x10c0a79b0>
> b <- mx.symbol.Variable("b")
> b
C++ object <0x10fecf330> of class 'MXSymbol' <0x10c0a79b0>
> c <- a + b
> c
C++ object <0x10f91bba0> of class 'MXSymbol' <0x10c0a79b0>
>

As you can see, they are MXSymbol objects.
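Before binding any data, we can ask a composite symbol which free inputs it expects. A quick sketch, assuming the arguments() helper exported by the MXNet R package:

> arguments(c)
[1] "a" "b"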

We need an executor to supply data to the symbol and retrieve the results:

> arg_lst <- list(symbol = c, ctx = mx.ctx.default(), a = dim(ones.matrix),
+                 b = dim(ones.matrix), grad.req = "null")
> pexec <- do.call(mx.simple.bind, arg_lst)
> pexec
C++ object <0x11d852c40> of class 'MXExecutor' <0x101be9c30>
> input_list <- list(a = ones.matrix, b = ones.matrix)
> mx.exec.update.arg.arrays(pexec, input_list)
> mx.exec.forward(pexec)
> pexec$arg.arrays
$a
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

$b
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

> pexec$outputs
$`_plus4_output`
     [,1] [,2] [,3]
[1,]    2    2    2
[2,]    2    2    2
[3,]    2    2    2

We create an executor by calling mx.simple.bind on our symbol c. To create it, we need to tell the executor the shapes of a and b, the arguments of c. After that, using mx.exec.update.arg.arrays, we push the actual data into the executor, and mx.exec.forward runs the computation. The results are stored in the executor's outputs slot.
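To convince ourselves the executor computed the right thing, we can cross-check against the imperative NDArray API covered earlier. A minimal sketch, assuming ones.matrix is the 3 x 3 matrix of ones used above:

> as.array(mx.nd.array(ones.matrix) + mx.nd.array(ones.matrix))
     [,1] [,2] [,3]
[1,]    2    2    2
[2,]    2    2    2
[3,]    2    2    2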

Let us look at another example, where we create a symbol d that computes the matrix (dot) product of two matrices:

> d <- mx.symbol.dot(a, b)
> arg_lst <- list(symbol = d, ctx = mx.ctx.default(), a = dim(ones.matrix),
+                 b = dim(ones.matrix), grad.req = "null")
> pexec <- do.call(mx.simple.bind, arg_lst)
> pexec
C++ object <0x1170550a0> of class 'MXExecutor' <0x101be9c30>
> input_list <- list(a = ones.matrix, b = ones.matrix)
> mx.exec.update.arg.arrays(pexec, input_list)
> mx.exec.forward(pexec)
> pexec$arg.arrays
$a
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

$b
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

> pexec$outputs
$dot3_output
     [,1] [,2] [,3]
[1,]    3    3    3
[2,]    3    3    3
[3,]    3    3    3

Again, we bind an executor using mx.simple.bind, this time to d. Using mx.exec.update.arg.arrays, we provide the actual data for a and b. Finally, mx.exec.forward runs the computation, and the results appear in the outputs slot.
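Since mx.symbol.dot on two matrices is ordinary matrix multiplication, base R's %*% operator provides an independent sanity check:

> ones.matrix %*% ones.matrix
     [,1] [,2] [,3]
[1,]    3    3    3
[2,]    3    3    3
[3,]    3    3    3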

Refer to https://github.com/apache/incubator-mxnet/tree/master/R-package/vignettes for R vignettes and other operations using MXNet.

With this background in imperative and symbolic use of MXNet R, let us go ahead and build a simple multi-layer perceptron to solve the famous XOR gate problem. We want the network to learn the XOR truth table:

X Y Z
0 0 0
0 1 1
1 0 1
1 1 0
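As a reference point, this is exactly the table that base R's xor() produces; a quick sketch:

> grid <- expand.grid(X = 0:1, Y = 0:1)
> grid$Z <- as.integer(xor(grid$X, grid$Y))
> grid
  X Y Z
1 0 0 0
2 1 0 1
3 0 1 1
4 1 1 0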

Our network architecture is as follows: two input nodes feeding a hidden layer of two ReLU-activated nodes, followed by a two-node output layer with softmax activation.

Let us generate some training data:

#############   XOR Learning  ############
library(mxnet)

mx.set.seed(1)

### Generate some data.
x.1 <- mx.nd.sample.normal(mu = mx.nd.array(c(0, 0)),
                           sigma = mx.nd.array(c(0.001, 0.001)),
                           shape = 1000)
y.1 <- rep(0, 1000)
x.2 <- mx.nd.sample.normal(mu = mx.nd.array(c(0, 1)),
                           sigma = mx.nd.array(c(0.001, 0.001)),
                           shape = 1000)
y.2 <- rep(1, 1000)
x.3 <- mx.nd.sample.normal(mu = mx.nd.array(c(1, 0)),
                           sigma = mx.nd.array(c(0.001, 0.001)),
                           shape = 1000)
y.3 <- rep(1, 1000)
x.4 <- mx.nd.sample.normal(mu = mx.nd.array(c(1, 1)),
                           sigma = mx.nd.array(c(0.001, 0.001)),
                           shape = 1000)
y.4 <- rep(0, 1000)

X <- data.matrix(mx.nd.concat(list(x.1, x.2, x.3, x.4)))
Y <- c(y.1, y.2, y.3, y.4)

We draw the data points from four tight normal distributions, one centered at each corner of the XOR truth table. mx.nd.concat combines them into a single array, X, and the corresponding labels are stored in Y.
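A quick sanity check on the assembled data; a sketch where the commented dimensions are what we expect, though the exact layout can depend on the MXNet version:

> dim(X)    # one row per sample, one column per input: 4000 x 2 expected
> table(Y)  # 2000 zeros and 2000 ones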

Now let us go ahead and define our network architecture to solve our XOR problem:

############## Define the Network #########

# Input layer
data <- mx.symbol.Variable("data")

# Hidden layer
hidden.layer <- mx.symbol.FullyConnected(data = data,
                                         num_hidden = 2)

# Hidden layer activation
act <- mx.symbol.Activation(data = hidden.layer, act_type = "relu")

# Output layer
out.layer <- mx.symbol.FullyConnected(data = act,
                                      num_hidden = 2)

# Softmax of output
out <- mx.symbol.SoftmaxOutput(data = out.layer)

We have created two fully connected layers using mx.symbol.FullyConnected. The num_hidden parameter specifies the number of neurons in each layer.

We use ReLU activation for the hidden layer, created with mx.symbol.Activation. Passing the hidden layer as its data argument ties the activation function to that layer.

Our output layer, also defined with mx.symbol.FullyConnected, has two nodes. The activation for our output layer is softmax, defined by mx.symbol.SoftmaxOutput.
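To watch the network actually learn the truth table, one way to train it is with mx.model.FeedForward.create. The following is a minimal sketch, not the book's exact training code; the hyperparameters (num.round, batch size, learning rate) are illustrative choices:

############## Train the Network (sketch) #########
model <- mx.model.FeedForward.create(out,
                                     X = X, y = Y,
                                     ctx = mx.ctx.default(),
                                     num.round = 50,
                                     array.batch.size = 32,
                                     learning.rate = 0.1,
                                     array.layout = "rowmajor",
                                     eval.metric = mx.metric.accuracy)

# Each column of preds holds the class probabilities for one sample.
preds <- predict(model, X, array.layout = "rowmajor")
pred.labels <- max.col(t(preds)) - 1

# Compare predicted labels against the true labels.
table(pred.labels, Y)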
