Let's get started by loading the packages:
- Load the package, and set the default computation type to float (single precision). By default, gpuR computes in double precision, which not all GPUs support:

library("gpuR")
options(gpuR.default.type = "float")
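Before running any GPU code, it can help to confirm that gpuR actually detects an OpenCL device. A minimal check, assuming gpuR's detectGPUs() and gpuInfo() helpers are available in your installed version:

library("gpuR")

# Number of OpenCL-capable GPUs detected on this machine
detectGPUs()

# Details (device name, memory, compute units) of the currently selected device
gpuInfo()

If detectGPUs() returns 0, the matrix examples that follow will fail; check your OpenCL drivers before proceeding.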
- Matrix assignment to GPU:

# Assigning a matrix to the GPU
A <- matrix(rnorm(1000), nrow = 10)
vcl1 <- vclMatrix(A)
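A vclMatrix lives in device memory, so the result must be copied back to an ordinary R matrix before it can be used with CPU-only functions. A short sketch, assuming gpuR's standard as.matrix() coercion for vclMatrix objects:

# Copy the data back from device memory into an ordinary R matrix
A_back <- as.matrix(vcl1)

# The shape is preserved: a 10 x 100 matrix, as created above
dim(A_back)

gpuR also provides gpuMatrix(), which keeps the data in host RAM and transfers it to the device per operation; vclMatrix() is generally preferable when a matrix is reused across several GPU computations, since it avoids repeated transfers.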
Printing the object shows its details, as illustrated in the following output:
> vcl1
An object of class "fvclMatrix"
Slot "address":
<pointer: 0x000000001822e180>

Slot ".context_index":
[1] 1

Slot ".platform_index":
[1] 1

Slot ".platform":
[1] "Intel(R) OpenCL"

Slot ".device_index":
[1] 1

Slot ".device":
[1] "Intel(R) HD Graphics 530"
- Let's compare CPU and GPU performance. As most deep learning workloads rely on GPUs for matrix computation, performance is evaluated using matrix multiplication with the following script:
# CPU vs GPU performance
DF <- data.frame()
evalSeq <- seq(1, 2501, 500)
for (dimpower in evalSeq) {
  print(dimpower)
  Mat1 <- matrix(rnorm(dimpower^2), nrow = dimpower)
  Mat2 <- matrix(rnorm(dimpower^2), nrow = dimpower)

  # Time the multiplication on the CPU
  now <- Sys.time()
  Matfin <- Mat1 %*% Mat2
  cpu <- Sys.time() - now

  # Time the transfer to the GPU plus the multiplication
  now <- Sys.time()
  vcl1 <- vclMatrix(Mat1)
  vcl2 <- vclMatrix(Mat2)
  vclC <- vcl1 %*% vcl2
  gpu <- Sys.time() - now

  DF <- rbind(DF, c(nrow(Mat1), cpu, gpu))
}
DF <- data.frame(DF)
colnames(DF) <- c("nrow", "CPU_time", "gpu_time")
The preceding script performs the matrix multiplication on both CPU and GPU, storing the elapsed time for matrices of increasing dimension. The output is shown in the following diagram:
Comparison between CPU and GPU
The graph shows that the computation time on the CPU grows rapidly with matrix dimension, as matrix multiplication scales roughly cubically with the number of rows. Thus, GPUs help expedite it drastically.
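The comparison plot can be reproduced from the DF data frame collected by the benchmark script. A minimal sketch using base R graphics (the column names match those assigned in the script above):

# Plot CPU and GPU timings against matrix dimension
plot(DF$nrow, DF$CPU_time, type = "b", col = "red",
     xlab = "Matrix dimension (nrow)", ylab = "Time (s)")
lines(DF$nrow, DF$gpu_time, type = "b", col = "blue")
legend("topleft", legend = c("CPU", "GPU"),
       col = c("red", "blue"), lty = 1)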