Building an RShiny application

Our RShiny application will have the following features:

  • Load a transaction file
  • Calculate the product pairs and their transaction frequency, and display them
  • Display the discovered communities from the product pairs dataset

Let us look at the user interface code:

ui <- fluidPage(
  navbarPage("Product Pairs",
    tabPanel("Transactions",
      fileInput("datafile", "select transactions csv file",
        accept = c(
          "text/csv",
          "text/comma-separated-values,text/plain",
          ".csv"
        )
      ),
      dataTableOutput("transactions")
    ),
    tabPanel("Product Pairs",
      dataTableOutput("ppairs")
    ),
    tabPanel("Community",
      plotOutput("community")
    )
  )
)

We have three panels. In the first panel, we select a product transaction file and display it. In our second panel, we show the product pairs and their transaction counts. In the final panel, we display the communities we have discovered.

Let us look at the server-side code:

server <- function(input, output) {


  trans.obj <- reactive({
    data <- input$datafile
    req(data) # Wait until a file has been uploaded
    transactions.obj <- read.transactions(
      file = data$datapath, format = "single",
      sep = ",",
      cols = c("order_id", "product_id"),
      rm.duplicates = FALSE,
      quote = "", skip = 0,
      encoding = "unknown"
    )
    transactions.obj
  })

  trans.df <- reactive({
    data <- input$datafile
    if (is.null(data)) {
      return(NULL)
    }
    trans.df <- read.csv(data$datapath)
    return(trans.df)
  })

  network.data <- reactive({
    transactions.obj <- trans.obj()
    support <- 0.015

    # Frequent item sets
    parameters = list(
      support = support,
      minlen = 2, # Minimal number of items per item set
      maxlen = 2, # Maximal number of items per item set
      target = "frequent itemsets"
    )

    freq.items <- apriori(transactions.obj, parameter = parameters)

    # Let us examine our frequent item sets
    freq.items.df <- data.frame(item_set = labels(freq.items),
                                support = freq.items@quality)
    freq.items.df$item_set <- as.character(freq.items.df$item_set)

    # Clean up for item pairs
    library(tidyr)
    freq.items.df <- separate(data = freq.items.df, col = item_set,
                              into = c("item.1", "item.2"), sep = ",")
    # The braces must be escaped with a double backslash inside an R string
    freq.items.df[] <- lapply(freq.items.df, gsub, pattern = '\\{', replacement = '')
    freq.items.df[] <- lapply(freq.items.df, gsub, pattern = '\\}', replacement = '')

    # Prepare data for graph
    network.data <- freq.items.df[, c('item.1', 'item.2', 'support.count')]
    names(network.data) <- c("from", "to", "weight")
    # gsub() returns character vectors, so restore the numeric counts
    network.data$weight <- as.numeric(network.data$weight)
    return(network.data)
  })

  output$transactions <- renderDataTable({
    trans.df()
  })

  output$ppairs <- renderDataTable({
    network.data()
  })

  output$community <- renderPlot({
    network.data <- network.data()
    my.graph <- graph_from_data_frame(network.data)
    random.cluster <- walktrap.community(my.graph)
    plot(random.cluster, my.graph,
         layout = layout.fruchterman.reingold,
         vertex.label.cex = .5,
         edge.arrow.size = .1)
  }, height = 1200, width = 1200) # Plot size is an argument of renderPlot, not plot


}

Let us first look at the reactive expressions.

Let us look at trans.df:

trans.df <- reactive({
  data <- input$datafile
  if (is.null(data)) {
    return(NULL)
  }
  trans.df <- read.csv(data$datapath)
  return(trans.df)
})

As soon as a file is uploaded, this file is read into a data frame, trans.df.

Let us look at trans.obj:

trans.obj <- reactive({
  data <- input$datafile
  req(data) # Wait until a file has been uploaded
  transactions.obj <- read.transactions(
    file = data$datapath, format = "single",
    sep = ",",
    cols = c("order_id", "product_id"),
    rm.duplicates = FALSE,
    quote = "", skip = 0,
    encoding = "unknown"
  )
  transactions.obj
})

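The uploaded file is used to create a transactions object. It is read in the "single" format, in which each row of the file holds one order_id and product_id pair.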

Let us look at network.data:

network.data <- reactive({
  transactions.obj <- trans.obj()
  support <- 0.015

  # Frequent item sets
  parameters = list(
    support = support,
    minlen = 2, # Minimal number of items per item set
    maxlen = 2, # Maximal number of items per item set
    target = "frequent itemsets"
  )

  freq.items <- apriori(transactions.obj, parameter = parameters)

  # Let us examine our frequent item sets
  freq.items.df <- data.frame(item_set = labels(freq.items),
                              support = freq.items@quality)
  freq.items.df$item_set <- as.character(freq.items.df$item_set)

  # Clean up for item pairs
  library(tidyr)
  freq.items.df <- separate(data = freq.items.df, col = item_set,
                            into = c("item.1", "item.2"), sep = ",")
  # The braces must be escaped with a double backslash inside an R string
  freq.items.df[] <- lapply(freq.items.df, gsub, pattern = '\\{', replacement = '')
  freq.items.df[] <- lapply(freq.items.df, gsub, pattern = '\\}', replacement = '')

  # Prepare data for graph
  network.data <- freq.items.df[, c('item.1', 'item.2', 'support.count')]
  names(network.data) <- c("from", "to", "weight")
  # gsub() returns character vectors, so restore the numeric counts
  network.data$weight <- as.numeric(network.data$weight)
  return(network.data)
})

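Using the transaction object, the apriori function is invoked to find the frequent product pairs: item sets of exactly two items (minlen and maxlen are both 2) with a support of at least 0.015. The output of apriori is then reshaped into a final data frame, network.data, with the columns from, to, and weight.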
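The clean-up step can be tried in isolation. The following is a minimal base R sketch, not the application's code: the labels and counts are made up for illustration, and it assumes item labels contain no commas of their own.

```r
# Hypothetical item set labels in the "{a,b}" form produced for two-item sets;
# the counts are invented for illustration.
item_set <- c("{1001,1002}", "{1003,1004}", "{1001,1004}")
count <- c(12, 7, 5)

# Remove both braces in one pass, then split each label on the comma.
clean <- gsub("[{}]", "", item_set)
parts <- strsplit(clean, ",", fixed = TRUE)

# Build the from/to/weight layout used for the graph.
pairs <- data.frame(
  from = vapply(parts, `[`, character(1), 1),
  to = vapply(parts, `[`, character(1), 2),
  weight = count,
  stringsAsFactors = FALSE
)
print(pairs)
```

Because only the label column is cleaned here, the weight column stays numeric.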

The rest of the functions in the server render these outputs to their respective slots in the UI.
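To start the application, the UI and server objects have to be combined into an app object. A minimal sketch follows; the stand-in ui and server here are placeholders that keep the example self-contained, and in practice the ui and server defined earlier would be passed instead.

```r
library(shiny)

# Placeholder UI and server, used only to keep this sketch self-contained.
ui <- fluidPage(textOutput("msg"))
server <- function(input, output) {
  output$msg <- renderText("Hello from Shiny")
}

# shinyApp() only builds the app object; printing it at the console,
# or calling runApp(app), starts the application.
app <- shinyApp(ui = ui, server = server)
```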

Let us look at what the application looks like when started:

Using the file selector, we can select the transaction file. The transaction file should be a .csv file with two columns, one for the order_id and the other for the product_id. In this version, the column names must be exactly order_id and product_id.
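A small test file in this layout can be produced with base R. This is a sketch with made-up orders; only the two column names come from the application. The header is written without quotes because the server reads the file with quote = "", in which case a quoted header would probably not match the expected column names.

```r
# Made-up orders for illustration; each row is one product in one order.
tx <- data.frame(
  order_id = c(1, 1, 2, 2, 2, 3),
  product_id = c(101, 102, 101, 102, 103, 101)
)

# Write the file without quotes so the header reads order_id,product_id.
csv_path <- tempfile(fileext = ".csv")
write.csv(tx, csv_path, row.names = FALSE, quote = FALSE)

# Read it back to confirm the layout.
check <- read.csv(csv_path)
print(check)
```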

Once a file is selected, let us see how the screen changes:

We see the orders and the products.

Let us move to the next tab, Product Pairs:

We see the product pairs and their transaction count.
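To make the meaning of the transaction count concrete, the same quantity can be computed by hand on a toy dataset. The sketch below uses base R only, with made-up data; it illustrates the definition and is not how apriori works internally.

```r
# Made-up transactions: three orders over products A, B, and C.
tx <- data.frame(
  order_id = c(1, 1, 2, 2, 2, 3),
  product_id = c("A", "B", "A", "B", "C", "A"),
  stringsAsFactors = FALSE
)

# Orders x products incidence matrix: 1 if the order contains the product.
incidence <- (unclass(table(tx$order_id, tx$product_id)) > 0) * 1

# t(incidence) %*% incidence: entry [i, j] is the number of orders
# that contain both product i and product j.
co_counts <- crossprod(incidence)

# Count and support for the pair (A, B).
n_orders <- nrow(incidence)
pair_count <- co_counts["A", "B"]
pair_support <- pair_count / n_orders
```

A pair is kept by the application only if its support, the count divided by the number of orders, reaches the 0.015 threshold.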

Finally, let us look at the last tab, the community graph:

Our product community graph is displayed.
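The community detection behind this tab can also be tried outside Shiny. The following sketch uses made-up product pairs and cluster_walktrap, the newer igraph name for walktrap.community. Treating the graph as undirected is an assumption of this example, not something the application does.

```r
library(igraph)

# Made-up product pairs: two tightly linked groups joined by one weak edge.
pairs <- data.frame(
  from = c("A", "A", "B", "D", "D", "E", "C"),
  to = c("B", "C", "C", "E", "F", "F", "D"),
  weight = c(10, 9, 8, 10, 9, 8, 1)
)

# The third column becomes the edge attribute "weight".
g <- graph_from_data_frame(pairs, directed = FALSE)

# Walktrap groups vertices that short random walks tend to stay within.
communities <- cluster_walktrap(g, weights = E(g)$weight)
print(membership(communities))
```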
