Building an RShiny application

Our RShiny application will have the following features:

Load a transaction file
Calculate the product pairs and their transaction frequency, and display them
Display the discovered communities from the product pairs dataset

Let us look at the user interface code:

ui <- fluidPage(
  navbarPage("Product Pairs",
             tabPanel("Transactions"
                      , fileInput("datafile", "select transactions csv file",
                                  accept = c(
                                    "text/csv",
                                    "text/comma-separated-values,text/plain",
                                    ".csv"
                                    )
                      )
                      , dataTableOutput("transactions")
             ),
             tabPanel("Product Pairs"
                      ,dataTableOutput("ppairs")),
             tabPanel("Community"
                      ,plotOutput("community"))
  )
)

We have three panels. In the first panel, we select a product transaction file and display it. In our second panel, we show the product pairs and their transaction counts. In the final panel, we display the communities we have discovered.

Let us look at the server-side code:

server <- function(input, output) {
  
  
  trans.obj <- reactive({
    data <- input$datafile
    # Wait until a file has been uploaded before trying to read it
    req(data)
    transactions.obj <- read.transactions(file = data$datapath, format = "single", 
                                          sep = ",",
                                          cols = c("order_id", "product_id"), 
                                          rm.duplicates = FALSE,
                                          quote = "", skip = 0,
                                          encoding = "unknown")
    transactions.obj
  })
  
  trans.df <- reactive({
    
    data <- input$datafile
    if(is.null(data)){return(NULL)}
    trans.df <- read.csv(data$datapath)
    return(trans.df)
  })
  
  network.data <- reactive({
    transactions.obj <- trans.obj()
    support    <- 0.015
    
    # Frequent item sets
    parameters = list(
      support = support,
      minlen  = 2,  # Minimal number of items per item set
      maxlen  = 2, # Maximal number of items per item set
      target  = "frequent itemsets"
    )
    
    freq.items <- apriori(transactions.obj, parameter = parameters)
    
    # Let us examine our frequent item sets
    freq.items.df <- data.frame(item_set = labels(freq.items)
                                , support = freq.items@quality)
    freq.items.df$item_set <- as.character(freq.items.df$item_set)
    
    # Clean up for item pairs: split "{a,b}" into two columns, then strip the braces
    library(tidyr)
    freq.items.df <- separate(data = freq.items.df, col = item_set, into = c("item.1", "item.2"), sep = ",")
    freq.items.df[] <- lapply(freq.items.df, gsub, pattern='\\{', replacement='')
    freq.items.df[] <- lapply(freq.items.df, gsub, pattern='\\}', replacement='')
    
    # Prepare data for graph
    network.data <- freq.items.df[,c('item.1','item.2','support.count')]
    names(network.data) <- c("from","to","weight")
    # gsub() returns character vectors, so convert the counts back to numbers
    network.data$weight <- as.numeric(network.data$weight)
    return(network.data)
    
  })
  
  output$transactions <- renderDataTable({
    trans.df()
  })
  output$ppairs <- renderDataTable({
    
    network.data()
    
  })
  
  output$community <- renderPlot({
    network.data <- network.data()
    my.graph <- graph_from_data_frame(network.data)
    random.cluster <- walktrap.community(my.graph)
    plot(random.cluster,my.graph,
         layout=layout.fruchterman.reingold,
         vertex.label.cex=.5,
         edge.arrow.size=.1,height = 1200, width = 1200)
  })

  
}

Let us first look at the reactive expressions.

Let us look at trans.df:

  trans.df <- reactive({
    
    data <- input$datafile
    if(is.null(data)){return(NULL)}
    trans.df <- read.csv(data$datapath)
    return(trans.df)
  })

As soon as a file is uploaded, this file is read into a data frame, trans.df.

Let us look at trans.obj:

  trans.obj <- reactive({
    data <- input$datafile
    # Wait until a file has been uploaded before trying to read it
    req(data)
    transactions.obj <- read.transactions(file = data$datapath, format = "single", 
                                          sep = ",",
                                          cols = c("order_id", "product_id"), 
                                          rm.duplicates = FALSE,
                                          quote = "", skip = 0,
                                          encoding = "unknown")
    transactions.obj
  })

The uploaded file is used to create a transactions object. With format = "single", each row of the file is read as one order_id/product_id pair.
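To make the "single" layout concrete, here is a small base R sketch that groups product IDs by order. The toy data frame is hypothetical and stands in for the uploaded file. It only illustrates the shape of the data: read.transactions builds a sparse transactions object, not a plain list.

```r
# Hypothetical long-format transactions: one row per (order, product) pair
trans <- data.frame(
  order_id   = c(1, 1, 2, 2, 2, 3),
  product_id = c("A", "B", "A", "B", "C", "C"),
  stringsAsFactors = FALSE
)

# Group the products by order; each list element is one basket
baskets <- split(trans$product_id, trans$order_id)
print(baskets)

# Number of distinct products in each order
basket.sizes <- lengths(lapply(baskets, unique))
print(basket.sizes)
```

Each list element corresponds to one transaction, which is the unit that the frequent item set search counts over.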

Let us look at network.data:

network.data <- reactive({
    transactions.obj <- trans.obj()
    support    <- 0.015
    
    # Frequent item sets
    parameters = list(
      support = support,
      minlen  = 2,  # Minimal number of items per item set
      maxlen  = 2, # Maximal number of items per item set
      target  = "frequent itemsets"
    )
    
    freq.items <- apriori(transactions.obj, parameter = parameters)
    
    # Let us examine our frequent item sets
    freq.items.df <- data.frame(item_set = labels(freq.items)
                                , support = freq.items@quality)
    freq.items.df$item_set <- as.character(freq.items.df$item_set)
    
    # Clean up for item pairs: split "{a,b}" into two columns, then strip the braces
    library(tidyr)
    freq.items.df <- separate(data = freq.items.df, col = item_set, into = c("item.1", "item.2"), sep = ",")
    freq.items.df[] <- lapply(freq.items.df, gsub, pattern='\\{', replacement='')
    freq.items.df[] <- lapply(freq.items.df, gsub, pattern='\\}', replacement='')
    
    # Prepare data for graph
    network.data <- freq.items.df[,c('item.1','item.2','support.count')]
    names(network.data) <- c("from","to","weight")
    # gsub() returns character vectors, so convert the counts back to numbers
    network.data$weight <- as.numeric(network.data$weight)
    return(network.data)
    
  })

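Using the transactions object, the apriori function is invoked to find the frequent product pairs. Because minlen and maxlen are both set to 2, only item sets of exactly two products are returned. The output of apriori is then reshaped into a final data frame, network.data, with the columns from, to, and weight.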
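The reshaping step can be tried in isolation with base R. The sketch below assumes item set labels of the form "{A,B}"; the labels and counts are hypothetical stand-ins for the apriori output, and strsplit is used here in place of tidyr's separate.

```r
# Hypothetical labels in the "{item1,item2}" form used for two-item sets
item_set <- c("{A,B}", "{B,C}")
# Hypothetical counts standing in for the support.count column
count <- c(2, 2)

# Remove both braces in one pass with a character class
stripped <- gsub("[{}]", "", item_set)

# Split each label on the comma into its two items
parts <- strsplit(stripped, ",", fixed = TRUE)

# Assemble the edge list expected by the graph step
network.data <- data.frame(
  from   = vapply(parts, `[`, character(1), 1),
  to     = vapply(parts, `[`, character(1), 2),
  weight = count,
  stringsAsFactors = FALSE
)
print(network.data)
```

Like the server code, this splits on every comma, so a product name that itself contains a comma would be split incorrectly.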

The remaining functions in the server render these outputs to their respective slots in the UI.
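The link between a render function and its UI slot is the shared ID. The following minimal sketch is a separate toy application, not the product pairs one; the ID "preview" and the use of the built-in mtcars data are assumptions made for illustration. It also shows how ui and server are combined into an app object. The full application would additionally need the arules, igraph, and tidyr packages attached, since it calls read.transactions, apriori, graph_from_data_frame, and separate.

```r
library(shiny)

# UI: a single output slot with the ID "preview"
ui <- fluidPage(
  tableOutput("preview")
)

# Server: output$preview fills the slot with the same ID
server <- function(input, output) {
  output$preview <- renderTable({
    head(mtcars, 3)
  })
}

# Combine both halves into an app object
app <- shinyApp(ui = ui, server = server)

# From an interactive session, start it with: runApp(app)
```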

Let us look at what the application looks like when started:

Using the file selector, we can select the transaction file. The transaction file should be a .csv file with two columns: one for the order_id and the other for the product_id. In this version, the column names must be exactly order_id and product_id.
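For a quick trial, a file in this layout can be generated with base R. This is a minimal sketch; the product names and the file name sample_transactions.csv are made up. Quoting is switched off because the server reads the file with quote = "", so quoted fields would keep their quote characters.

```r
# Hypothetical orders: one row per (order, product) pair
sample.trans <- data.frame(
  order_id   = c(1, 1, 2, 2, 2, 3),
  product_id = c("Banana", "Limes", "Banana", "Limes", "Milk", "Milk")
)

# Write to a temporary location, without row names or quotes
csv.path <- file.path(tempdir(), "sample_transactions.csv")
write.csv(sample.trans, csv.path, row.names = FALSE, quote = FALSE)

# Read it back to confirm the header and the row count
check <- read.csv(csv.path, stringsAsFactors = FALSE)
print(names(check))
print(nrow(check))
```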

Once a file is selected, let us see how the screen changes:

We see the orders and the products.

Let us move to the next tab, Product Pairs:

We see the product pairs and their transaction count.
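To see what a pair count means, the same kind of table can be produced by brute force on a small example. This sketch is not how apriori works internally; it enumerates every pair in every order, which is only practical for toy data. The data and the 0.5 threshold are hypothetical (the application itself uses a support of 0.015).

```r
# Hypothetical orders: 1 = {A, B}, 2 = {A, B, C}, 3 = {B, C}
trans <- data.frame(
  order_id   = c(1, 1, 2, 2, 2, 3, 3),
  product_id = c("A", "B", "A", "B", "C", "B", "C"),
  stringsAsFactors = FALSE
)
baskets <- split(trans$product_id, trans$order_id)

# List every unordered product pair within each order
pair.list <- lapply(baskets, function(items) {
  items <- sort(unique(items))
  if (length(items) < 2) return(NULL)
  combn(items, 2, FUN = paste, collapse = ",")
})

# Count in how many orders each pair occurs
pair.counts <- table(unlist(pair.list))
pairs <- data.frame(
  item_set = names(pair.counts),
  count    = as.integer(pair.counts),
  stringsAsFactors = FALSE
)

# Support is the share of orders that contain the pair
pairs$support <- pairs$count / length(baskets)

# Keep only the pairs that reach the minimum support
min.support <- 0.5
frequent <- pairs[pairs$support >= min.support, ]
print(frequent)
```

Here the pair A,C appears in only one of three orders, so it falls below the threshold and is dropped.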

Finally, let us look at the last tab, the community graph:

Our product community graph is displayed.
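The community step can also be run outside the application on a small edge list. The sketch below uses hypothetical pairs and weights. It calls cluster_walktrap, the current igraph name for the walktrap.community function used in the server code, and builds an undirected graph, which is an assumption that suits product pairs with no direction.

```r
library(igraph)

# Hypothetical product pairs: two tightly linked groups joined by one weak edge
network.data <- data.frame(
  from   = c("A", "A", "B", "D", "D", "E", "C"),
  to     = c("B", "C", "C", "E", "F", "F", "D"),
  weight = c(5, 5, 5, 5, 5, 5, 1)
)

# Build an undirected graph; the weight column becomes an edge attribute
my.graph <- graph_from_data_frame(network.data, directed = FALSE)

# Random-walk based community detection; the weight attribute is used by default
communities <- cluster_walktrap(my.graph)

# Which community each product was assigned to, and how large each one is
print(membership(communities))
print(sizes(communities))
```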
