Personnel productivity breakdown

We begin the analysis of each office's teams by investigating their overall compositions and actions. We have a variety of metrics that may help us understand why one team may perform better than another. The following is a list of common metrics that we can use in our HR perspective:

  • Age distribution
  • Salary distribution
  • Employee-retention rate
  • Employee sick and vacation days
  • Employee training and performance

Age distribution

Let's begin with our analysis and compare the age distribution between the two offices. Instead of using a histogram, we use a frequency polygon so that we can compare more than one distribution in the same chart.

Age distribution

Exercise 17.3

  1. Create the following variable:

    Variables

    Details

    Label

    Value

    vEmployeeAgeBinSize

    1

  2. Let's create the following line chart:

    Dimensions

    Details

    Label

    Value

    Age

    =ValueLoop($(=floor(min({$<_Employee_Active_Flag={1}>}
                   [Employee Age]),vEmployeeAgeBinSize)),$(=floor(max({$<_Employee_Active_Flag={1}>}
                   [Employee Age]),vEmployeeAgeBinSize)),vEmployeeAgeBinSize
    )

    Employee Current Office

    Employee Current Office

    Expressions

    Details

    Label

    Value

    Number of Employees

    sum({$<_Employee_Active_Flag={1}>} 
        if(floor([Employee Age],vEmployeeAgeBinSize)=ValueLoop(
              $(=floor(min({$<_Employee_Active_Flag={1}>}[Employee Age]),vEmployeeAgeBinSize)),$(=floor(max({$<_Employee_Active_Flag={1}>} 
                 [Employee Age]),vEmployeeAgeBinSize)),vEmployeeAgeBinSize
            ),1,0)) 
  3. In the Expressions tab, enable the Polynomial of 2nd degree option in the Trendlines section.
  4. In the Presentation tab, disable the Suppress Zero-Values option.
  5. In the Axes tab, enable the Continuous option.
  6. Create an Input Box to edit the vEmployeeAgeBinSize variable.

Normally, we would use the class() function to create a histogram; however, the class() function generates bins from existing values. A bin without any corresponding value behaves in the same way that missing or null values behaved in Chapter 12, Sales Perspective. Although we are not affected by this behavior when we use a bar chart, we are not so lucky when we work with a line chart. The line jumps from one existing bin directly to another without representing the missing bins as zero.

Therefore, we use the valueloop() function rather than the class() function because valueloop() is a list of numbers that have no relationship with the data. We define the range of valueloop() dynamically with the min() and max() functions and the bin size is determined by the value of the vEmployeeAgeBinSize variable. The floor() function helps to round down Employee Age so that we can assign it to the proper bin. We usually use floor() to round down to the nearest one, but we can also use it to round down to the nearest five, ten, or thousand.

Finally, we add a second degree polynomial trend line to get a general idea of the shape of the distributions. As both resemble a normal distribution, this trend line does give us a general idea of the distribution. We can observe that the Wisconsin office in general has a greater number of young employees and fewer middle-aged ones. If the distribution does not resemble a normal distribution, we can try using trend lines of different degrees or, more simply, enlarge the bin size. Notice how different the distributions look without the trend lines and with an Employee Age Bin Size of 5:

Exercise 17.3

Salary distribution

The next analysis entails comparing how well each office pays their employees. We can analyze salary distribution among various groupings, such as office, job function, age, gender, and performance. In the following exercise, we are going to compare how each office pays each job function. Along with the option to use a frequency polygon trellis chart to compare the Salary Distribution by Job Function and Office, we can also use the following box plot chart:

Salary distribution

Exercise 17.4

  1. Let's create the following combo chart:

    Dimensions

    Details

    Label

    Value

    Employee Current Job Title

    Employee Current Job Title

    Employee Current Office

    Employee Current Office

    Expressions

    Label

    Value

    Lower Quartile

    Fractile(
        Aggr(only({$<_Employee_Active_Flag={1}>} 
            [Employee Current Salary]),[Employee Current Office],
            [Employee ID],[Employee Current Job Title])
    , 0.5)
    -Fractile(
        Aggr(only({$<_Employee_Active_Flag={1}>} 
            [Employee Current Salary]),[Employee Current Office],
            [Employee ID],[Employee Current Job Title])
    , 0.25) 

    Upper Quartile

    Fractile(
        Aggr(only({$<_Employee_Active_Flag={1}>} 
            [Employee Salary]),[Employee Current Office],
            [Employee ID],[Employee Current Job Title])
    , 0.75)
    -
    Fractile(
        Aggr(only({$<_Employee_Active_Flag={1}>} 
            [Employee Salary]),[Employee Current Office],
            [Employee ID],[Employee Current Job Title])
    , 0.5)

    Dummy Expression

    =0

  2. In the Expressions tab, enable the Bar option for every expression and the Has Error Bars option for Lower Quartile in the Display Options section.
  3. Define the following attribute expression for Lower Quartile:

    Attribute Expressions

    Label

    Value

    Background Color

    if([Employee Current Office]='Iowa'
       ,ARGB(100,178,171,210)
       ,ARGB(100,253,184,99)
    )

    Bar Offset

    Fractile(
        Aggr(only({$<_Employee_Active_Flag={1}>} 
            [Employee Current Salary]),[Employee Current Office],
            [Employee ID],[Employee Current Job Title])
    , 0.25)

    Error Below

    ([Lower Quartile])- 
    Min(
        Aggr(only({$<_Employee_Active_Flag={1}>} 
            [Employee Current Salary]),[Employee Current Office],
            [Employee ID],[Employee Current Job Title])
    ) 

    Error Above

    -([Lower Quartile]) + 
    Max(
        Aggr(only({$<_Employee_Active_Flag={1}>} 
            [Employee Current Salary]),[Employee Current Office],
            [Employee ID],[Employee Current Job Title])
    )
  4. Define the following attribute expression for Upper Quartile:

    Attribute Expressions

    Label

    Value

     
    if([Employee Current Office]='Iowa'
       ,ARGB(200,178,171,210)
       ,ARGB(200,253,184,99)
    )

    Background Color

  5. In the Error Bars section of the Presentation tab, change Width to Narrow, Thickness to Medium, and Color to a dark gray.
  6. In the Axes tab, define the Static Max with the following code:
    Max(
    Aggr(only({$<_Employee_Active_Flag={1}>} 
    [Employee Current Salary]),[Employee Current Office]
    ,[Employee ID],[Employee Current Job Title])
    )*1.1 

We could create a simple box plot using Box Plot Wizard in the file menu, Tools; however, in this case, we are limited to using one dimension. As a workaround, we use a combo chart with bars and error bars to create a box plot. As a part of this technique, we have to consider the fact that QlikView will draw the same number of error bars as there are expressions. As we have three distinct values in the Employee Current Office field, we add a third, dummy expression that is not visible because the value is zero. If the second dimension has five distinct values, we create a total of three dummy expressions.

Although this trick may seem awkward at first, the chart is not effective when the second dimension has a large number of distinct values. Therefore, we never expect to choose one with more than five or so values and it is not much trouble to create three dummy expressions.

The last adjustment involves the maximum value of the expression axis scale. The maximum that QlikView automatically calculates is unnecessarily large. So, we dynamically calculate the maximum plus a ten percent cushion. This makes it easier to compare the different box plots.

Employee retention rate

As a high-employee turnover can affect a team's productivity, we analyze how many employees leave each month. We also analyze how many people we hire and how our team evolves as they accumulate more years of experience. Along with using histograms, frequency polygons, and box plots to visualize the distribution of experience, we can also show a summarized distribution over time with a stacked bar chart.

Employee retention rate

Exercise 17.5

  1. Let's create the following bar chart:

    Dimensions

    Label

    Value

    Year-Month

    Year-Month

    Expressions

    Label

    Value

    Dismissals

    -sum(Dismissal)

    3+ Years Experience

    count({$<[Employee Tenure in Months]={">36"}>} 
        distinct [Headcount Counter]) 

    2-3 Years Experience

    count({$<[Employee Tenure in Months]={">24<=36"}>}      
        distinct [Headcount Counter]) 

    1-2 Years Experience

    count({$<[Employee Tenure in Months]={">12<=24"}>}      
        distinct [Headcount Counter]) 

    0-1 Year Experience

    count({$<[Employee Tenure in Months]={">0<=12"}>}      
        distinct [Headcount Counter]) 

    New Hires

    sum(Hiring) 
  2. In the Style tab, enable the Stacked option in the Subtype section.

This chart is simple to create because much of the work is done in the script. We create a new row in the fact table for every dismissal and new hire. The Dismissal and Hiring fields contain the value, 1, so that we only have to sum them up to discover the total occurrences of each event. Also, every time we add to the employee payroll event, we calculate how many months have passed since they started working for the company.

Employee vacation and sick days

When our employees take too little vacation and too many sick days, this may indicate an overstressed team who may not be as productive as they could be. We can analyze these events with a bar or line chart, but if we want to visualize them on a daily level, it may be more insightful to use a calendar heat map. This heat map was inspired by Julian Villafuerte's blog post at https://qlikfreak.wordpress.com/2014/03/09/heat-map/.

Exercise 17.6

  1. Let's create the following pivot table chart:

    Dimensions

    Label

    Value

    Month

    Month

    Week

    Week

    <empty>

    Year

    <empty>

    =if(wildmatch(Weekday,'s*','t*')
          ,left(Weekday,2),left(Weekday,1)) 

    Expressions

    Label

    Value

    % Absent

    sum({$<[Absence Type]={'Sick Day','Unexcused'}>} Absence)
    /
    (count(Total <Month,Year> distinct [Headcount Counter])
    * 
    max(Total <Month,Year> Day))
  2. Pivot this table's dimensions as shown in the following figure:
    Exercise 17.6
  3. In the Background Color expression attribute, go through the Colormix Wizard found in the File menu. Use the same formula in the value expression as we did in the metric expression.
  4. Copy the formula generated by Colormix Wizard and paste it in the Text Color expression attribute.
  5. In the Presentation tab, disable the Allow Pivoting option and enable the Always Fully Expanded option. Replace the dashes in Null Symbol and Missing Symbols with spaces.
  6. In the Style tab, disable the Vertical Dimension Cell Borders option.
  7. Open the Custom Format Cell dialog window and make all backgrounds and borders white.

Sick days will tend to increase as the team grows; so, instead of basing our heat map on the actual number of sick days, we calculate sick days as a percentage of the total number of employee working days. According to the chart, after a lull in sick days in the beginning of 2015, there seems to have been an increase in the later part of the year. We can also compare the different offices by adding [Employee Current Office] as a dimension and placing it above the Year dimension.

Employee training and performance

Our final break down of employee productivity is to analyze our employee training and the results of this training. We expect employees to have greater success in the company after they are trained; however, sometimes it is inevitable that they leave or transfer to another office. Along with analyzing whether employees stay with the company after training, we also take advantage of the SCD Type 2 in our HR data model to analyze how often employees transfer or earn promotions after their training.

Employee training and performance

Exercise 17.7

Let's create the following bar chart:

Dimensions

 

Label

Value

Employee Historical Office

Employee Historical Office

Expressions

 

Label

Value

Dismissal

-sum({$<_Employee_Active_Flag={0}>} Training) 

Promotion / Transfer

-count({$<Training={1},_Employee_Active_Flag={1}>} 
distinct 
   if([Employee Historical Job Title] 
           <> [Employee Current Job Title]and [Employee Historical Office]
           <> [Employee Current Office],[Employee ID]))

Transfer

-count({$<Training={1},_Employee_Active_Flag={1}>} 
distinct 
   if([Employee Historical Job Title] 
           = [Employee Current Job Title]and [Employee Historical Office]
           <> [Employee Current Office],[Employee ID])) 

Active w/o Change

sum({$<_Employee_Active_Flag={1}>} Training)-[Promotion]+[Promotion / Transfer]+[Transfer]

Promotion

count({$<Training={1},_Employee_Active_Flag={1}>} 
distinct 
   if([Employee Historical Job Title] 
           <> [Employee Current Job Title]and [Employee Historical Office]
           = [Employee Current Office],[Employee ID]))

The Training={1} set analysis filters the data model, so we only see each employee's historical job title and office at the time he or she was trained. In this data model, we constantly update the current job title and office in every employee record so that we can always compare it with the historical records.

We use an if-statement within the count() function in this example in order to highlight how to compare fields of different SCD types. If we are dealing with a large amount of data, then we migrate this if-statement to the script and create a flag in the data model that indicates which dimension attributes have changed. We can then use this flag in the expression's set analysis.

In the chart, we observe that both offices suffer from a large number of dismissals after training employees and employees in the Wisconsin office tend to transfer when they earn a promotion. This example demonstrates why it is important to understand how the data model handles slowly changing dimensions and how we use both SCD Type 1 and SCD Type 2 to create an insightful analysis.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset