Chapter 12. Service Support and the Service Desk

In This Chapter

Seeing what a service desk does
Understanding event management
Measuring service desk performance

One of the fundamental truths of service management is that when you do it well, the service management team is like the wizard behind the curtain in the Land of Oz. If your e-mail never goes down and your technical equipment never fails, you don't go looking behind the curtain to understand what went wrong.

The reality is that services do fail and errors do occur − and when they do, customers (or service users) need to have their questions answered and problems resolved. Whatever a problem is, it must be reported, diagnosed, evaluated, and fixed quickly.

This chapter defines the service desk, describes its parts, and explains its activity.

Watching the Service Desk in Action . . . or Inaction

For many businesses, the service desk is the first port of call in customer interactions. Imagine the lost productivity and revenue, and the all-around chaos, that would occur if companies didn't have effective systems to manage IT service delivery and deal with problems effectively when they arose.

Suppose that you manage a retail store for Poor Service Corp. You have 10 point-of-sale (POS) systems, 15 phones on voice over Internet Protocol (VoIP), 5 customer kiosks, and several back-office PCs and servers. That's a lot of technology that needs to be monitored and managed. Unfortunately, the service desk at Poor Service Corp. is inconsistent and disorganized. Users have a service desk number to call if the POS system or phone doesn't work, but when a kiosk fails, they must call a different number for service, and if any of the PCs fails, they have yet another number to call. Nobody is quite sure why, but that's how things work.

Everyone in your store avoids calling the service desk whenever possible; the desk workers often get things wrong on the first try and take a very long time to resolve problems. This type of service desk is inconsistent. In fact, one person who works at the service desk is very knowledgeable, but if he isn't around to answer questions, the rest of the team members are a little lost.

Frustrated employees sometimes just move to a different POS unit when one fails, and by the time someone calls the service desk, an urgent problem has developed (such as the failure of several devices). Also, recent equipment failures indicate a lack of compliance with Payment Card Industry (PCI) − data security standards required by the credit card companies you work with.

Can you see where this scenario is heading? The service support process takes too long, costs too much, and leaves you providing poor service to your customers.

A properly functioning service desk does the following things consistently and quickly to meet customers' service expectations:

Diagnoses a problem correctly.
Evaluates how to fix the problem.
Fixes the problem.

Note

The ultimate goal of service management is to anticipate problems so that they never materialize and service desk calls are minimized.

Seeing How a Service Desk Works

A service desk provides a single point of contact for IT users and customers to report any issues they may have with the IT service (or, in some cases, with IT's customer service).

Lest you think that a service desk is a one-size-fits-all proposition, let us assure you that service desks come in many forms and styles:

The local service desk (departmental)
The consolidated service desk
The Software as a Service (SaaS) service desk model

The service desk may be merged with the customer service desk, for example. The first responder to any call may be a call-center worker who isn't employed directly by the company but who can respond to several common problems before passing the call on to the "real" service desk.

Goals of the service desk

A service desk has several objectives:

Problem resolution: First and foremost, the desk is there to help resolve issues and problems as quickly as possible. This task involves not only recognizing and resolving relatively simple issues, but also prioritizing problems that may have a greater impact. An outage at an insurance-company system that provides quotes to potential customers, for example, may take higher priority than a problem with the part of the company intranet that provides information about the employee discount program. The service desk has to know what's mission-critical and what isn't.
Service restoration: The service desk works to restore service as quickly as possible to maintain service-level agreements (SLAs). These SLAs often take some time to put in place and require a lot of negotiation. Therefore, a key service desk role is ensuring that the agreements are enforced to the best of the company's ability, which means tracking and monitoring service levels.
System support: The service desk provides system support, which includes dealing with any incidents and problems, and may also involve dealing with issues such as change and configuration management.
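
The prioritization described under problem resolution can be sketched in a few lines of Python. This is only an illustration: the service names, the criticality ratings, and the `prioritize` helper are assumptions made for the example, not part of any particular service desk product.

```python
from dataclasses import dataclass

# Hypothetical criticality ratings (1 = mission-critical).
# In practice the business, not IT alone, decides these values.
CRITICALITY = {"quote-system": 1, "intranet-discounts": 4}


@dataclass
class Issue:
    service: str
    summary: str


def prioritize(issues, criticality=CRITICALITY, default=5):
    """Return issues ordered most critical first (lowest rating first)."""
    return sorted(issues, key=lambda issue: criticality.get(issue.service, default))


queue = prioritize([
    Issue("intranet-discounts", "Discount page shows stale data"),
    Issue("quote-system", "Quote engine is down"),
])
```

With these assumed ratings, the quote-system outage moves to the front of the queue even though it was reported second.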

Note

Handling service desk issues involves many processes, including recording requests, assessing issues, routing requests, diagnosing and resolving problems, tracking, notification, and reporting, to name a few.

Functions of the service desk

Many service desks deal with issues beyond incident and problem reporting. What actually happens inside the service desk can be fairly sophisticated.

A fairly comprehensive service desk may offer the following set of functions:

Communication via multiple channels: The desk supports a wide variety of communication styles, including phone, e-mail, online forms, and even mobile communications. This communication is a two-way street: People can use the channels to report issues, and IT can use the channels to notify customers about the status and resolution of issues.
Incident and problem management: The desk supports the assessment, prioritization, resolution, notification, and reporting of small incidents or major problems. An incident becomes a problem when it happens more than a few times. Management includes recording, routing, and resolving an issue; notifying interested parties of the status of the issue; and reporting on the issue.
Change management: The desk supports the management of change requests, including information about how various parts of a system interact. Often, a system change actually causes an incident or a problem.
Configuration management: The desk supports mapping of IT resources to the business processes that they support. Configuration management often entails the use of a configuration management database, which we describe in the sidebar "Providing visibility into your company's infrastructure," later in this chapter.
Knowledge base: If service desk personnel don't have the right information to do their jobs, the jobs won't get done efficiently or effectively. Knowledge management ensures that people get the information that they need to do their jobs correctly. Service management systems often link to a database that stores information about past incidents and how they were resolved; this database speeds incident resolution.

Managing Events

The event management process involves three simple steps:

Event reporting
Problem diagnosis
Problem remediation and verification

We cover all these steps in detail in the following sections.

Reporting on events

The service desk receives notifications of events (issues) via phone, fax, e-mail, Web, and mobile devices or directly from automated monitoring capabilities deployed within IT systems. Then it attempts to solve the underlying problems, normally by passing details of the event to the support staff.

Figure 12-1 illustrates the processes by which incidents are reported to the service desk:

An incident report comes from either of two sources:
- Customers or employees
- Monitoring software that raises automatic reports
The service desk immediately resolves any well-known issues, such as lost passwords or PC hardware failures.
The vast majority of issues reported don't involve sophisticated diagnosis.
The service desk generates a trouble ticket summarizing everything that is known about the event.
The problem moves into the diagnosis phase.
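
The reporting steps above can be sketched as a small intake function. The `KNOWN_FIXES` table, the `Ticket` fields, and `handle_report` are hypothetical names chosen for this example; a real service desk tool would have its own data model.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical table of well-known issues and their standard fixes.
KNOWN_FIXES = {
    "lost password": "Reset the password through the self-service tool.",
}

_ticket_ids = count(1)  # simple in-memory ticket numbering


@dataclass
class Ticket:
    id: int
    source: str          # "user" or "monitoring"
    description: str
    notes: list = field(default_factory=list)


def handle_report(source, description):
    """Resolve well-known issues at once; otherwise open a trouble ticket."""
    key = description.strip().lower()
    if key in KNOWN_FIXES:
        return "resolved", KNOWN_FIXES[key]
    # Not a well-known issue: record what is known and move to diagnosis.
    return "diagnosis", Ticket(next(_ticket_ids), source, description)
```

A report from a person and a report from monitoring software pass through the same function; only the `source` value differs.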

Diagnosing problems

Most service desks began as help desks that dealt reactively with incidents and problems, which remain important support issues. Two core service desk features are worth looking at, however.

Think of an incident as being an event that somehow interrupts or negatively affects the quality of a service. Front-line support staff can handle many relatively simple events, such as a printer failure or an employee's inability to reset his password, but they also have to determine when an event is a serious incident and deal with it effectively.

Figure 12-2 illustrates how a service desk typically manages and diagnoses incidents.

Figure 12-1. Incident reporting.

Figure 12-2. Incident management and diagnosis.

Consider an example that shows how the process works. Janice works at the service desk of an electronics company. When she finishes her coffee and logs into her incident management system in the morning, she sees a helpful user interface tailored for her job. What does she see? Typically, this screen displays all the incidents she's dealing with, alerts about outages, some status reports, and a function that allows her to create a new incident report.

When the phone rings, Janice picks it up. Someone in the customer service department says that his printer isn't working. Janice must create a new incident trouble ticket. Because this kind of incident happens all the time, her service desk has templates for it. She asks the user his name, and she pulls up information about him and the printer to which he has access. The information about these particular assets is stored in the all-important configuration management database (CMDB) so that she can actually see the type of printer that isn't working. (See the nearby sidebar "Providing visibility into your company's infrastructure" for more information on CMDBs.)

Because the customer service department gets high priority at her company, Janice wants to deal with the problem right away. She asks the user what the error message is; then she pulls up her knowledge base and accesses possible solutions to this particular problem. Voilà − one of these solutions works, and she's done with that particular issue.

Next, she gets an e-mail from someone in sales who says that one of the company's product-ordering Web pages is degrading steadily. Janice creates a trouble ticket for this problem. The system automatically generates a severity code of 1 (meaning that the problem has to be dealt with immediately) and routes the ticket to Sarah, who's part of the Web site engineering team. Then Janice goes back to reading reports on other open incidents.
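
How a system might assign a severity code and route a ticket automatically can be sketched as a simple rule lookup. The keywords, team names, and every severity value other than 1 are assumptions made for illustration; the chapter says only that severity 1 means the problem must be dealt with immediately.

```python
# Hypothetical rules: keyword found in the description -> (severity, team).
ROUTING_RULES = [
    ("ordering", 1, "web-engineering"),   # revenue-affecting pages
    ("printer", 3, "desktop-support"),
]


def classify(description, rules=ROUTING_RULES, default=(4, "service-desk")):
    """Return (severity, team) for the first rule whose keyword matches."""
    text = description.lower()
    for keyword, severity, team in rules:
        if keyword in text:
            return severity, team
    return default


severity, team = classify("Product-ordering Web page is degrading steadily")
```

Here the ticket about the ordering page receives severity 1 and goes to the Web engineering queue, while anything that matches no rule stays with the service desk at a low severity.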

Note

If multiple events of the same type occur, or if multiple events occur that appear to be related to the same underlying problem, in theory these events ought to fall into just one trouble ticket. In reality, however, the connection among events may be clear only to a subject-matter expert, so service desk staffers have to be aware of all open trouble tickets and their status.
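
A naive way to fold related events into one trouble ticket is to group them by a shared key. The sketch below assumes that each event names a component and a symptom; those field names are hypothetical, and a key this simple will miss exactly the connections that only a subject-matter expert would spot.

```python
from collections import defaultdict


def group_events(events):
    """Group events by (component, symptom); each group maps to one ticket."""
    tickets = defaultdict(list)
    for event in events:
        tickets[(event["component"], event["symptom"])].append(event["reporter"])
    return dict(tickets)


events = [
    {"component": "order-page", "symptom": "slow", "reporter": "sales"},
    {"component": "order-page", "symptom": "slow", "reporter": "monitoring"},
    {"component": "printer-3", "symptom": "offline", "reporter": "customer-service"},
]
tickets = group_events(events)
```

The two reports about the slow ordering page end up on a single ticket, and the printer report gets its own.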

Remediating and verifying problems

To continue the example from the preceding section, the trouble ticket goes to a specific support area where the problem is identified. After the problem has been fixed, its resolution is verified. Only at this point is the solution implemented.

Note

Providing visibility into your company's infrastructure

A configuration management database (CMDB; see Chapter 9) contains information about all of a company's assets that make up the information system infrastructure. These assets are often referred to as configuration items (CIs). These items may be servers, laptops, network elements, applications, and so on. In addition to this information, the CMDB may hold information about known errors, incidents, problems, changes, and release information. An important function of a CMDB is tracking changes in these items, because these changes can affect service.

The CMDB is an organization's information hub, holding all the relationships of system components. The idea of an information repository for information assets has been around for years, but with the growing importance of service management and the Information Technology Infrastructure Library (ITIL), it has gained more steam. (For details on ITIL, see Chapter 5.)
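
To make the idea of configuration items and change tracking concrete, here is a minimal in-memory sketch. It is not how any commercial CMDB is implemented; the `ConfigurationItem` class, its fields, and the sample printer are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConfigurationItem:
    name: str
    kind: str                                        # e.g. "server", "printer"
    attributes: dict = field(default_factory=dict)
    history: list = field(default_factory=list)      # record of every change

    def change(self, key, value, changed_by):
        """Update an attribute and keep a record of the old and new values."""
        old = self.attributes.get(key)
        self.history.append((datetime.now(timezone.utc), key, old, value, changed_by))
        self.attributes[key] = value


cmdb = {}                                            # name -> configuration item
printer = ConfigurationItem("printer-3", "printer", {"driver": "1.0"})
cmdb[printer.name] = printer
printer.change("driver", "1.1", changed_by="desktop-support")
```

Because every change is recorded with its previous value, someone investigating an incident can see what was altered on the item and by whom.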

Remediation

Figure 12-3 illustrates various support areas that may get involved in the resolution of problems.

Figure 12-3. Remediation and verification.

Suppose that performance degradation has been reported in some application. The problem could stem from any of the following issues:

Configuration management: Someone made an error while changing a configuration.
Change management: An implemented software patch caused the problem.
Network: The network gets overloaded when California wakes up.
Desktop or device: Someone overloaded her PC, causing slow communications.
Database: A database table needs to be optimized.
System management: A server's processors failed.
IT security: A denial-of-service attack is in progress.
Application: A program has a bug.

The truth is that just about any area of service management can be involved in one way or another. Figure 12-3 indicates this possibility via the link from the CMDB to the integration infrastructure.

Suppose that Janice, on the service desk, passes a trouble ticket to Sarah, on the Web site engineering team. Sarah investigates the problem. She determines that the performance problem has to do with a server resource issue and that the server needs an upgrade (or that the Web site application needs a larger server). In the service management system, she notes the nature of the problem and the fact that this new server will be provisioned that day. She also updates the data on this problem and its solution in the knowledge base for future reference.

If someone in sales calls the service desk again, the desk will have a complete record of the whole event; it can report what the problem is and when it will be fixed.

Tip

Using the war-room technique

Occasionally, a problem comes up that stumps everyone. What's the cause? In this chapter's running example, the service desk employee knows who can best deal with the vast majority of problems, but she isn't sure exactly where a certain problem lies. Worse, the problem is a serious one that needs to be handled quickly because key service levels are being threatened.

Sometimes, performance problems are of that ilk. The problem may be in a database, in application software, in middleware, in the network, in server hardware − or in some combination of these elements. If the service desk simply passes the problem on to one of the teams that's responsible for one of these areas, that team may decide that the problem lies elsewhere and pass it back. The process can be repeated over and over, with the problem never being properly addressed.

The war room is designed to prevent such an outcome. It involves having members of all relevant support teams meet to form a short-lived group that resolves the problem collectively by analyzing it from every angle and determining a plan of action. Ideally, such a team can use the information-gathering capabilities of every team member.

Verification

The top of Figure 12-3 shows four processes involved in verification:

Governance and compliance
Provisioning
Testing environments
Backup and recovery

Sometimes, you know that a given action definitely solves a problem, as is often the case when hardware fails. The hardware is replaced, and the problem is solved immediately.

In the following circumstances, you must recover the application before implementing the solution:

When the support engineer doesn't know for sure that a given solution will resolve the problem − and could make the problem worse
When data has been corrupted

Note

Suppose that you can't solve a problem the way you thought you could. First, make sure that the right people do the following things with any changes:

Evaluate and authorize
Record
Test and validate

Warning

Standard processes are key in remediation and validation. If you don't have these processes in place, you'll never be able to keep track of anything. Also, the processes for remediation and verification are often defined as part of governance practice. (For more information on governance, refer to Chapter 10.) Make sure that you log everything you do to resolve a problem, including all attempts to solve it.

Tracking Service Key Performance Indicators

Note

It's important that services, even relatively unimportant ones, have defined service levels. If you look at a service level another way, it's a key performance indicator (KPI).

Change and configuration management

Very often, application performance failures are caused by recent changes, either in programs or in software or hardware configurations. Changes affect service levels, sometimes positively and sometimes negatively, and a change in one part of a system can easily affect downstream parts of the system (or even other systems).

Statistics suggest that many performance problems and system failures stem from errors made when configurations are changed. Consequently, change and configuration management are moving under the purview of the service desk.

The change management process ensures that standard procedures are used to handle all changes to prevent negative effects on service quality. Configuration management provides a logical model of the infrastructure or a service by identifying, controlling, maintaining, and verifying the configuration items. The idea is to understand the relationships among all the services that are part of the enterprise. That way, if one service has a problem, you have a good idea of how the problem may affect another service.
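
Understanding how a problem in one service may affect another amounts to walking a dependency map. The sketch below assumes a hypothetical `DEPENDENTS` table in which each item lists the items that depend on it directly; the item names are invented for the example.

```python
from collections import deque

# Hypothetical map: each item -> the items that depend on it directly.
DEPENDENTS = {
    "db-server": ["orders-db"],
    "orders-db": ["order-page", "reporting"],
    "order-page": [],
    "reporting": [],
}


def impacted_by(item, dependents=DEPENDENTS):
    """Return every item downstream of `item`, found by breadth-first search."""
    seen, queue = set(), deque([item])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


affected = impacted_by("db-server")
```

A change to the database server in this assumed map reaches the orders database, the ordering page, and reporting, which is the kind of answer a change reviewer needs before approving the change.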

A nurse may need access to patient records around the clock, for example, so the system that supports records management and delivery needs to meet this criterion. On the other hand, a human resources system that lets employees see how much they've spent for out-of-pocket medical expenses may not be as critical, so a 24/7 service level would be unlikely.

Negotiating SLAs is often a dance between IT and the business. Some service levels are non-negotiable, such as the mission-critical one outlined in the preceding paragraph; others have more wiggle room. IT and the business must work together to establish these SLAs.

Note

Typical SLAs include the following:

Response times (possibly varying by transaction)
Availability on any given day
Overall uptime target
Agreed-on response times and procedures in the event that a service goes down
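
An uptime target translates directly into a downtime budget, and the arithmetic is worth seeing once. The two helper functions below are illustrative names, and the 99.9 percent target over 30 days is an assumed example, not a figure from any specific agreement.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_target, period_days):
    """Downtime budget implied by an uptime target over a period."""
    return (1 - uptime_target) * period_days * MINUTES_PER_DAY


def availability(downtime_minutes, period_days):
    """Fraction of the period during which the service was up."""
    total = period_days * MINUTES_PER_DAY
    return (total - downtime_minutes) / total


budget = allowed_downtime_minutes(0.999, 30)   # roughly 43 minutes
```

Under these assumptions, a service that is down for an hour in a 30-day month has already missed a 99.9 percent target.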

Service-level metrics

When the agreements are in place, you must manage and track them. These service-level metrics or KPIs are stored in either of two places:

The SLA management system
The CMDB

Tip

Some service desk systems can link to the availability monitoring systems to confirm that mission-critical systems are identified and monitored as part of the SLA management process. This system helps IT prioritize issues and ensure that the right resources are allocated. Often, this monitoring can be done at the individual user level. The service-level management system can also link to the incident, problem, change, and configuration management systems to provide visibility into these functions. Typically, these systems also provide reports that outline certain SLA metrics for end users.

Service desk metrics

The KPIs for the service desk itself are expressed in terms of problem resolution. Ideally, incidents are classified according to type, and three specific times are recorded:

Time to identify problem: In some circumstances, a problem may exist for a long time before it is reported, indicating that monitoring systems may need to be reviewed.
Time to diagnose: This metric is the time between an event report and the identification of the cause of the problem.
Time to fix: This metric is the time between diagnosis and system repair or resumption of service.
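
The three times can be computed from four timestamps on an incident record. The sketch assumes the desk records when the problem began, when it was reported, when the cause was identified, and when service resumed; the function name and the sample times are hypothetical.

```python
from datetime import datetime, timedelta


def desk_metrics(started, reported, diagnosed, fixed):
    """Return the three service desk times as timedelta values."""
    return {
        "time_to_identify": reported - started,    # problem existed, unreported
        "time_to_diagnose": diagnosed - reported,  # report -> cause identified
        "time_to_fix": fixed - diagnosed,          # diagnosis -> service resumed
    }


metrics = desk_metrics(
    started=datetime(2024, 1, 8, 9, 0),
    reported=datetime(2024, 1, 8, 9, 30),
    diagnosed=datetime(2024, 1, 8, 10, 15),
    fixed=datetime(2024, 1, 8, 12, 15),
)
```

In practice the start time is often an estimate taken from monitoring data, which is one reason a long time to identify points back at the monitoring systems.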

The analysis of the performance of the service desk and the support teams against these KPIs needs to be carried out intelligently. Culturally, it is important to encourage employees to report incidents and to continually improve the process of managing problems.

Also, comparing one month with another may not be comparing like with like. If everyone's using a new version of an operating system, for example, the number of incidents at the service desk may rise simply because of the operating-system change. Yet it may not be possible to revert to the older software (because it's no longer supported, for example).
