In Chapter 4, “Planning for Business Governance,” we discussed the importance of governance planning for your business, which ensures that the most important part of your SharePoint environment—the content—stays in good shape over the life of your deployment.
This chapter, on the other hand, covers the governance of the underlying infrastructure—the actual SharePoint technical platform. This is especially important if you’re managing your SharePoint farm(s) in-house, since you’ll need to ensure uptime yourself. If you happen to be using Office 365 or SharePoint Online, Microsoft is doing most of the infrastructure governance for you, but you’ll still need to manage some aspects of your environment.
There are two key parts to SharePoint infrastructure governance: operations and applications. Operational governance covers the maintenance of the servers, software, and backups of the data. Application governance covers the management of the customizations made to your SharePoint environment. Both are very important factors in ensuring that your SharePoint offering remains healthy over time.
An analogy we use frequently is that of governing cars and transportation through a “driving and roadway system.” Business governance would focus on training drivers through driver’s education and licensure, making sure they understand the “rules of the road” such as driving on the correct side of the road and stopping at intersections. Extending the analogy, operational governance would be ensuring that our infrastructure—such as roads and bridges—is maintained on a regular basis. And application governance would align with how we’d plan for getting new cars or adding lanes to existing roads (you can’t just shut down a road for weeks on end—you’ve got to come up with a plan that accommodates future growth but does not inhibit existing usage). In all cases, the purpose of the governance plan that you come up with is to manage risk.
SharePoint 2013 adds a new dimension to both operational and application governance. First, with the cloud finally being a viable option with near-parity to what you can do on-premises, you’ll need to consider the impact of both cloud and hybrid deployments on your governance approach. Next, with the new SharePoint 2013 apps model, the way you manage customizations and add-ons to your environment will change as well. Specifically, here are some of the new concepts that you will want to think about with respect to infrastructure governance:
Operational cloud considerations. Choosing SharePoint 2013 in the cloud is now a viable option. This means that the role of a SharePoint administrator will likely shift from someone who manages the entire farm and underlying infrastructure to someone who manages the service applications and site collections in the farm. Or, you may be using a hybrid model (which is very likely if you’re using Yammer for social features), where the SharePoint administrator manages both areas. Modeling your operational governance for these changes—and having the right staff members on the team—will be increasingly important.
Device options. The ever-increasing list of devices supported by SharePoint could mean additional governance considerations for the various operating systems, browsers, and applications that you now must support. Gone are the days of dictating that the only supported combination is Windows 7 running Internet Explorer 8 and Office 2010. With additional Macs, iPads, iPhones, browser types, Office versions (both rich-client and Web-based), and accessibility modes, governing how and when to add or remove device support or upgrade to a CU will get more complex.
Apps model. SharePoint 2013 has introduced a new app model for building and governing additional functionality that runs in the context of SharePoint. Understanding how to manage solutions in the new “apps” model is important, since full-trust and sandboxed solutions are still viable options. Not only must you determine who may build and/or install solutions, you must also govern the way in which they may build and/or install them.
The key to operational governance is simple: keep SharePoint up and running as a viable and dependable service for users. In general, this type of IT governance should be very familiar to IT pros, as it is similar to other applications in a server farm—Exchange, Lync, Windows, Active Directory (AD), or even SQL Server. That said, there are special considerations for SharePoint that are important and are discussed here.
When planning your SharePoint deployment, you have a number of choices. Will you provide a central offering to your business? Will you use a single farm for everything? Will the cloud play a role? Who will administer the central farm(s), service applications, and site collections?
The key is to find the right balance for your organization (see Figure 5-1). A centrally managed offering, where software, services, and sites are hosted and managed by a core IT group, is the most common type. However, some organizations allow various groups to manage their own farms locally or set up their own Office 365 subscriptions directly. Even if you do centrally manage SharePoint, you may find that if yours is a reasonably large organization, you end up deploying several production farms, enabling you to provide different SLAs and configurations for various SharePoint offerings. For example, you might consider putting your intranet and document management system on one farm, your custom applications on a second farm, and your collaboration sites in the cloud.
It’s important to note that if you don’t plan for your deployment model to mature and evolve over time, you may find that you have multiple farms that spring up for dedicated business needs. This is where the deployment model chooses you, rather than the other way around. The last thing you probably need is for SharePoint to grow beyond your control—since you might be the one who has to pull it all back together later.
The SharePoint Health Analyzer, included in SharePoint 2013 as an integrated health analysis tool, is your friend. It will help you identify a lot of potential farm issues quickly. The Health Analyzer enables you to check for potential configuration, performance, and usage problems. The tool works by running predefined health rules against all of the servers in your SharePoint farm. A health rule (defined either by Microsoft or by you) runs a test and returns a status of whether the rule failed or not. When a rule fails, SharePoint Health Analyzer creates an alert on the “Review problems and solutions” page and writes the status to the Windows event log. Don’t ignore these rules, since they could indicate early symptoms of a major problem later. In some cases, the rules provide a false positive, at which point you might need to disable or adjust a rule. This is OK, provided you know what you’re doing!
You might read this subheading and think, “Network connectivity? That’s not my problem.” You might not consider monitoring network connectivity as a SharePoint operational role (and hence not an element of SharePoint operational governance), but it is critical to a healthy SharePoint farm. Poor connectivity is a leading cause of issues with SharePoint: if the servers cannot talk to each other reliably, performance degrades noticeably. In fact, the typical problem is not zero connectivity (which is obvious) but a slow environment that’s not optimized. You should regularly check network performance and response times between SharePoint servers and SQL servers, AD servers, and users. You can test these items with standard monitoring tools such as Microsoft System Center Operations Manager (SCOM), or you can download a free script at www.jornata.com/essentialsharepoint. It’s even more important, not less, to test connectivity when using Office 365, since the Internet connection coming into your organization (which is probably competing with YouTube, Facebook, and Lync calls) is the single point of failure for SharePoint performance.
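As a rough illustration of the response-time check described above, the following Python sketch times how long it takes to open a TCP connection to a server and labels the result. It is not the script mentioned in the text, and the function names and the 50 ms "slow" threshold are assumptions chosen for the example; pick a threshold that fits your own network.

```python
import socket
import time
from typing import Optional


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the time in milliseconds needed to open a TCP connection.

    Returns None if the connection fails or times out.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return None


def classify(latency_ms: Optional[float], slow_ms: float = 50.0) -> str:
    """Label a measurement as 'unreachable', 'slow', or 'ok'.

    slow_ms is an illustrative threshold, not a SharePoint recommendation.
    """
    if latency_ms is None:
        return "unreachable"
    return "slow" if latency_ms > slow_ms else "ok"
```

You could call `classify(tcp_connect_ms(host, port))` from each Web front end against your SQL and AD servers (for example, port 1433 for a default SQL Server instance) and record the results over time, since a slow trend matters more here than a single reading.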
Another major reason that SharePoint environments suddenly fail is lack of sufficient disk space. To ensure uptime, monitor disk space on a daily basis. This can be performed either by manual inspection or by enabling a monitoring tool such as SCOM. There are two primary locations that you should closely monitor: SQL database capacity and local file capacity. SQL database capacity is mostly affected by growing data files and transaction log files. Local file capacity is mostly affected by logging and search index partition files.
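If you want to script the daily disk check rather than inspect it by hand, a minimal Python sketch might look like the following. The 20 percent minimum is an assumed threshold for illustration only, and the function names are hypothetical.

```python
import shutil


def free_percent(total_bytes: int, free_bytes: int) -> float:
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def disk_status(path: str, min_free_percent: float = 20.0) -> str:
    """Return 'ok' or 'low' for the volume that contains path.

    min_free_percent is an assumed threshold; choose one that matches
    how quickly your logs, index files, and databases grow.
    """
    usage = shutil.disk_usage(path)
    pct = free_percent(usage.total, usage.free)
    return "ok" if pct >= min_free_percent else "low"
```

Run `disk_status` against the volumes that hold the SharePoint logs, the search index, and the SQL data and log files, since those are the locations the text singles out.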
Yet another reason that SharePoint environments suddenly fail is having a set of unstable application pools. Application pools run on the server and respond to requests for sites and other functionality. Two factors are pertinent to application pools: the application pool identity and the Web applications that the pool manages. From a security perspective, consider which account the pool runs as, since this account must remain secure. From a process isolation perspective, consider which Web applications share the pool. Monitor the resources used by the application pool. If your application pool fails, so will all the Web applications that it manages. The primary cause of application pool failure is an expired application pool service account password. Make sure this service account is configured so that the password does not expire.
If you are enabling password change management, carefully choose the accounts for which you do so. If the account is used in applications that do not understand or respect managed accounts, those applications will break after the password has been changed. In addition, ensure that service accounts used by SharePoint have passwords that do not automatically expire. Expiration of passwords is the number-two reason why SharePoint goes down.
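One way to keep an eye on this is to maintain a simple inventory of service accounts and flag any whose password is set to expire soon. The Python sketch below assumes you have already exported that information from your directory; the `ServiceAccount` record, the account names, and the 30-day warning window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ServiceAccount:
    name: str
    # None means the password is configured never to expire.
    password_expires_on: Optional[date]


def expiring_accounts(
    accounts: Iterable[ServiceAccount], today: date, warn_days: int = 30
) -> List[Tuple[str, int]]:
    """Return (name, days_left) for accounts expiring within warn_days.

    Already-expired accounts appear with a negative days_left.
    The result is sorted so the most urgent account comes first.
    """
    flagged = []
    for account in accounts:
        if account.password_expires_on is None:
            continue
        days_left = (account.password_expires_on - today).days
        if days_left <= warn_days:
            flagged.append((account.name, days_left))
    return sorted(flagged, key=lambda item: item[1])
```

Any account this function returns is a candidate either for a non-expiring password or for careful handling under password change management, as discussed above.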
Because databases are at the core of the SharePoint farm, they need proper care. The SQL Server piece of the SharePoint puzzle often has inadequate governance—it’s either completely ignored, with the assumption that there’s nothing to do, or it’s overmanaged, typically by overzealous database administrators. For the most part, SharePoint takes care of configuration items such as permissions and roles. And databases and tables shouldn’t be messed with, since SharePoint will manage most other settings, too. However, there are a few items that should be checked on a regular basis: whitespace, fragmentation, and corruption. For details, you can consult the SharePoint TechNet article at http://technet.microsoft.com/en-us/library/cc262731.aspx. In addition, content databases should still be kept below 200GB unless there is a valid reason to go above that number (a read-only archive database, for example, which can grow to 1TB or more).
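The 200GB guideline is easy to check once you have a list of database sizes. The following Python sketch assumes you have gathered the sizes (in bytes) by some other means, such as a SQL query or a report; the database names in the usage note are made up.

```python
from typing import Dict, List

GB = 1024 ** 3  # bytes per gigabyte


def oversized(sizes_bytes: Dict[str, int], limit_gb: float = 200.0) -> List[str]:
    """Return the names of items whose size exceeds limit_gb, sorted by name."""
    limit_bytes = limit_gb * GB
    return sorted(name for name, size in sizes_bytes.items() if size > limit_bytes)
```

For example, `oversized({"WSS_Content_Intranet": 150 * GB, "WSS_Content_Teams": 230 * GB})` flags only the second database. The same function works for the 50GB site collection check in the weekly maintenance list by passing `limit_gb=50`.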
Finally, it’s important to proactively monitor the health of your SharePoint environment. You’ll mainly care about entries that appear in the Windows application log and in the SharePoint Unified Logging Service (ULS) logs. Being familiar with the diagnostic logs can save you valuable troubleshooting time. In addition, a tool like SCOM can monitor the event logs on all SharePoint servers in the farm, relying on preconfigured management packs that help you diagnose issues and recommend solutions. For some organizations that don’t have dedicated SharePoint administrators but do have an on-premises environment, a program like Jornata’s CoPilot for SharePoint infrastructure can help proactively address issues before they arise.
Note
If you haven’t done so already, download the ULS Viewer tool, available on CodePlex, to all your SharePoint servers to assist in troubleshooting Web and service applications.
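For a quick summary before you open a log in a viewer, you can count entries by severity. The Python sketch below assumes that each log line is tab-separated and that the severity level sits in the seventh column; that layout is an assumption you should verify against your own log files, which is why the column position is a parameter.

```python
from collections import Counter
from typing import Iterable

# Assumed zero-based position of the Level column in a tab-separated log line.
DEFAULT_LEVEL_COLUMN = 6


def count_levels(lines: Iterable[str], level_column: int = DEFAULT_LEVEL_COLUMN) -> Counter:
    """Count log entries per severity level.

    Lines with too few columns (for example, wrapped message text) and the
    header row are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        fields = [field.strip() for field in line.rstrip("\n").split("\t")]
        if len(fields) <= level_column:
            continue
        level = fields[level_column]
        if level == "Level":  # header row
            continue
        counts[level] += 1
    return counts
```

You could pass an open file object directly, for example `count_levels(open(path, encoding="utf-8", errors="replace"))`, and then investigate the levels that indicate errors.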
The key to effective operational governance for SharePoint is to have a maintenance schedule for important tasks that should be done on a regular basis. Going back to our car-and-roadway analogy: Think about what would happen to roads and bridges if we didn’t repair them regularly. And what about your car? Most cars need an oil change every three months or so. In fact, your owner’s manual (you know, that thing in your glove compartment that you’ve never looked at) has a suggested maintenance schedule for your vehicle. We’ve come up with one for SharePoint, which can be performed manually by a SharePoint administrator or automated with PowerShell. Even if you automate it, an administrator should review the results on the regular schedule outlined here:
Daily
Ping all servers in all environments (dev/test/prod):
Load balancers
Web Front Ends (WFEs)
Application servers
SQL servers
Check backups to confirm that they completed the night before.
Check available disk space on all servers.
Weekly
Check Windows event logs for errors; investigate as needed.
Check security logs.
Perform an IIS reset on all SharePoint servers (WFEs and application servers).
Review content databases for size (>50GB for site collections or >200GB for total database size).
Check search crawl logs for errors; investigate as needed.
Check User Profile Synchronization Service application synchronization status.
Review SharePoint Health Analyzer errors; address as needed.
Generate a report of newly created team sites.
Review timer jobs for failures.
Review services on the servers; ensure that all services have started properly.
Check ULS logs for errors; investigate as needed.
Ping SQL from the WFE (verify intrafarm communication).
Check performance counters if issues have been reported during the week (memory, CPU, disk).
Check if application pools are started.
Generate a weekly report with
Usage data for the week (hits, unique users, total site collections, etc.)
Disk usage information, memory and CPU usage, uptime and availability, database sizes, top incidents and resolution
Summary of configuration changes to the environment
Monthly
Review the environment for abandoned sites and orphaned content.
Check SQL logs for problems.
Do a SQL database consistency check.
Verify backups by attempting a restore.
Review and report on the SLA for the previous month.
Do capacity planning based on current growth trends.
Evaluate and apply hot fixes, service packs, update rollups, and security updates as needed.
Perform a disaster recovery test.
Quarterly
Review SharePoint Server documentation to ensure that it is still current based on changes introduced in CUs, service packs, server additions, and so on; update it if necessary.
Review existing procedures, including backup, disaster recovery, maintenance, and so on, to ensure that they account for changes introduced in CUs, service packs, or other changes to the system.
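If you automate the schedule, it helps to keep the task list as data so that a script can tell the administrator what is due on a given day. The Python sketch below abbreviates the task lists above and makes some assumptions that are not part of the schedule itself: weekly tasks fall on Mondays, monthly tasks on the first of the month, and quarterly tasks on the first day of January, April, July, and October.

```python
from datetime import date
from typing import Dict, List

# Abbreviated from the schedule above; extend each list as needed.
SCHEDULE: Dict[str, List[str]] = {
    "daily": [
        "Ping all servers in all environments",
        "Confirm last night's backups completed",
        "Check available disk space on all servers",
    ],
    "weekly": [
        "Check Windows event logs and ULS logs for errors",
        "Review content database sizes",
        "Review SharePoint Health Analyzer errors",
    ],
    "monthly": [
        "Verify backups by attempting a restore",
        "Do a SQL database consistency check",
        "Perform a disaster recovery test",
    ],
    "quarterly": [
        "Review SharePoint Server documentation",
        "Review backup, disaster recovery, and maintenance procedures",
    ],
}


def cadences_due(today: date) -> List[str]:
    """Return which cadences fall due on the given date (assumed calendar rules)."""
    due = ["daily"]
    if today.weekday() == 0:  # Monday
        due.append("weekly")
    if today.day == 1:
        due.append("monthly")
        if today.month in (1, 4, 7, 10):
            due.append("quarterly")
    return due


def tasks_due(today: date) -> List[str]:
    """Return the flat list of tasks due on the given date."""
    return [task for cadence in cadences_due(today) for task in SCHEDULE[cadence]]
```

Calling `tasks_due(date.today())` each morning gives the administrator a checklist to review, which keeps a person in the loop even when the checks themselves are automated.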
So what do we mean by application governance? In short, it defines roles and responsibilities, along with policies and procedures, for building, deploying, and managing business solutions that will run on an organization’s central SharePoint environment.
Application governance provides a roadmap and framework for how the organizational lines of business will qualify, build, and deploy a SharePoint solution. It also provides guidelines for how those applications should be constructed and validated before being handed off to the operations team for deployment. Finally, it covers how and when users can use SharePoint apps, whether available from an online store or provided directly by the organization. So if you’re doing any kind of customization to your SharePoint environment, including custom master pages, Web Parts, SharePoint apps, SharePoint Designer changes, sandboxed solutions, third-party solutions, or anything else that changes the way SharePoint works, you’ll need an application governance plan.
In general, there are three major classifications of SharePoint solutions: out-of-the-box customizations, declarative solutions, and custom-coded solutions. Your application governance plan should account for all three.
Out-of-the-box customizations:
SharePoint provides a very flexible way to create simple solutions directly through the Web interface.
Each Line of Business (LOB) may develop its solutions freely.
Declarative solutions:
These are typically solutions built by using SharePoint Designer.
Once authorization for SharePoint Designer is obtained, the LOB may develop its solution in a development environment. The solution is packaged and submitted for deployment.
Custom-coded solutions:
These are built in C# via a .NET assembly deployed as a SharePoint solution package (.wsp) or SharePoint app.
These solutions need a design review.
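The three classifications above can be captured in a small lookup so that an intake form or request tracker applies the same rules every time. This Python sketch is only an illustration; the enum and step wording are taken from the list above, and the names are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class SolutionType(Enum):
    OUT_OF_THE_BOX = "out-of-the-box customization"
    DECLARATIVE = "declarative solution"
    CUSTOM_CODED = "custom-coded solution"


# Steps required before deployment, following the classification list above.
REQUIRED_STEPS: Dict[SolutionType, List[str]] = {
    SolutionType.OUT_OF_THE_BOX: [],  # each LOB may develop freely
    SolutionType.DECLARATIVE: [
        "Obtain SharePoint Designer authorization",
        "Develop in a development environment",
        "Package and submit for deployment",
    ],
    SolutionType.CUSTOM_CODED: [
        "Complete a design review",
    ],
}


def required_steps(solution_type: SolutionType) -> List[str]:
    """Return a copy of the governance steps for a solution type."""
    return list(REQUIRED_STEPS[solution_type])
```

Your own plan will likely add steps to each entry; the point is that the rules live in one place instead of being reinterpreted for each request.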
Much as when planning the operational governance side of your SharePoint deployment, you have a number of choices when it comes to application management (see Figure 5-2). Will you strictly manage any and all development, whereby customizations must adhere to strict rules? Or will you loosely manage development, letting application development teams build anything they want, including full-trust solutions and enabling the use of SharePoint Designer within sites?
At the very least, your governance committee should consider the following questions before you deploy your SharePoint environment (or before your next upgrade):
What customization policies do we adhere to as an organization? What is the “right balance” for us between flexible and open versus strict and cautious?
Is a solution right for our central farm or should we suggest a stand-alone farm?
When do we suggest that a business unit select a cloud-based SharePoint offering?
Do we use life cycle management? For which applications?
Do we have a checklist for which applications to let in?
How do we know what SharePoint features/services we need to support business solutions?
How do we evaluate third-party solutions, including SharePoint apps, for quality and/or appropriateness?
Whom do we allow to create customizations? How are those users or developers or administrators trained, supervised, and supported?
When do we deploy updates to the farm? Who does this?
How do we regression-test custom solutions when a service pack or CU is released?
Your customization policy will determine which types of customizations will be allowed or disallowed, and how you will manage those customizations over time. In addition to answering the previous questions, your policy should include
Processes for analyzing customizations
Processes for piloting and testing customizations
Guidelines for packaging and deploying customizations
Guidelines for updating customizations
Approved tools for development
Roles and responsibilities for who will provide ongoing code support
The new SharePoint apps model, which provides a new way to deliver information or functionality to a SharePoint site, requires special governance considerations. Before you allow site owners to install apps in a SharePoint environment, you must plan how you’ll support them.
Some key decision points regarding your overall apps policy include the following:
Whether to allow site owners to be able to install and use SharePoint apps. If you decide to disable the use of apps, you’ll need to prevent site owners from downloading them. Since the error that users get when trying to install an app is not intuitive, you’ll need to communicate this policy to users and site owners. In addition, make this decision clear in your customization policy.
Which specific SharePoint apps site owners can install and use. To restrict the list, you’ll need to set up an app catalog to provide a set of apps for SharePoint that site owners can install and use, or use the app request feature to control the purchasing and licensing of apps for SharePoint. You can also restrict the environments in which apps will run.
Who can purchase SharePoint apps. You should create a request process that requires site owners to submit a request that your organization reviews to make sure that appropriate persons make purchases.
Who can install SharePoint apps. Users must have the Manage Web Site and Create Subsites permissions to install an app for SharePoint. By default, these permissions are available only to users who have the Full Control permission level or who are in the Site Owners group.
Which apps should be monitored. You can determine which specific apps should be monitored within a farm.
How you’ll control licensing, especially if you allow site owners to download and install SharePoint apps from the Internet. SharePoint 2013 does not enforce app licenses. The Office Store handles payments for the licenses, issues the correct licenses, and provides the process to verify license integrity. Note that licensing works only for apps that are distributed through the Office Store. Apps that you purchase from another source and apps that you develop internally must implement their own licensing mechanisms.
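The installation rule in the list above (a user needs both permissions, not just one) is the kind of check that is easy to get wrong in an audit script. The Python sketch below models it with plain strings; it does not call any SharePoint API, and the permission labels are written out only for illustration.

```python
from typing import Iterable, List

# The two permissions the text says are required to install an app.
REQUIRED_TO_INSTALL = frozenset({"Manage Web Site", "Create Subsites"})


def missing_permissions(user_permissions: Iterable[str]) -> List[str]:
    """Return the required permissions the user lacks, sorted by name."""
    return sorted(REQUIRED_TO_INSTALL - set(user_permissions))


def can_install_app(user_permissions: Iterable[str]) -> bool:
    """Return True only if the user holds every required permission."""
    return not missing_permissions(user_permissions)
```

For example, a user who holds only "Manage Web Site" cannot install an app, and `missing_permissions` reports which permission is absent.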
For each type of solution, you’ll want to have a checklist in place. Here’s a suggested list of items for each type.
Out-of-the-box customizations:
Encourage teams to create simple solutions directly through the Web interface.
Provide guidance to power users regarding the use of various tools such as Excel, Access, and Visio (since you might not support Excel Services, Access Services, or Visio Services).
Educate power users about the 5,000-item governor within lists (the list view threshold); lists will support millions of items when partitioned and indexed properly, but views should always be used to limit how many items a query returns for display.
Declarative solutions:
Decide which teams can make use of SharePoint Designer; enable the tool only for certified users.
Once authorization for SharePoint Designer is obtained, the LOB may develop its solution only in a development environment; the solution must be packaged and submitted for deployment using standard means.
Custom-coded solutions:
Since these solutions are typically built in C# via a .NET assembly deployed as a SharePoint solution package (.wsp), they’ll need extra scrutiny.
These solutions need a design review; see Table 5-1 for a suggested checklist.
If you plan to do any customizations outside of simple Web-based modifications to sites and lists, you’ll need to establish at least one other environment in addition to your production farm.
For example, it’s helpful to have a development integration environment for testing customizations; this environment should mirror production in terms of configuration. All new functionality, whether native to SharePoint, custom-developed, or third-party, should first be tested in the development environment.
It’s also useful to have a staging environment, which should be updated with production content on a regular basis. Staging is typically done so that users can perform quality assurance testing with content that matches production content.
The number of non-production SharePoint environments really comes down to three factors: your tolerance for risk, how many and what kinds of changes you make to your environment, and your ability to adequately maintain the extra farms.
The key points from this chapter are:
Remember that managing risk is your key goal. Put policies and procedures in place to ensure uptime of the environment and deployability and stability of custom solutions.
Establish an operational governance plan to ensure system uptime and to ensure that IT understands its roles and responsibilities with respect to SharePoint.
Establish an application governance plan (also known as a customization policy) to ensure quality and relevance of custom applications built on top of SharePoint and to ensure that all users understand their roles and responsibilities.
Remember that governance is really about both assurance and guidance—but it takes commitment to ensure that your governance plan is followed.
Maintaining a central SharePoint farm is no small task; keep in mind that it’s much more involved than other services like e-mail or instant messaging.
The types of solutions you allow are critical—make sure you think through which solutions you want to allow, considering the trade-off between allowing lots of customization and keeping the server environment stable.
Take the time to establish a methodology—it could closely align with your existing development life cycle, or it could be SharePoint-specific.
Make sure that your governance plan is included in all of your SharePoint IT and developer training. You will be most successful if your IT team never learns how to do a task that doesn’t follow your guidelines.
Keep in mind that an effective governance plan doesn’t have to constrain every move—it has to provide guidance to the team to ensure that your solution remains effective and vibrant over time.