Chapter 3: Architecture and Capacity Planning

What’s In This Chapter?

  • SharePoint products and licenses
  • Critical non-SharePoint servers
  • Hardware specifications
  • Tools for controlling your deployment

SharePoint 2010 has greatly expanded its functionality from previous versions. New features include the following:

  • Office Web Apps, which let you display and edit Office documents in the browser
  • The Fluent UI, aka the Ribbon
  • An enhanced social experience, including tagging and notes
  • More flexibility regarding how web applications consume services through service applications
  • A full-scale business intelligence offering through tools such as PerformancePoint Services, Reporting Services, and Excel Services

Those and about a million (not an exact number) other new features in SharePoint 2010 are reasons why people are so excited about the product. Of course, all of that new functionality means that people will use SharePoint for more tasks than ever before, and that increased traffic places more demands on the hardware. As a result, administrators can anticipate a strong increase in the number and size of servers in their farms. In short, expect the same user who worked on a SharePoint 2007 farm to generate more requests per second (RPS) on a SharePoint 2010 farm.

To help you scale, the Shared Services Provider (SSP) from 2007 has gone the way of the Dodo bird. In its place, Microsoft has introduced a new services architecture that is infinitely more configurable, which also means it is infinitely more complicated.

This chapter begins with a primer on the different versions of SharePoint 2010 you can expect to see, including a brief overview of SharePoint in the cloud. Then, because a SharePoint DVD isn’t good for much more than a coaster until you have the supporting servers on which to install it, you will learn all of the software requirements you are expected to bring to the table.

Armed with the necessary software knowledge, the discussion will turn to hardware, including the amount of metal you will need, ways you might consider scaling up versus out, and where virtualization might come into play. Finally, the chapter describes some of the clever new tools available that will help you keep SharePoint in check in your environment.

While this might not be as exciting as a blockbuster Hollywood movie, it will turn you into what every good summer blockbuster needs: a superhero. Therefore, read carefully and make sure you picked up your cape from the dry cleaners.

What’s in a Name?

As it turns out, in the Microsoft world there is quite a bit to a name. If you have any past SharePoint experience, you probably know the names of the previous products, Windows SharePoint Services 3.0 (WSS) and Microsoft Office SharePoint Server 2007 (MOSS). And given the success of these two products, it is pretty reasonable to assume their names would continue. This time around, however, the names are SharePoint Foundation 2010 and SharePoint Server 2010.

The key change here to note is the removal of Windows and Office from the product names. This speaks volumes about the success of SharePoint. The product is now considered to stand on its own and is no longer part of the Windows or Office groups. Being decoupled from these two groups will open the door for SharePoint to be more agile going forward and will be nothing but good news for everyone in the SharePoint fan club.

SharePoint Foundation

SharePoint Foundation will continue the legacy that WSS has established over the years and should keep that friendly price point that makes it so attractive. The name SharePoint Foundation is a perfect match for what the program brings to the table. As an administrator, it is easy to think of the product only in terms of the features you readily see in the browser — things like creating team sites and collaborating on content within lists and libraries, or features such as blogs, wikis, RSS feeds, alerts, and easy browser-based customizations.

Yet underneath all of that great functionality is where some of the true power of SharePoint is hidden. Here, the foundation provides developers with a great platform to build from. Out of the box, it handles storage, web presentation, authorization, user management, and has an interface into the Windows Workflow Foundation — and because all of this functionality is easily accessible through the object model, APIs, and web services, it can greatly accelerate a developer’s job. Rather than build all of those infrastructure pieces for every web-based product, developers can leverage SharePoint Foundation and concentrate on just building the solution.

SharePoint Server 2010

SharePoint Server 2010 is considered the premium product. It offers additional collaboration capabilities and extends the scenarios beyond Foundation. Through its tools, it enables better aggregation and displaying of content, which makes building things such as portals much simpler. It also introduces additional web content management tools that enable developers to use Server as a platform for building Internet-facing websites.

This is all done by extending the capabilities introduced by SharePoint Foundation. Any time you install SharePoint Server, the Foundation product is installed automatically as well. Keep this in mind as you manage your environments, making sure you keep current on both Foundation and Server issues, as you really have both products. When you are doing tasks like applying service packs or adding third-party applications, this knowledge might just change how you go about things.

Standard and Enterprise

As in the past, SharePoint Server 2010 is available primarily in two flavors, either Standard or Enterprise. Standard introduces core functionality like social, search, and advanced web and enterprise content management. Enterprise focuses primarily on adding functionality through new service applications, introducing business intelligence, line-of-business integration, reporting, and some Office client services such as Visio Services.

This is all provided through a client access license (CAL) model. You only need to run one setup program, which puts all of the binaries on the server; and based on the license key you enter, either the Standard features will be available or both the Standard and Enterprise features. You are required to have an appropriate CAL for each user accessing SharePoint Server 2010.

NOTE

SharePoint licensing is a notorious black hole and cannot be covered completely in this book, as each scenario is unique and should be treated as such. The information provided here is guidance to help you understand the concepts in order to make more informed decisions and to understand the platform as a whole, but it should not be considered the final word on licensing. Please consult your license reseller or Microsoft Licensing for specifics before making any purchase.

For Internet Sites

Because of the tremendous popularity of SharePoint on the intranet, a new trend has been for companies to also host their Internet site on SharePoint. From a business perspective this can greatly reduce the cost through the reuse of a familiar tool. By not having two separate platforms to create “websites,” it is no longer necessary to maintain two separate yet similar skill sets. This also reduces licensing and hardware costs because some systems, such as development environments, can be used for both internal and external projects. This is usually not possible if you have two separate products in place. For most companies, reducing training, licensing, development, and administrative costs sounds like just the answer bean counters are looking for.

While MOSS had an Internet license, it was an all-or-nothing model. It included all of the MOSS Enterprise features, which typically meant a hefty price tag. However, many companies only wanted the core web content management (WCM) functionality, not those extra features and their associated cost. Microsoft heard these pleas loudly and clearly, and addressed the issue in SharePoint Server 2010.

There is now a SharePoint Server 2010 for Internet Sites Standard license as well as an Enterprise license. You will often hear either license referred to by the acronym FIS, regardless of which edition the speaker actually means. The goal of introducing the Standard license is to offer smaller sites the capability to deploy on Server for their web presence. The Standard license is throttled to permit only a set amount of traffic, and it does not include the Enterprise features. However, it should be just what the doctor ordered when it comes to getting a simple WWW site up and off the ground.

One very important thing to understand about FIS is where it is applicable. Remember that everyone who will access the SharePoint Server site needs a CAL. When building an intranet portal, it is easy to count how many employees you have and to purchase a CAL for each one of them; but when you stand up http://www.company.com and make it available to the world, how many CALs do you need? There are roughly 1.8 billion people on the Internet, and potentially every one of them can come to your website. That’s a lot of CALs to buy. Luckily, this is where FIS comes into play. It allows unlimited non-employee access to your SharePoint Server. Non-employee is emphasized because this license does not cover any of your employees, a point that has caused a lot of confusion in the past. The proper FIS license can help you control the cost of your SharePoint Server deployment, but great care should be taken to use it properly.

Search Server 2010

For most users, one of the major downsides of SharePoint Foundation is its lack of a powerful search feature. Foundation can provide search results of the SharePoint content from within a given site collection, but that is all. It cannot pull search results from multiple site collections, and cannot add external content sources such as file shares or Exchange public folders.

In addition, many users have become very reliant on the use of a search engine in their daily lives. Bing and Google are typical first stops for most users as they explore the Internet for everything from buying a new car to figuring out why their in-laws give them a headache. For most of these users, it is only logical that they should be able to discover information at work in the same manner. SharePoint Server 2010 can provide this full-scale enterprise search presence, but not everyone can afford to deploy it. Other users look to some type of appliance to index and search the intranet, but these devices can be difficult to administer and even more cost prohibitive. Enter Search Server 2010 Express and Search Server 2010.

Search Server 2010 Express (SSX) is a free product from Microsoft that essentially takes SharePoint Foundation and adds intranet search capabilities to it. You gain the capability to add content sources such as file shares, other SharePoint sites, websites, and Exchange public folders. You also get the Search Center template and all of the associated Web Parts. To avoid stealing all of the thunder from Chapter 14, this simple feature list will have to suffice for now; but if any of that interests you, consider deploying SSX instead of Foundation, or if you have Foundation already deployed, you can simply upgrade Foundation to SSX.

The only real shortcoming of SSX is that it cannot be configured for high availability. While Search Server 2010 can be configured to avoid any single point of failure, including the Search components, SSX does not offer that capability. It can only be deployed to one server in the farm; there is no infrastructure for deploying it on multiple servers to create redundancy. If you need a high-availability solution, you need to move up to full Search Server. The only downside to the full version is that it is a for-purchase product.

Fast Search Server 2010

In early 2008, Microsoft purchased Fast Search and Transfer. Fast was considered the “best in breed” high-end search tool set. In the 2010 product line, it has been incorporated as an addition to the SharePoint Server platform, adding lots of new functionality, including the following examples:

  • Visual search and best bets
  • Extreme scale, with a billion documents possible
  • Enhanced multiple language capabilities
  • Better handling of unstructured data through metadata extraction
  • Better handling of structured data such as numbers, dates, etc.

Adding all of this to SharePoint’s already very powerful search engine is a huge win. There will be two different licenses for this feature: Fast Search Server 2010 for SharePoint will be used along with the Server Enterprise CAL in intranet deployments, whereas Fast Search Server 2010 for Internet Sites will be used in conjunction with FIS for public websites.

These two Fast products have been completely integrated into the SharePoint platform, providing an easy to use management interface and Windows PowerShell cmdlets. From a development perspective, they are also plugged into the object model (OM) the same way normal SharePoint Server Search is. Therefore, users don’t have to learn anything new in order to get results returned. Instead, they make the same calls to the Search OM, and Fast returns information in place of the normal search engine.

There is a third license worth mentioning even though it is outside the scope of this book — Fast Search Server 2010 for Internet Business. This license will not be used in conjunction with SharePoint, but instead for custom public websites. For example BestBuy.com uses Fast to help consumers find all those gadgets and gizmos they love to buy. Unlike SharePoint, which is usable as is, this product is more a set of tools that need to be assembled to provide an amazing search experience.

SharePoint Online

Another push for SharePoint from Microsoft will be SharePoint in the cloud, hosted by Microsoft. If you are looking to deploy SharePoint using this model, then you can probably stop reading the book at the end of this section because in SharePoint Online, the entire server infrastructure is hosted and maintained for you. This model removes the administrative overhead of SharePoint and lets the business focus just on using the power that is SharePoint. (While this might be great for the business, it does eliminate the need for a SharePoint administrator, so many of us will consider this license option the enemy.)

There are actually two models to consider with SharePoint Online: shared and dedicated. The shared model provides you with a slice of a shared farm and enables you to use SharePoint out of the box. Server-deployed code and customizations are not permitted. The dedicated model enables you to run your own farm, and you are allowed to make approved customizations to the server. Any change must be packaged in a solution package and validated by Microsoft before being deployed to the server. All licenses are bought per user.

This offering has also been expanded from previous versions to add some new licensing options. One of these is the concept of the “deskless worker.” These are users you can add at a lower price point, and they have mostly read-only access to SharePoint. There are also models available that can support hosting a partner collaboration site and public-facing Internet sites.

NOTE

This chapter covers SharePoint Online because it is an additional available SKU in this product cycle, but you have other notable options if you are looking at a hosted scenario. Companies such as RackSpace.com and Fpweb.net offer hosted SharePoint environments that you may find a little more flexible than SharePoint Online. Of course, hosting internally is still the best option, but it is good to know your enemies.

Other Servers

So you signed up to be a SharePoint administrator? Well, congratulations, you are also now responsible for a whole host of software. Since SharePoint isn’t an operating system (yet!), you need to have the right operating system in place in order to deploy SharePoint. Additionally, SharePoint stores 99% of the content and configuration in a database, so SQL Server has to enter the conversation sooner, rather than later. Also, most deployments want to take advantage of SharePoint’s ability to send notification e-mails, and some even take advantage of its ability to receive e-mails. Even though you may not directly be responsible for these products, they will affect your livelihood. Users don’t call to complain that SQL Server isn’t working; they call to complain that they cannot access SharePoint. It is your job to determine that it is because SQL Server is not responding. This section covers the ins and outs of these various pieces of this puzzle.

Windows Server

SharePoint is available only as 64-bit software, so it can only be installed on 64-bit servers running a 64-bit operating system. And don’t bother looking; there is no 32-bit “test” version hiding out there. The authors have looked under every rock on the Internet and inside Microsoft; like unicorns, it doesn’t exist.

For production deployments, you will be installing on either Windows Server 2008 SP2 and later or Windows Server 2008 R2 and later. The following editions of Windows Server are supported:

  • Standard
  • Enterprise
  • Datacenter

Noticeably absent to some is the Server Core installation of Windows. Unfortunately, Server Core does not allow all of the components that SharePoint requires to be installed, so SharePoint will not install on Core. The Web Edition is also not supported, which is probably a good thing: with its limited memory capacity, it would not perform very well.

Required Additional Software

After you have Windows installed, the server needs to be included as a member of an Active Directory (AD) domain. SharePoint does not support local machine accounts for any type of farm deployment, and the configuration wizard will error out if you try to use a local account.

Most administrators realize that something like IIS needs to be installed on the Windows Server in order for SharePoint to render web pages. They are often tempted to install this manually, which is safe to do but probably a waste of their time. The server also has roughly a dozen other prerequisite software packages that need to be installed, including the Web Server (IIS) Role. Thankfully, there is a SharePoint Products and Technologies Preparation Tool that will install and configure all of these for you when the time comes. That tool, and all of its intricate details, is covered in the next chapter.

But I already did “X” to my server!

If you already installed IIS, PowerShell, or one of the other prerequisites, don’t worry; all is not lost. The prerequisite installer tool will check up on you. If you did successfully install and configure one of the requirements, the tool will skip it and move on to the next one. In the case of IIS, if you enabled the role but didn’t configure it the way SharePoint needed, the prerequisite installer will just make the necessary changes. So keep that chin up; all is good.

Another common mistake to avoid is adding the server to a domain, or even promoting it to a domain controller (typically only done on a test virtual machine) after adding programs to the server. Programs such as IIS and SQL Server don’t always take too well to these changes. Make any computer name changes (which adding to a domain does) as soon after installing Windows as possible. Then you can safely continue with getting it ready for SharePoint.

Windows Vista and 7

In order to appease your friend the developer, Microsoft has introduced the capability to install SharePoint using a standalone install, for development purposes, on certain versions of Windows Vista x64 and Windows 7 x64. These editions are as follows:

  • Windows Vista SP1 and later.
    • Business edition
    • Enterprise edition
    • Ultimate edition
  • Windows 7 RTM and later.
    • Professional edition
    • Enterprise edition
    • Ultimate edition
  • The N and KN editions of the preceding software will also work.

Using a Windows Vista or Windows 7 installation for a production farm is absolutely not supported. These installations are intended only for developers who wish to do SharePoint development locally on their own machine. If development is done in these environments, then it is highly recommended that developers have a test environment to validate their solution before deploying to production. These types of deployments are a little more tedious in the initial configuration and are discussed more in the next chapter.

SQL Server

Get used to it: SQL Server just became your best friend. Because everything inside of SharePoint, including all of your content, lives inside a SQL Server database, as SQL goes so does your farm. For example, do you know what the most common performance bottleneck is in SharePoint? SQL Server. Therefore, in order for you to be good at your job, at a minimum you need to understand what is going on in SQL Server. Ideally, you will start sucking up to your resident database administrator (DBA) to ensure that your SharePoint databases are well cared for.

As with the Windows Server requirement, SharePoint also requires SQL Server to be 64-bit. 32-bit SQL Server is not supported. The 64-bit editions of SQL Server that are supported are SQL Server 2005, SQL Server 2008, and SQL Server 2008 R2. SQL Server 2005 requires Service Pack 3 plus cumulative update package 3 for SQL Server 2005 Service Pack 3 (KB967909). SQL Server 2008 requires Service Pack 1 plus cumulative update package 2 for SQL Server 2008 with Service Pack 1 (KB970315). SQL Server 2008 R2 will be supported at its RTM build or later.
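If you want to check an existing database server against these minimums, a small script can do the comparison for you. The following is only a sketch in Python. It assumes you have already obtained the product version string from SQL Server (for example, from SERVERPROPERTY('ProductVersion')), and the build numbers in the table are assumptions mapped from the service pack and cumulative update levels named above, so confirm them against the KB articles before relying on them. The function names are hypothetical, and the sketch does not check whether the instance is 64-bit.

```python
from typing import Tuple

# Minimum build per product line, keyed by (major, minor) version.
# ASSUMPTION: these build numbers correspond to the patch levels named in
# the text; verify them against the KB articles for your environment.
MINIMUM_BUILDS = {
    (9, 0): 4220,    # SQL Server 2005 SP3 + cumulative update 3 (KB967909)
    (10, 0): 2714,   # SQL Server 2008 SP1 + cumulative update 2 (KB970315)
    (10, 50): 1600,  # SQL Server 2008 R2 RTM
}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a string such as '10.50.1600.1' into (major, minor, build)."""
    major, minor, build = (int(part) for part in version.split(".")[:3])
    return major, minor, build


def meets_sharepoint_minimum(version: str) -> bool:
    """Return True if the version is a listed product line at or above its minimum build."""
    major, minor, build = parse_version(version)
    minimum = MINIMUM_BUILDS.get((major, minor))
    # Product lines that are not in the table are treated as unsupported.
    return minimum is not None and build >= minimum
```

With these assumed values, a server reporting 9.00.4035.00 (SQL Server 2005 SP3 without the cumulative update) would fail the check, which is exactly the kind of near miss that is easy to overlook by eye.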

E-mail Servers

SharePoint comes with a handy piece of functionality that enables it to send e-mails. This is often used to notify users that they have been granted access to a particular site. Users can also subscribe to an alert whereby they are notified when items are modified on a particular list or library. And with a little extra work, SharePoint workflows can be configured to e-mail users as necessary.

In order for SharePoint to send these e-mails, it needs to be configured with an outbound e-mail server. The SMTP server you point SharePoint at needs to allow anonymous relay from SharePoint. Unfortunately, SharePoint cannot be configured to provide authentication information when sending e-mails. In most environments, anonymous relay is not permitted, because for years evil spammers have used anonymous relays to avoid detection as they flood you with offers for low-cost medicines and opportunities to invest in dubious banks. In this case, you can ask the e-mail administrator to add the IP addresses of all SharePoint servers to the list of servers that are allowed to anonymously relay mail. If this is not acceptable, then your second option is to install the SMTP service on one of your SharePoint servers and then configure it as necessary. You will need to ensure that it can correctly send outbound e-mail and that it allows anonymous relay from all the SharePoint servers in the farm.

Another requirement for outgoing e-mail is that port 25, the default SMTP port, is not blocked between your servers. Such a blockage can happen at the firewall level or at the local server level. Some antivirus vendors configure their software to block port 25 outbound on all machines. This will stop SharePoint from sending e-mail, so be on the lookout.
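A quick way to rule out a blocked port is to try opening a TCP connection from the SharePoint server to the mail server. The following Python sketch does just that. The function name is hypothetical, and a successful connection only shows that the port is reachable; it says nothing about whether the server will actually relay your mail.

```python
import socket


def smtp_port_reachable(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run it from each SharePoint server in the farm, passing the name of your outbound SMTP server. If it returns False on one server but True on another, suspect local antivirus or firewall software on the failing machine rather than the network in between.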

Incoming E-mail

A lesser-known feature of SharePoint is its ability to receive incoming e-mail and then route that e-mail to the appropriate list or library based on the To: address. This enables scenarios such as having salespeople in the field e-mail in their expense report to a special e-mail address. That e-mail would be routed to the SharePoint server and then the attachment could be extracted and uploaded to the appropriate document library. From there, whatever business process needs to take place could be invoked. A simpler scenario might be setting up an e-mail address for a discussion forum. Then, any time you send an e-mail to that address, the e-mail becomes a discussion item in the list. Once in SharePoint, it is easily indexed so it can be discovered later; and because it is now a normal list item, the discussion can continue.

Configuring this functionality requires the help of the e-mail administrator, and it is worth noting that it does not require the use of Exchange. This is a multi-step, complex process that touches several pieces, but the core steps are as follows:

1. Install and configure one of your SharePoint servers to run the SMTP service. This server will then need to be set up to accept e-mail for the domain you define for SharePoint. Typically, it would be something like @sharepoint.company.com.

2. Configure your corporate e-mail server to route mail for the @sharepoint.company.com domain. The idea is that when your corporate e-mail server receives that e-mail, it just passes it over to the SharePoint server.

3. Go to SharePoint Central Administration and enable incoming e-mail. You will need to tell SharePoint that it is looking for e-mails in the @sharepoint.company.com domain.

4. Now someone with the manage list permission level can go into his or her list and associate an e-mail address in the @sharepoint.company.com domain with the list.

5. This associated e-mail address would now need to be configured as a valid contact on the e-mail server.

With this configured, e-mails can be sent to the address associated with the list. Your corporate e-mail server will relay that mail to the SMTP service running on the SharePoint server. The SMTP service will then take that e-mail and put it in a maildrop folder. The SharePoint timer service checks that folder once a minute by default, looking for e-mail. When it finds an e-mail, it routes it to the appropriate list or library based on the address.
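To make the last two steps of that flow concrete, here is a rough Python sketch of the idea: scan a drop folder for message files and route each one by its To: address. This is an illustration of the concept only, not how the SharePoint timer job is implemented. The function name, the .eml file extension, and the address-to-list mapping are all assumptions made for the example.

```python
from email import message_from_bytes
from email.utils import getaddresses
from pathlib import Path
from typing import Dict, List


def route_drop_folder(drop_folder: Path, address_to_list: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each target list name to the subjects of the messages routed to it."""
    routed: Dict[str, List[str]] = {}
    for eml_path in sorted(drop_folder.glob("*.eml")):
        message = message_from_bytes(eml_path.read_bytes())
        # A message can carry several To: recipients, so check each one.
        recipients = getaddresses(message.get_all("To", []))
        for _display_name, address in recipients:
            # Addresses are matched case-insensitively against the mapping.
            target = address_to_list.get(address.lower())
            if target is not None:
                routed.setdefault(target, []).append(message.get("Subject", ""))
    return routed
```

Given a mapping such as {"expenses@sharepoint.company.com": "Expense Reports"} (a hypothetical address and list name), any message addressed to that mailbox would be filed under Expense Reports, and messages for addresses with no associated list would simply be ignored.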

While that is a simple scenario, many configuration options are available. You can, for example, configure Exchange Server and Active Directory to allow users to create their own e-mail addresses. This is done through the creation of an additional organizational unit (OU) in your domain. This is a more complex scenario, but it eliminates the administrative burden of having to set up e-mail contacts each time a new list or library requires mail functionality.

You can find detailed configuration information, with multiple scenarios and troubleshooting steps, on TechNet (http://technet.microsoft.com/en-us/library/cc262947(office.14).aspx).

Text Message (SMS) Service Settings

That is right — SharePoint has become so cool that it can even send text messages. And since SharePoint still isn’t old enough to drive, you don’t even have to worry about it texting and driving. Once the service is configured, users can choose to have alerts sent to e-mail or text message or to both.

The service is pretty straightforward to set up from within Central Administration and can be scoped at either the farm or the web application level. You will need to provide the URL of an SMS sending service. If you don’t have one handy, you can click the link on the Mobile Account Settings page in Central Administration to find one based on your preferred wireless provider. Just watch out for this functionality: It can easily become a runaway cost.

Hardware Requirements

Build it and they will come. Underpower it and they will complain. (No user has ever complained that SharePoint is too fast.) Of course, with budgets being very tight, you will feel the pressure to keep hardware costs as low as possible. This tension between functionality and cost creates a fine line to walk.

Perhaps the easiest way to start thinking about hardware is to do a comparison of the minimum recommended requirements from MOSS 2007 and the minimums for SharePoint Server 2010 (see Table 3-1).

Table 3-1 MOSS 2007 versus Server 2010 Recommended Minimum Hardware Requirements

            MOSS 2007         Server 2010
Processor   2 core / 3 GHz    4 core / 2.5 GHz
RAM         2GB               8GB

Note that part of this discrepancy is that Microsoft has done a better job of setting the minimum bar this time. Despite these recommendations, it is not practical today to run a MOSS 2007 server with less than 4GB of RAM. Even taking that into account, it is safe to assume that SharePoint 2010 will require at least twice as much hardware as an existing SharePoint 2007 farm. This is assuming properly sized 2007 hardware today. Experience has shown that SharePoint farms tend to range from vastly undersized desktop-class machines running thousands of users, slowly, to supercomputer-class machines that on their best day use 20 percent of their resources to serve 100 users. So if you are going to make hardware assumptions at least in part based on your 2007 environment, make sure you understand how that hardware is utilized today. The next few pages describe the different server types and how the hardware considerations vary for each.
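If you have an inventory of existing servers, it can be handy to check each one against the Server 2010 column of Table 3-1. The following Python sketch shows one way to do that. The type and function names are hypothetical, and remember that these are recommended minimums, not a sizing recommendation for your workload.

```python
from typing import List, NamedTuple


class ServerSpec(NamedTuple):
    cores: int
    clock_ghz: float
    ram_gb: int


# Recommended minimums for SharePoint Server 2010, taken from Table 3-1.
SERVER_2010_MINIMUM = ServerSpec(cores=4, clock_ghz=2.5, ram_gb=8)


def shortfalls(server: ServerSpec, minimum: ServerSpec = SERVER_2010_MINIMUM) -> List[str]:
    """List every way in which a server falls below the given minimum."""
    problems = []
    if server.cores < minimum.cores:
        problems.append(f"cores: {server.cores} < {minimum.cores}")
    if server.clock_ghz < minimum.clock_ghz:
        problems.append(f"clock: {server.clock_ghz} GHz < {minimum.clock_ghz} GHz")
    if server.ram_gb < minimum.ram_gb:
        problems.append(f"RAM: {server.ram_gb}GB < {minimum.ram_gb}GB")
    return problems
```

A server built to the MOSS 2007 minimums, ServerSpec(cores=2, clock_ghz=3.0, ram_gb=2), comes up short on both cores and RAM even though its clock speed is higher, which matches the comparison in the table.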

Web Servers

Often referred to as web front-end (WFE) servers, these are the machines ultimately responsible for the rendering of the SharePoint pages. They typically do not have a high CPU load because they attempt to cache as much content as possible to avoid doing the same work over and over. To do caching properly, the server does consume quite a bit of RAM, so be sure to dedicate a substantial portion of your spending on this server to RAM.

A key consideration when determining how much memory you might need is the number of application pools you plan to have. In a nutshell, application pools are the various IIS processes that listen for incoming web traffic and then handle it accordingly. In Task Manager, you will see each application pool as w3wp.exe. For example, when you create a new SharePoint web application and choose a new application pool, you get a new instance of this process running. Now when you access SharePoint, this process is actually receiving your request and coordinating with SharePoint to render your page. When SharePoint is caching content in memory, it is being stored in RAM associated with this process.

Part of this consideration, though, is that every application pool has a certain amount of overhead associated with it, the process, and the memory it needs to do its job. Therefore, for each new application pool you create, your RAM requirements will increase, so plan accordingly.
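To see how quickly application pools add up, you can run the arithmetic in a few lines of Python. Treat the following strictly as a sketch: the function name is hypothetical, and the default figures for base memory, per-pool overhead, and per-pool cache are placeholder assumptions, not Microsoft guidance. Replace them with numbers you measure from the w3wp.exe processes in your own environment.

```python
def estimate_wfe_ram_gb(
    app_pools: int,
    base_gb: float = 4.0,               # ASSUMPTION: OS and SharePoint services
    overhead_per_pool_gb: float = 0.5,  # ASSUMPTION: fixed cost of each w3wp.exe
    cache_per_pool_gb: float = 1.0,     # ASSUMPTION: content cached per pool
) -> float:
    """Estimate WFE memory as a base amount plus a per-pool cost."""
    if app_pools < 1:
        raise ValueError("a web server needs at least one application pool")
    # Every pool pays both its fixed overhead and its own cache.
    return base_gb + app_pools * (overhead_per_pool_gb + cache_per_pool_gb)
```

Under these placeholder figures, moving from one application pool to four raises the estimate from 5.5GB to 10GB. The exact numbers matter less than the shape of the result: the cost grows with every pool you add, so consolidate pools where isolation is not truly needed.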

This role requires very little local storage, and that storage does not need to be optimized in any way. The only items stored on this machine are the SharePoint root, the local ULS and IIS logs, and possibly some disk-based BLOB caching. That said, don’t get carried away and create a C: drive as small as 10GB. SharePoint occasionally needs extra space for temporary files, maybe to unpack a solution or to deploy a service pack, so an 80GB or 100GB C: drive is reasonable for your WFE.

NOTE

SharePoint root refers to a folder structure: C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14. In SharePoint v3, the 12 folder was called the 12 Hive, so you may hear some people refer to the SharePoint root as the “14 Hive.” If you do, try not to make fun of them.

Application Servers

Application server is the generic name for servers that are responsible for providing resources for the various service applications. The tricky part of sizing these boxes is that each service application has a different usage profile, so the requirements will vary depending on what is running on the box and how heavily that functionality is being used. In addition, when building out an application tier, you should consider scaling out versus scaling up. In other words, is it better to have one large application server with a lot of resources but a single point of failure, or several smaller boxes running the same services that provide fault tolerance but require more administration? The following sections describe some of the key types of application servers and their individual considerations.

Query Server

A query server is the server responsible for responding to user search requests. When a user opens a SharePoint page, types into the search box, and presses Search, that request is routed by the WFE to the query server. The query server processes the request and then forwards the information back to the WFE for security trimming and rendering of the results.

This server uses CPU and memory to process the request and will try to cache as much of the index as possible within RAM. This role is also unique in that it requires local storage on the machine. The query file is kept on local disks and processed on the server. In environments with large indexes (one million plus items) and high search demand, it is best to optimize the storage of this file for fast retrieval. Conversely, in smaller environments it is not unusual to see the WFE and query server on the same machine.

The Search architecture for SharePoint 2010 is completely new compared to 2007 and is covered in full detail in Chapter 14.

Index Server

You will hear the index server also referred to as the crawl server. Unlike its predecessors, this version is stateless, which means it does not store any information locally. Therefore, your index server does not have any extra disk storage requirements. Typically, indexing of content is a processor-intensive task, so consider additional CPU capacity if you are in an environment with intensive indexing requirements. This tier is also covered extensively in Chapter 14.

Excel Services

Excel Services and the other service applications that are focused on Office client tasks and compatibility features are generally more CPU heavy. This is because they typically do not have any storage and are only being used to offload processing from the clients to the server. These features generally require the business units to work with their data differently, so they are often not in high demand, especially during early phases of a SharePoint rollout. Therefore, don’t overscale for this functionality until you confirm that business adoption will increase demand.

Usage and Health Data Collection

The Usage and Health Data Collection service application might be the most data-intensive piece of SharePoint. It enables the collection of all the diagnostic and usage data from your entire SharePoint farm into one database. This database can then be used for reporting, and is even flexible enough to accommodate custom reporting. Early results have shown that in large environments, this feature creates a very large SQL load, especially on the storage side. Therefore, in order to fully utilize it, you may want to consider putting it onto its own SQL server. Check out Chapter 15 for full details, but make sure that the amount of usage of this functionality factors into your farm planning.

SQL Servers

It turns out there are entire multi-book series on this one topic, and even if you had read all of them you still wouldn’t have a definitive answer about sizing your SQL Server. Therefore, as you approach sizing this particular box or boxes, don’t be afraid to ask for help. Also, keep in mind that over the years, the main bottleneck in most SharePoint farms has been SQL Server performance.

The key thing to remember is that all of the standard SQL Server hardware best practices are important. SQL Server loves memory and will utilize every bit it can get its hands on, so you should plan accordingly. 8GB of memory should be the absolute minimum you consider, and 16GB or even 32GB might be appropriate in a heavily used environment. CPUs require the same consideration; a quad-core processor might get you started, but boxes with multiple quad-core processors are more common.

Even if you buy enough CPU and RAM, you still are not out of the woods. Disk configuration has as much, if not more, to do with performance. You will need to plan for the number of spindles your SQL Server has access to and how they will be configured; and to do this properly, you need to consider the amount and shape of data you plan to store in SharePoint. You should be following the SQL best practices specifying that the data (*.mdf) and log (*.ldf) files are on different disks, and that the disks holding the log files are optimized for writes. When you are considering which databases to optimize first, the order is as follows:

1. Tempdb (a SQL System database)

2. Search databases

3. Content databases

While tempdb should clearly always be the first database optimized, your needs for the Search databases and content databases may vary based on your specific scenario. A typical content database will perform adequately on a basic RAID 5 volume. However, if you have created a collaboration content database that is excessively large (greater than 100GB), you may need to move it to optimized disks in order to minimize locking issues. The key here is to make sure that either you understand all of your SQL disk requirements before you purchase the box or you have access to a flexible solution, such as a SAN, for storing your databases.

NOTE

Check out the TechNet SQL Server TechCenter at http://technet.microsoft.com/en-us/sqlserver/default.aspx for guidance on planning and sizing a SQL Server deployment.

Finally, when it comes to SQL Server, SharePoint doesn’t really care how you set it up. As long as SQL Server is running a supported version and can serve databases back to SharePoint, it doesn’t matter whether SQL Server is dedicated to SharePoint or is shared with other applications in the company. Nor does SharePoint care whether SQL Server is clustered or doing database mirroring or even transparent encryption. SharePoint will simply call to a SQL Server instance for a database, and if it gets data back it is happy.

Mixing and Matching Servers

Now that you have an understanding of the different types of servers, you need to consider how those will be deployed onto physical hardware. As you combine them, you need to consider the hardware profile of each, and what the server will need to support the aggregate load.

One Server

This is a configuration you will typically see only for demonstration and evaluation purposes. In the example shown in Figure 3-1, all SharePoint server roles and SQL Server will be configured to run on one machine.

Two Servers

A two-server configuration is generally considered the minimum point of entry for a small SharePoint deployment. In this scenario (see Figure 3-2), all of the SharePoint services will run on one server, and SQL Server will run on a separate server.

Three Servers

By adding a second server with SharePoint installed, you create the possibility to reach a high-availability solution (see Figure 3-3). By putting some type of network load-balancing (NLB) device in front of SharePoint, you can ensure that the WFE services are fault tolerant. Make sure your NLB device is configured for persistent sessions. This is a SharePoint requirement and is covered in more detail later in the chapter. Then, by configuring the service applications to run on both machines, you can keep a single server crash from causing you to have a bad day.

Four or More Servers

This is where you start making choices. Figure 3-4 shows a scenario in which the environment has been optimized for performance and availability for the WFE and query roles, but the downside is that the application tier does not provide high availability. This is generally not a good idea, so you may want to skip straight to introducing a fifth server to the farm in order to bring high availability to the other service applications. You will not need another NLB device because service applications handle their own load balancing.

At this point, you probably get the idea that you can scale out any of the various service applications as necessary to meet your needs, which leads to our next topic: server groups.

NOTE

The Search service application architecture is the most complex and demanding of all the service applications. For many administrators, the majority of their farm architecture will be based on meeting the demand of this feature. Please see Chapter 14, which explains all of the components of the Search architecture in full detail, including additional farm topologies for optimizing the Search service application.

Server Groups

A server group refers to the logical concept of grouping similar SharePoint service applications together on the same physical hardware. This enables you to add servers, which means additional capacity, for each tier as demand increases. This also segregates the performance impact of the various service applications. Figure 3-5 shows an example.

This example isolates the web, search, business intelligence (BI), and all of the other service applications. Now, if business adoption of BI increases beyond the current capacity, it will not affect the performance or stability of the rest of the farm. It is also simple to purchase another server, install SharePoint onto the box, and then add it to the farm. Once it is a member, you would then add it to the BI group by configuring it to run only Excel Services and PerformancePoint Services. Note that you will not see the term “server group” in Central Administration anywhere; it is only a logical concept.

Notice also in Figure 3-5 that SQL Server has been exploded out into various logical groupings. The performance characteristics and demands of different databases can vary greatly, and in large environments it can be very helpful to configure and manage each one separately.

Other Hardware Notes

Now that you have gotten the hang of all this hardware, the following sections describe a few more considerations to think through before you move on.

The Network

Network connectivity between the servers in your farm is hugely important. The hard requirement is that every server in the farm be connected to the others by at least a gigabit connection with less than one millisecond of latency. This precludes most companies from having a SharePoint farm with servers in multiple, geographically dispersed data centers.

For many companies, the sheer volume of network traffic generated between the members of the farm is overwhelming. In order to better control this traffic, they move all of the intra-farm traffic to a dedicated virtual local area network (VLAN). Grouping the traffic this way, much like the server groups discussed earlier, makes it easier to monitor and troubleshoot when issues arise. A dedicated VLAN is not a requirement for SharePoint, but in a large farm it is often recommended.

Network Load Balancers

In order to achieve high availability of the SharePoint web applications, it is necessary to introduce a tool to do network load balancing. This can be either a hardware-based tool, such as an F5 device, or an external software solution, such as ISA or TMG Server, or even something as simple as the built-in Windows Network Load Balancing (NLB) feature.

Hardware-based solutions are generally best, as they offer the most configuration options and usually the best performance, but they are also typically very expensive. The software-based solutions such as TMG provide a happy middle ground, especially if you are already using them in your environment. They include just enough options and monitoring to make the cut. Although using the Windows NLB is free because Windows provides it out of the box, it is a very rudimentary feature. It cannot do tasks like validate whether the server is serving valid pages, and instead only confirms that the server responds to ping traffic. For mission-critical scenarios this is not an ideal solution.

Regardless of which option you choose, you must configure NLB properly. You are required to configure persistent sessions (also called sticky sessions or single affinity), meaning that when a user opens a browser and navigates to SharePoint, the entire session must run against one WFE. SharePoint caches a great deal on each WFE and does not share that cache across WFEs, so if users are constantly moving from server to server, they can see erratic results.
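To make the idea of single affinity concrete, here is a minimal sketch in Python. It is not how any particular load balancer is implemented; it only illustrates the principle that the same client always lands on the same WFE. The server names and the client address are hypothetical, and hashing the client IP is just one assumed way of achieving affinity.

```python
import hashlib

def pick_wfe(client_ip, wfes):
    """Map a client to one WFE deterministically (single affinity).

    Hashing the client address is an assumption made for illustration;
    real devices may use cookies or other session identifiers instead.
    """
    if not wfes:
        raise ValueError("at least one WFE is required")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(wfes)
    return wfes[index]

# Hypothetical farm with two web front ends.
wfes = ["WFE1", "WFE2"]

# Every request from the same client is routed to the same server.
first_request = pick_wfe("10.0.0.25", wfes)
second_request = pick_wfe("10.0.0.25", wfes)
print(first_request == second_request)  # True
```

Because the mapping depends only on the client address and the list of servers, a user stays on one WFE for the whole session as long as the pool of servers does not change.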

Server Drives

When you configure SharePoint, it is generally a good idea to get everything possible off of the C: drive. For example, navigate to Central Administration and change the diagnostic logs to be hosted on the E: drive instead of the C: drive. Remember that this is a farm-wide setting, so all servers in your farm now must have an E: drive or they will get errors and stop logging. This is inconvenient but not the end of the world. The end of the world happens if you try to add a server to this farm now that doesn’t have an E: drive. You will get file I/O errors running the configuration wizard and it can take a long time to figure out the cause. That is why it is recommended that all of the SharePoint servers in your farm have standard drive letters. A simple design choice like that can greatly reduce your headaches going forward.

Virtualization

Now that you have learned about hardware considerations, the question that logically follows is, “Which servers can I virtualize?” After looking at some of those server groups, it is very easy to see some opportunities.

Typically, when it comes to virtualization, it is recommended that you start at the top of the farm and work your way down. Web front ends have almost no disk requirements and generally consume RAM and some CPU, so they virtualize very well. How well application servers virtualize depends on what service applications they are hosting. For example, if you have a server group that is only hosting things like Office Web Apps and the Managed Metadata service, those servers would virtualize easily, again because they have almost no disk requirements.

Your query and index servers can be virtualized, but the benefit will depend on your performance expectations. Crawling 10 or 20 thousand items once a day is a pretty light load, and will virtualize without issue. However, crawling 100 million items on virtualized servers and still getting the performance you need would be extremely difficult.

The moral of the story when it comes to virtualizing SharePoint servers is that you should first understand how hard that server is working and where its first bottlenecks occur. If you can safely virtualize without reducing performance, then do so. Also, if you are building test and development environments where performance is not a critical factor, then virtualization is the way to go.

Should you virtualize SQL Server? That question generates almost as much passionate debate as Mac versus PC. Virtualizing SQL Server and achieving acceptable performance is possible; unfortunately, your average virtualization administrator, in partnership with your average SQL Server database administrator, cannot do it well. It is too complex a configuration for someone to stumble through. As a SharePoint administrator, you already know that SQL Server is going to be the factor holding your farm’s performance down; do you really want to gamble with virtualization, which is likely to further decrease that performance?

Terminology

One of the biggest challenges for SharePoint administrators new and old is the vocabulary. SharePoint is littered with words, such as “site,” that have about a dozen different meanings (no one is ever really sure what a site actually is, and many consider it suitable that site is a four-letter word). To this end, Figure 3-6 is here to save the day. Once you can speak to the entire hierarchy from top to bottom, your job is complete and you have practically conquered SharePoint, so study hard.

Starting from the top you see Farm = Configuration Database. This means that each SharePoint server can belong to only one farm. A farm can be a single server (refer to Figure 3-1) or something as complex as what is shown earlier in Figure 3-4. A farm refers to all of the servers that are using the same configuration database. When you run the configuration wizard, you choose to connect to an existing configuration database (join a farm) or create a new configuration database (create a farm). All servers in a farm, therefore, share everything, including the fact that there is only a single Central Administration site that controls all servers in the farm.

Below this, in the column on the right, you come to Services. These are the actual services on the server that run to provide functionality to the Service Applications. For example, the Excel Service Application you create in Central Administration from Manage Service Applications is a Service Application Connection point. That Service Application Connection point is the proxy to the instances of the Excel Service that are running on the server(s) in the farm. Don’t worry if that isn’t completely clear; Chapter 7 is dedicated to the inner workings of service applications.

Finally, at the bottom of the right-hand column you have Service Application Databases. Some service applications require database(s) to store information in order to work, while others do not. This is just one of the many reasons why using SharePoint 2010 means you will be getting to know SQL Server.

Web Applications, at the top of the left column, are the actual SharePoint websites you visit. Because they appear in IIS, you will hear people refer to them as sites, websites, IIS sites, and other creative things. However, it is very important that you refer to them as web applications or web apps for short, as everything in the SharePoint management interface and all of the documentation always refers to them as web applications. Examples are http://portal.company.com, http://www.company.com, https://extranet.company.com, or http://team.

Between Web Applications and Service Applications you see a double-headed arrow labeled Many to Many. This is your reminder that this is the only place on the hierarchy with a many-to-many relationship — in other words, one web application can consume multiple service applications, and service applications can also service multiple web applications. This is one of those seemingly infinite configuration options that make SharePoint so fun to architect.

Every time you create a new web application, SharePoint will automatically create a new Content Database for you and associate it with the web application. This will be the default location for storing content from the web application. It is possible for you to also create additional content databases to associate with the web application. This is done to help scale. Two unique web applications cannot share a content database.

That brings you to the most important concept in SharePoint: the Site Collection. Site collections are the unit of scale in SharePoint. The easiest way to think of a site collection is as a bag, because they are really just a boundary or container. They are not actually content users can touch. The reason why this “bag” is so important is because it determines a lot about how your information is stored.

Site collections are a storage boundary and are stored in one and only one content database. They cannot span multiple databases. When you create a site collection it is created in a database and that is where it will stay unless you manually move it. If, for example, you want to limit all of your content databases to 40GB because that is the largest size you are comfortable with, then you need to ensure that no site collection is larger than 40GB. Similarly, if you have multiple site collections (and everyone does), then you would need to apply quotas to those site collections to ensure that the sum of the site collections doesn’t exceed your 40GB database limit. For instance, if you had 10 site collections, then you would want to set your quotas to 4GB per site collection.
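The quota arithmetic above is simple enough to capture in a few lines. The following Python sketch divides a content database size limit evenly across its site collections; the function name is made up for illustration, and an even split is only one possible policy.

```python
def quota_per_site_collection_gb(db_limit_gb, site_collection_count):
    """Return the quota each site collection can have so that the
    sum of all quotas never exceeds the content database limit."""
    if site_collection_count <= 0:
        raise ValueError("there must be at least one site collection")
    return db_limit_gb / site_collection_count

# The example from the text: a 40GB database limit and 10 site collections.
quota = quota_per_site_collection_gb(40, 10)
print(quota)  # 4.0
```

If you later add an eleventh site collection to the same database, the same calculation tells you that the existing quotas need to shrink or the new site collection belongs in a different content database.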

Site collections are the only objects in SharePoint to which you can apply a storage quota. If you want to limit a user to storing only 10GB of content in a particular document library, there is no way to do that. You would have to set that entire site collection to a 10GB limit. If you have two document libraries and you want to give each one 10GB of storage, then you have to ensure that each document library is in its own site collection.

Even if you have no intention of holding users to limits, quotas are generally recommended for all site collections, as they serve as a checkpoint and keep you from having runaway site collections. If a user calls and says that he is getting warnings or errors because he has met his quota, it is a simple process for you to increase his quota, and it gives you a chance to ask, “So what are you doing with SharePoint that you need so much storage space?” It would be good to know if he is just backing up his MP3 collection to SharePoint.

Site collections also serve as an administrative boundary. Site collection administrators are a special group of users who have complete power over the site collection without necessarily having any access to other site collections. There is an entire menu on the Site Settings page of configuration options that only a site collection admin can make (see Chapter 8). If you have two groups, such as HR and Accounting, in the same site collection and one of them comes to you because they need to administer one of these special settings, you will have to do some rearranging. If you make Nicola from Accounting a site collection administrator, then she can fully administer the Accounting site as needed but she also has full control over the entire site collection, including the HR web. You need to instead move the Accounting web to its own site collection and then make Nicola an administrator there.

Site collections are also boundaries for out-of-the-box functionality such as navigation and the various galleries. This can be a drawback of many site collections. Out of the box, it is impossible to enforce consistent, self-maintaining navigation across site collections. The galleries such as the themes, Web Parts, lists, and solutions are all scoped at the site collection level. For example, if you need a list template to be available to multiple site collections, then you have to manually deploy it to each.

Site collections also serve as security boundaries. The All People list and the various SharePoint groups are all scoped at the site collection level and are not accessible for reuse outside of the site collection.

NOTE

Developers and Windows PowerShell refer to a site collection as an SPSite. So when you hear that word, equate it to site collection.

Inside of site collections you have one or more webs. A web is the object that is referred to throughout the user interface as a site. It can also be called a subsite or a subweb. Again, because the term site can be very confusing, whenever possible refer to these as webs. This is the first object users can actually touch. You can apply security to it, and it contains all of the user content. Each web has its own lists (libraries are just a special type of list) and all of those lists store items, which refers to the actual content, such as documents and contacts.

As you look at the hierarchy from Web Applications to Items, remember that it is a one-to-many relationship going down but a one-to-one relationship going up. That is, an item can belong to only one list, a list can belong to only one web, a web is part of only a single site collection, a site collection lives in only one content database, and a content database can be associated with only one web application.
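If it helps to see the hierarchy as a data structure, the following Python sketch models it as a simple tree. This is not the SharePoint object model; the class, the database name, and the site names are all hypothetical. The point is only that each object can have many children but exactly one parent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)
class Node:
    """One object in the hierarchy: many children, a single parent."""
    kind: str
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, kind, name):
        """Create a child; the child records this node as its only parent."""
        child = Node(kind, name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Walk upward: each step has exactly one possible parent."""
        node, parts = self, []
        while node is not None:
            parts.append(node.kind)
            node = node.parent
        return " > ".join(reversed(parts))

# Hypothetical names, following the hierarchy described in the text.
web_app = Node("Web Application", "http://portal.company.com")
content_db = web_app.add("Content Database", "ExampleContentDb")
site_collection = content_db.add("Site Collection", "/sites/hr")
web = site_collection.add("Web", "Benefits")
doc_list = web.add("List", "Documents")
item = doc_list.add("Item", "handbook.docx")

print(item.path())
```

Going down, you could call add as many times as you like on any node. Going up, path never has to make a choice, which is exactly the one-to-one relationship described above.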

Still a little fuzzy? Try this metaphor to understand how these pieces work together: Web applications are the landfill. Content databases are giant dumpsters. A site collection is a big, black 50-gallon garbage bag. And webs, lists, and items are pieces of trash. Your users spend all week creating garbage, continuously stuffing it in the garbage bags, with each piece of trash existing in only one garbage bag at a time. Each garbage bag can hold only 50 gallons of trash (quotas) before it is full, after which the user has to either ask for a new garbage bag or get a bigger garbage bag. That full garbage bag is placed in a dumpster, and it is not possible to put a garbage bag in more than one dumpster without destroying it. Dumpsters are serviced only by one landfill but that landfill can handle thousands of dumpsters without issue. How was that? Clear as mud?

Controlling Deployments

SharePoint 2010 ships with more than a handful of tools that will help you to keep it under control — from tools that block and/or discover rogue deployments to built-in throttling capabilities that will help to prevent lost data and oversized lists from destroying your farm.

Blocking Rogue Deployments

SharePoint, especially Foundation, is sneaking into more and more enterprises. Business units who don’t want to go through the proper channels have been caught standing up their own SharePoint servers in alarming numbers. That wouldn’t be so horrible, but these rogue servers often house business-critical data but have no backups and no redundancy. IT generally doesn’t find out about them until it is too late and someone has already lost critical data. To help prevent this SharePoint 2010 has implemented a new registry key:

HKLM\Software\policies\microsoft\SharePoint\14.0\blocksharepointinstall

If you set the DWORD blocksharepointinstall equal to 1, the installation of SharePoint is blocked. The key challenge is getting this registry key added to all of the machines in your environment in time, as it is not there by default. It will not affect servers that already have SharePoint installed. Also, you need to keep this key a secret between you and this page. If users know to look for it, they can remove it from the registry and then install SharePoint anyway. If you are considering using this key, it is probably easiest to create a group policy object that adds it to all the machines in your domain.

Registering SharePoint Servers in Active Directory

Rogue SharePoint servers have become an issue in many large enterprises, but sometimes blocking them as described in the previous section is considered too drastic. Wouldn’t it be great if you could keep track of every server in your Active Directory that someone installed SharePoint on, so you could find the culprits and smack them on the hand with a ruler? With a little AD work you can. When a SharePoint farm first comes online it will attempt to register itself through an Active Directory Service Connection Point, also referred to as an AD Marker. The challenge is this container is not in AD by default; you must create and configure it before SharePoint is deployed. If you do it after the fact, existing farms will not be registered.

To configure this you must be a domain administrator and have access to a domain controller. Then you will need to follow the steps documented here: http://blogs.msdn.com/opal/archive/2010/04/18/track-sharepoint-2010-installations-by-service-connection-point-ad-marker.aspx.

HTTP Throttling

A potential challenge SharePoint administrators have faced in the past and are certain to see again is lack of resources and the odd behaviors it produces. One scenario is an overworked WFE server. As a WFE is processing requests, it might reach a point where it is not immediately responding to a request due to a lack of resources. It will then begin to queue requests, but it has a limited capacity for storing requests also. If the queue fills up, then it will just start indiscriminately dropping requests until it catches up. While this is not a big deal for a typical GET request, what if you are a user who has just spent an hour taking a survey or filling out an application? If that PUT request is dropped, your hour was spent in vain and you will have no option but to start over.

To avoid this issue, Microsoft has introduced HTTP Throttling to protect a server during peak load. By default, this feature monitors the available memory in megabytes and the ASP.NET requests in queue. As it monitors these counters, it generates a health score for the server on a scale from 0 to 9, with 0 being the best. The monitor checks every five seconds by default. If the score is 9 for three consecutive tests, then the server will enter a throttled state. In this throttled state, SharePoint will return a 503 server busy message to all GET requests, including the crawler if you happen to be indexing. In addition, all timer jobs will be paused, which enables the server to concentrate on finishing existing requests and hopefully makes room for anyone doing a PUT request, like that user who just spent an hour filling out a form. The monitoring continues every five seconds, and throttling is disabled after one occurrence of a score below 9.
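The throttling rules just described amount to a small state machine. The Python sketch below simulates only the behavior stated in the text (three consecutive scores of 9 to enter the throttled state, one score below 9 to leave it, GET requests refused while throttled). It is not SharePoint code, and the class and method names are invented for illustration.

```python
class ThrottleMonitor:
    """Simulates the enter/exit rules for the throttled state."""

    def __init__(self, worst_score=9, checks_to_throttle=3):
        self.worst_score = worst_score
        self.checks_to_throttle = checks_to_throttle
        self.consecutive_worst = 0
        self.throttled = False

    def record(self, score):
        """Feed in one health score (one check every five seconds)."""
        if not 0 <= score <= 9:
            raise ValueError("health score must be between 0 and 9")
        if score >= self.worst_score:
            self.consecutive_worst += 1
            if self.consecutive_worst >= self.checks_to_throttle:
                self.throttled = True
        else:
            # A single score below 9 ends throttling.
            self.consecutive_worst = 0
            self.throttled = False
        return self.throttled

    def status_for(self, method):
        """While throttled, GET requests receive a 503 server busy."""
        if self.throttled and method.upper() == "GET":
            return 503
        return 200

monitor = ThrottleMonitor()
states = [monitor.record(score) for score in [3, 9, 9, 9, 9, 8]]
print(states)  # [False, False, False, True, True, False]
```

Notice that the server is not throttled until the third consecutive 9, and that it recovers as soon as a single check comes back with a better score.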

This feature can be configured using Central Administration, to be enabled or disabled per web application. Using Windows PowerShell, you can go a step further and view and edit the thresholds using the following cmdlets:

Get-SPWebApplicationHttpThrottlingMonitor

Set-SPWebApplicationHttpThrottlingMonitor

NOTE

You can introduce your own counters, but that requires object model code, a topic outside the scope of this book.

The health score is exposed to all HTTP requests. If you use a tool like Fiddler (www.fiddler2.com) that enables you to inspect your web traffic, you will see in the header under Miscellaneous the value X-SharePointHealthScore. The place this truly comes into play is with the Office clients.

The Office 2010 client programs are aware of the score and can use it to adjust their behavior. For example, if you are using the PowerPoint Broadcast feature (covered in Chapter 18), it knows to watch the health score and to adjust the frequency of its updates based on the score.
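To get a feel for how a client could react to the health score, consider the following Python sketch. It reads the X-SharePointHealthScore value from a set of response headers and stretches a polling interval as the score gets worse. The header name comes from the text; everything else, including the base interval and the scaling rule, is a hypothetical policy and not a description of what the Office clients actually do.

```python
def read_health_score(headers):
    """Return the health score from response headers, or None if absent.

    Header names are compared case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() == "x-sharepointhealthscore":
            return int(value)
    return None

def polling_interval_seconds(score, base_seconds=5):
    """Hypothetical back-off: wait longer as the score gets worse."""
    if score is None:
        return base_seconds
    return base_seconds * (1 + score)

# Example response headers such as you might see in a tool like Fiddler.
headers = {"Content-Type": "text/html", "X-SharePointHealthScore": "4"}
score = read_health_score(headers)
print(score)                            # 4
print(polling_interval_seconds(score))  # 25
```

A healthy server (score 0) keeps the client at its base interval, while a struggling server is polled far less often, which gives it room to recover.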

Large List Throttling

SharePoint 2010 will support lists up to 50 million items; so much for that horrible rumor that SharePoint only supports up to 2,000 items in a list. That rumor is a case of people not getting their facts straight. Previous versions of SharePoint did have a recommendation not to exceed 2,000 items in a list view because of the performance strain it placed on your farm. Think about what happened behind the scenes when a user tried to view 3,000 items in a list. First, the SQL Server had to run a query to return all 3,000 items at once. Next, that information had to be sent to the WFE server and added to the page. Finally, the user had to download the page with its 3,000 items and wait on Internet Explorer to render all of that content. It could literally take minutes to return the page. Sadly, until now there was nothing to stop users from doing this or even to monitor that activity. SharePoint 2010 vastly improves this scenario.

With SharePoint 2010, we have controls that we can configure to prevent these types of activities. Figure 3-7 shows the Resource Throttling screen in Central Administration. You can access this screen by navigating to Application Management > Manage web applications. Then select your web application, click the drop-down for General Settings, and select Resource Throttling. All default settings are shown.

The List View Threshold, which is set to 5000 by default, represents the maximum number of items a standard user can return in a view. As users approach the limit, they will see the screen shown in Figure 3-8, which tells them how many items they have and where the throttling limit is set.

The following relevant settings are available:

  • Object Model Override — This setting specifies whether a developer can override the throttling through the object model code to allow their code to run.
  • List View Threshold for Auditors and Administrators — This setting is used to grant special power users a larger threshold. You can set a user up as an auditor through the Manage web applications screen. You first add a Permission Policy and enable the Site Collection Auditor permission policy level. Then, using User Policy, also on the Manage web applications Ribbon, select the new permission level you created.
  • List View Lookup Threshold — This setting is used to control the number of lookups that can be specified.
  • Daily Time Window for Large Queries — This setting is also referred to as “happy hour.” It allows you to set a time of day when throttling is disabled and views are unrestricted.
  • List Unique Permissions Threshold — This setting limits the number of unique permissions a given list can have. This is a good idea, as you can run into performance problems if a list has too many unique permissions coupled with too many items. Security trimming is a great but expensive feature at times.

The remaining settings are not part of the list throttling feature.
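The way the List View Threshold, the auditor threshold, and the daily time window interact can be sketched in a few lines of Python. This is an illustration of the rules as described in this section, not SharePoint's implementation. The 5000 default comes from the text; the auditor threshold of 20000 and the example window times are assumptions chosen for the example.

```python
from datetime import time

def in_window(now, window):
    """True if 'now' falls inside the daily window (start, end).

    A window whose start is later than its end is treated as
    crossing midnight.
    """
    if window is None:
        return False
    start, end = window
    if start <= end:
        return start <= now < end
    return now >= start or now < end

def view_allowed(item_count, is_auditor, now,
                 user_threshold=5000,
                 auditor_threshold=20000,  # assumed value for illustration
                 window=None):
    """Decide whether a view returning item_count items is permitted."""
    if in_window(now, window):
        return True  # "happy hour": throttling is disabled
    limit = auditor_threshold if is_auditor else user_threshold
    return item_count <= limit

happy_hour = (time(22, 0), time(2, 0))  # hypothetical 10 p.m. to 2 a.m. window

print(view_allowed(6000, False, time(14, 0)))                     # False
print(view_allowed(6000, True, time(14, 0)))                      # True
print(view_allowed(6000, False, time(23, 0), window=happy_hour))  # True
```

The same 6,000-item view is blocked for a standard user in the middle of the day, allowed for an auditor, and allowed for everyone during the daily time window.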

When users exceed this limit, they will see a warning message in the browser stating “Displaying only the newest results below. To view all results, narrow your query by adding a filter.” This will show the last 1,000 modified items.

Summary

In this chapter you reviewed the plethora of SharePoint 2010 SKUs that are available and how each one may be applicable to your situation, except for that cloud business. With that knowledge, key considerations of the other infrastructure pieces in the farm were discussed. Don’t ever overlook these boxes, as they are the key to your success. Remember: No one calls to say your Windows box isn’t working; they only call to complain SharePoint is broken.

In the section on terminology, you learned a bit about SharePoint’s vocabulary, including how evil the word “site” is and why you should avoid it like the plague. Finally, you were introduced to SharePoint’s out-of-the-box tools, which can help you manage its sometimes overwhelming collaboration and content management features.
