Chapter 14: Configuring and Managing Enterprise Search

What’s In This Chapter?

SharePoint Foundation Search
SharePoint Server and Search Server
FAST Search
All the bells and whistles of Search

Who doesn’t need Search these days? In the early days of the Internet, search engines were either unknown or considered weird things that the geeks (like the authors of this book) would use to find those cool nuggets of information on the Internet. You know the ones, like how to build your own BBS using 286s or what was the code for invincibility in Doom. Fast forward to today and now everyone and their dog uses search to explore the Internet. And it works great for finding anything and everything you can imagine. It even works just as well to find those Doom codes. Unfortunately, Internet search engines cannot reach inside your corporate network. And even if you buy one of those “Internet search devices” and put it inside your network, they generally don’t do a very good job of indexing (cataloging) your corporate data. That’s because the online search engines are optimized for following the billions and billions of links on the Internet and determining relevancy that way. Your intranet doesn’t have the same type of linking, so the results fall short.

The challenge is that your users want to search; they even expect it. They don’t want to dig through a file share to find their spreadsheet. Think about it; what is one of the most popular features in Outlook 2007 and 2010? The instant search capabilities of e-mail. Once again, people don’t want to categorize and organize e-mail; they want to have a big pile that search can deal with. If only there were a way to effectively provide search results from the intranet.

Enter SharePoint 2010 in all of its Search glory (hear the trumpets?). For a lot of companies evaluating SharePoint, Search is one of the most compelling reasons to invest in the platform. SharePoint Search not only does a fabulous job of indexing your SharePoint installation, but can reach out and index the rest of your enterprise. File shares, Exchange public folders, other web sites, line of business data, even Lotus Notes databases can all be added to your SharePoint index by the index servers. Servers? Yes, that’s right; SharePoint 2010 has made some major architectural changes, including supporting multiple index partitions and even using multiple index servers to populate them. All of these changes extend SharePoint Server Search’s upper limit to 100 million items. And if you want to go beyond that you should look at one of the new SKUs introduced, FAST Search. With FAST, having 1 billion items in the index becomes a real possibility.

A fancy new architecture and some new SKUs weren’t enough for the Search team. They have also invested quite heavily in updates to the user experience. Things like wildcard searches, support for Boolean operators, and a refinement panel are all now available. You will also see a new Search-specific master page that provides more screen real estate, and AJAX support for rendering Web Parts (and extensible Web Parts this time around).

When you first begin to administer Search, it will seem very familiar, as the Search administration pages that were introduced with the MOSS 2007 infrastructure update have been retained in SharePoint 2010. But don’t overlook this chapter, as there is gold in them thar hills. Several small nuggets have been added to give the administrator more control. Features like setting content priority, search of case-sensitive locations, and a whole list of new reporting options will excite even the most jaded administrator. Also, watch for some changes: protocol handlers are out, BCS connectors are in, and claims are being introduced to the farm. This chapter will help you unlock all of the power of this favorite feature of both administrators and users.

The Different Versions of Search

In Chapter 3 you learned that there are a lot of different SKUs for SharePoint. This chapter looks at the way Search functions in three key product sets:

SharePoint Foundation Search
SharePoint Server and SharePoint Search Server
FAST Search Server 2010 for SharePoint and FAST Search Server 2010 for Internet Sites

FAST Search Server 2010 for Internet Business is not covered in this book. While it shares a similar name, it is a standalone product that has nothing to do with SharePoint.

SharePoint Foundation Search

SharePoint Foundation continues the proud heritage of Windows SharePoint Services as an environment that is easy to deploy, configure, and use. Once deployed, it gives users convenient access to the key features they need to start collaborating. Search is no exception to this. Foundation Search will index all of your SharePoint content and provide basic search results with very little effort. And once you enable the indexing, the administration and UI have no settings to configure — they just work. But don’t gloss over this section; there’s some important info here to help you get the most out of Foundation Search.

Setting Up Foundation Search

Foundation Search does need a little nudge from you, the SharePoint administrator, before it goes into auto-pilot mode for the next couple of years. You need to start the service and ensure that all of your content databases know to use the service you start. Follow these instructions to get things going:

1. Open Central Administration.

2. Under System Settings, click Manage services on server.

3. At the bottom of the list click Start (to the right of SharePoint Foundation Search).

4. Choose the proper service account; if you are doing a least privileged install, then you should create a new managed account for Search.

5. For Content Access Account, enter a username and password as shown in Figure 14-1. Keep in mind:

This should be a unique account dedicated only to Search.
Always enter accounts in the form domain\username.

Figure 14-1

6. Make any changes to the Search Database screen that you need. For most options, the defaults will work well. Figure 14-2 shows an example.

Figure 14-2

7. If applicable, specify a Failover Server.

8. For Indexing Schedule, you need to choose the proper schedule for your needs. You will need to balance the need for more frequent index updates with the performance capacity of your server(s). If you are unsure, start with the default of once an hour and then adjust from there based on user feedback and system performance. Figure 14-3 shows a default schedule.

Figure 14-3

9. Scroll to the bottom of the page and click Start.

After a minute or two of processing you will be taken back to the Services on Server page and the status for SharePoint Foundation Search will be started. (You may see the name of the service updated to SharePoint Foundation Help Search.) Now, with Foundation Search running, you need to ensure that all of your content databases are assigned an indexer. The indexer will be the server on which you just started the Search service.

1. From Central Administration, go to Application Management.

2. Under the Databases section, click Manage content databases.

3. Click the name of your content database.

4. Scroll down the page to Search Server and select your index server from the drop-down.

5. Click OK. If you have multiple content databases you will need to repeat these steps for each of them.

With the Search Server value updated, the next time the indexer runs based on the schedule you defined earlier, the selected content database will be indexed. If, like your impatient authors, you do not want to wait, you are in luck. You can use the stsadm.exe command shown here to start a full crawl immediately:

stsadm.exe -o spsearch -action FullCrawlStart

With Foundation it is possible to have multiple index servers by simply starting the service on multiple servers. If you do this, you can then distribute the load by content database. And as long as you have your site collections spread across multiple databases, you have a solution that has distributed the load of indexing. Keep in mind, though, that this does not make your indexing highly available. If one index server goes down, then the other server will not pick up and respond to requests in its place. You would first have to set all of the content databases to be indexed by the surviving index server and then run another crawl.

Remember that you have to be in the SharePoint Root folder, in the subfolder that contains stsadm.exe, to run the command. Or you can take the easier way and open the SharePoint 2010 Management Shell, since stsadm.exe is already in the path.

The message “Operation completed successfully.” will return quickly. This does not mean the index is done, just that it is started. If you want to know when it is complete, you can monitor Event Viewer on the server. You will see an Event 85: “A master merge has completed for catalog Search.” Now you have search results.

Search Results

Take a look at Figure 14-4 for an example of Foundation Search results.

Figure 14-4

Here you can see the simplicity at work again. No tabs to navigate search result pages, no refinement panel to drill down based on metadata, no federated results, no advanced search options. Just good old-fashioned SharePoint content search results. To understand exactly where those results come from, Figure 14-5 shows an example hierarchy.

Figure 14-5

If you run a search for Accounting.doc from each of the locations shown in Figure 14-5, you will see the following results:

Site Collection 1 — You will get results from all four sites: Web 1 a, Web 2, Web 3, and Web 4.
Web 1 — You see the results only from Web 1 a.
Web 1 a — You see the results only from Web 1 a.
Web 2 — You see the results only from Web 2.
Site Collection 2 — You see the results only for Web 3.
Web 3 — You see the results only from Web 3.
Site Collection 3 — You see the results only for Web 4.
Web 4 — You see the results only from Web 4.

From the root web at the root site collection, Search will return results from the entire web application. This is the only location that will return all results. Searching from any of the other locations will return results only from that spot in the site collection and down the tree, as shown when you are on Web 1 and you see results for Web 1 a but not Web 2. Web 1 and Web 2 are in the same site collection but they are in different branches.

Why does that root site collection get these magical powers? In fact, it isn’t magic, just fun with query strings. Go to the root site collection and search for test. Then look at the URL http://docs/_layouts/searchresults.aspx?k=test&u=http%3A%2F%2Fdocs. If you break that down starting from the ?, you can see there are two parameters. The k=test parameter means “do a search for the keyword test.” The & means there is another query string parameter coming. The u=http%3A%2F%2Fdocs parameter translates to “return search results for URLs that start with http://docs.” Because every web in the web application starts with http://docs, you got results from the whole web application.
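If you would rather let a script do the decoding, here is a minimal Python sketch that takes the results URL from the example apart. It uses only the standard library; nothing here is part of SharePoint itself, and the variable names are just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The results URL from the example above.
url = "http://docs/_layouts/searchresults.aspx?k=test&u=http%3A%2F%2Fdocs"

# parse_qs splits the query string on "&" and percent-decodes each value.
params = parse_qs(urlsplit(url).query)

keyword = params["k"][0]  # the search keyword
scope = params["u"][0]    # the URL prefix that scopes the results

print(keyword)  # test
print(scope)    # http://docs
```

The %3A and %2F sequences are simply the encoded forms of the colon and the forward slash, which is why the decoded scope reads http://docs.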

Just for fun, what happens if you manually remove the u=http%3A%2F%2Fdocs from the URL and press Enter? You get results from the entire farm. Very interesting; maybe you should add that to your secret SharePoint hacker notes.
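One way to picture the effect of the u parameter is as a simple "starts with" filter on each item's URL. The toy model below is only a sketch under that assumption: the site URLs are hypothetical stand-ins for the webs in the example hierarchy, it models a single web application, and the real query engine certainly does more than filter a dictionary.

```python
# Hypothetical URLs for the webs in the example hierarchy; the real
# addresses depend on how your site collections are laid out.
DOCUMENTS = {
    "Web 1 a": "http://docs/web1/web1a/Shared Documents/Accounting.doc",
    "Web 2": "http://docs/web2/Shared Documents/Accounting.doc",
    "Web 3": "http://docs/sites/sc2/Shared Documents/Accounting.doc",
    "Web 4": "http://docs/sites/sc3/Shared Documents/Accounting.doc",
}


def scoped_results(scope_url=None):
    """Return the webs whose document URL starts with scope_url.

    Passing None models removing the u parameter altogether.
    """
    return sorted(
        web for web, url in DOCUMENTS.items()
        if scope_url is None or url.startswith(scope_url)
    )


print(scoped_results("http://docs"))       # root web: every web matches
print(scoped_results("http://docs/web1"))  # Web 1: only Web 1 a
print(scoped_results("http://docs/web2"))  # Web 2: only Web 2
```

Searching from Web 1 never reaches Web 2 because http://docs/web2 does not start with http://docs/web1, even though both webs live in the same site collection.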

Remember that the search results you see, even if you are manipulating the query strings, will only be items to which you have permissions.

Security Trimming

SharePoint will only show you search results for content you have access to, and for which SharePoint understands the security. This is called security trimming. For example, when you index your Windows file share, SharePoint can match your AD permissions on the share to the AD account you are logged into SharePoint with and trim your search results. But if you set up SharePoint to index an external source, maybe using a cookie or a secret anonymous back door, SharePoint doesn’t understand these permissions. It will then show you all of the results for that source. If you need to have security trimming for these external repositories, you should look into developing a custom security trimmer. That is another one of those “developer” topics that is outside the scope of this book.
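To make the idea concrete, here is a small Python sketch of security trimming as a filter over a result list. It is a conceptual model only: the Result class, the account names, and the file titles are all invented for illustration, and SharePoint's actual trimming works against real access control lists rather than Python sets.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Result:
    title: str
    # None models a source whose permissions SharePoint could not interpret.
    allowed_users: Optional[FrozenSet[str]] = None


def security_trim(results: List[Result], user: str) -> List[str]:
    """Keep results the user may read; untrimmable sources pass through."""
    return [
        r.title for r in results
        if r.allowed_users is None or user in r.allowed_users
    ]


results = [
    Result("Budget.xlsx", frozenset({"CONTOSO\\alice"})),
    Result("Handbook.doc", frozenset({"CONTOSO\\alice", "CONTOSO\\bob"})),
    Result("External page"),  # crawled without permissions to match
]

print(security_trim(results, "CONTOSO\\bob"))
# ['Handbook.doc', 'External page']
```

Notice that the external item is returned to everyone. That is exactly the gap a custom security trimmer is meant to close.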

User Interface Features

A couple of user features are worth noting. Wildcard and Boolean searches work with Foundation and are covered in greater detail later in the chapter in the “SharePoint Server and Search Server” section. This means you can do searches like share* or something fancy like “Human Resources” AND policy. The former does a search for anything that begins with share. The latter will search for anything that has the words Human and Resources together and has the word policy in it.
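If the two query styles are new to you, the following Python sketch shows roughly what each one asks for. It is a toy matcher written for this explanation, not how the Search query engine is implemented, and the function names are made up.

```python
import re


def words(text):
    """Lowercase a text and split it into words."""
    return re.findall(r"\w+", text.lower())


def matches_prefix(text, prefix):
    """Model share*: true if any word begins with the prefix."""
    return any(w.startswith(prefix.lower()) for w in words(text))


def matches_phrase(text, phrase):
    """Model "Human Resources": the words must appear together, in order."""
    haystack, needle = words(text), words(phrase)
    return any(
        haystack[i:i + len(needle)] == needle
        for i in range(len(haystack) - len(needle) + 1)
    )


def matches_phrase_and_term(text, phrase, term):
    """Model "Human Resources" AND policy: both conditions must hold."""
    return matches_phrase(text, phrase) and term.lower() in words(text)


print(matches_prefix("SharePoint rollout plan", "share"))  # True
print(matches_phrase_and_term(
    "Human Resources vacation policy", "Human Resources", "policy"))  # True
```

A document that mentions human and resources in different places, such as "Resources for human policy makers", fails the second query because the phrase never appears as a unit.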

Foundation will automatically create contextual scopes for you. A contextual scope can help you narrow down your search results. It enables you to do a search of This Site or This List. To access the contextual scope for This List, navigate to the list and then do your search from the Search box at the top of the page. It will default to searching your current list. You can then click the drop-down menu to select This Site. Interestingly enough, if you look at the URL when you search This List, you will see the same u=<your URL location>, once again opening the door for some search results manipulation if necessary.

Site Search Administration

If you look for Search settings from the Site Settings page, you will not find much. The only search-related option is Search and offline availability. This setting allows you to control whether the current web is included in search results and how to handle any ASPX pages you may be using.

That’s it; you are done with your tour of Foundation site search administration. Clearly, there are a lot of positives here; but keep reading. The next section covers SharePoint Server Search and Search Server. As you drool over those features, don’t forget that the Express version of Search Server is free, and you can bolt it right on top of Foundation with ease. Wow — a free solution and a more awesome Search.

SharePoint Server and Search Server

This section covers the following products:

SharePoint Server 2010 Standard
SharePoint Server 2010 Enterprise
SharePoint Server 2010 for Internet Sites Standard
SharePoint Server 2010 for Internet Sites Enterprise
Search Server 2010 Express
Search Server 2010

This is the money section of the chapter. Most readers probably have one of the aforementioned products or are bugging their bosses to get one. Foundation Search is great for getting started, but it lacks the level of control you may be hoping for. FAST Search is amazing, but its price tag can be a tough hurdle to overcome in smaller environments — so that leaves you here, in a very nice and comfortable place.

Search Server versus SharePoint Server

A very common question that first pops up in this conversation is “If I have SharePoint Server what do I get by adding Search Server?” The answer is simple: nothing at all. Search Server is only a subset of the functionality available in SharePoint Server and cannot be installed on an existing SharePoint Server installation.

An example of a key difference is that SharePoint Server can index Active Directory information about your users after you configure and do a profile import, which is covered in Chapter 17. While Search Server can index SharePoint sites, it does not have a mechanism for doing the profile import from Active Directory, so it is unable to index user information. We will note similar limitations on Search Server throughout the chapter; otherwise, assume Search Server can perform the covered feature.

The follow-up question is “What is the difference between Search Server and Search Server Express (SSX)?” Again the answer is simple: scale. SSX can only be deployed on one server in the farm. You cannot add more servers to make Search highly available. Search Server can be scaled in the same fashion as SharePoint Server, providing high availability for search and the capability to scale to somewhere in the ballpark of 100 million items. Yikes! Of course, that power comes at a price. Express is free, whereas regular Search Server is not.

Configuration and Scale

In Chapter 3 you took a good look at farm topologies and scale points. Noticeably absent from that chapter was a detailed discussion of Search. That wasn’t author laziness; the Search team at Microsoft chose to build their own tools for configuration of their service application. To access this tool, go into Central Administration ⇒ Manage service applications and click on your Search service application. At the bottom of the administration window you will see the screen shown in Figure 14-6.

Figure 14-6

Here you can view and modify all of the wonderful Search components. You want scale and high availability? Well, here it comes by the truckload. As indicated in the figure, there are four sections in the Search Application Topology: Admin, Crawl, Index Partition, and Databases. The first three are each addressed in the following sections. The various databases are associated with the various other components so they are discussed throughout as relevant.

Admin

In the Admin section of this screen you will find the Administration component. This is the boss of Search. It tells all of the other components and servers what to do by managing the topology. This component cannot be made redundant but that is okay; if this server is offline, then the rest of the servers will continue serving their role. No changes to the Search topology can be made while this server is offline. This server is responsible for such items as starting crawls, reassigning crawl tasks if it finds a crawler unavailable, and similar tasks.

To store all of this information, this component uses the administration database. This database has all of the search configuration information, so when you learn how to create a new crawl rule, this is where you will find it.

A final note about the Admin component: It cannot be readily moved to a different server, so it will live forever on whatever server you first provision it on. This might affect your planning if you are very particular about what is hosted on which server.

Crawl

You might think of the Crawl component as your indexer. This is the piece that will connect to your content, bring it down to the server, generate the index, and extract the necessary metadata. Notice I did not say the crawl component is your index server. This is because one crawl server can host multiple crawl components.

The big change from MOSS 2007 is that the crawler does not store a copy of your index. Instead, the crawler is stateless. It simply marks the content as crawled in the crawl database and then pushes the changes for the index off to the appropriate query server. Additionally, it will take all your search property information and push it off to the property store database.

The Crawl component keeps track of what it needs to crawl and what has been crawled in the crawl database, along with the crawl schedule and other details necessary for crawl operations. And the exciting part: You can have multiple crawlers assigned to the same crawl database. For you MOSS 2007 fans, this means no more relying on only one index server to build your index; now the sky is the limit regarding how much hardware you can throw at creating the index. Another benefit of the crawler having a dedicated database is it does not add load to the property database while crawling.
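The division of labor described above can be sketched in a few lines of Python. This is a conceptual model only: the class names, the dictionaries standing in for the query server's index and the property database, and the placeholder "fetch" are all assumptions made for illustration.

```python
from collections import deque


class CrawlDatabase:
    """Tracks what still needs to be crawled and what has been crawled."""

    def __init__(self, urls):
        self.pending = deque(urls)
        self.crawled = []


class CrawlComponent:
    """A stateless crawler: it holds references, never its own index copy."""

    def __init__(self, name, crawl_db, query_server_index, property_db):
        self.name = name
        self.crawl_db = crawl_db
        self.query_server_index = query_server_index
        self.property_db = property_db

    def crawl_next(self):
        """Crawl one pending item; return False when nothing is left."""
        if not self.crawl_db.pending:
            return False
        url = self.crawl_db.pending.popleft()
        text = "placeholder content for " + url  # stand-in for a real fetch
        self.query_server_index[url] = text.split()        # index goes to the query server
        self.property_db[url] = {"crawled_by": self.name}  # metadata goes to the property store
        self.crawl_db.crawled.append(url)                  # crawl database records the visit
        return True


crawl_db = CrawlDatabase(["http://portal/a", "http://portal/b", "http://portal/c"])
index, properties = {}, {}
crawlers = [
    CrawlComponent("Crawler 1", crawl_db, index, properties),
    CrawlComponent("Crawler 2", crawl_db, index, properties),
]

# Both crawlers draw work from the same crawl database until it is empty.
while any([c.crawl_next() for c in crawlers]):
    pass

print(crawl_db.crawled)
```

Because neither crawler keeps the index, you can add or remove crawlers against the same crawl database without losing anything that has already been indexed.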

By default, if you have more than one crawl database associated with a service application, the load is spread between the databases by host name. Using host distribution rules, it’s possible to specify that a certain host (think content source like http://portal or \\server\share) is specifically tied to a crawl database. And because you assign Crawl components to specific crawl databases, you can now ensure that you have your most powerful crawlers working on that database. You may even choose to have that crawl database on a dedicated SQL Server.
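The routing decision can be pictured with a short Python sketch. Treat it as a model under stated assumptions: the database names are invented, and the CRC32-based spread is only a stand-in for whatever balancing Search really applies to hosts that have no rule.

```python
import zlib


def pick_crawl_database(host, shared_dbs, rules):
    """Return the crawl database name for a host.

    rules maps a lowercase host name to a dedicated crawl database.
    Hosts without a rule are spread over shared_dbs; the CRC32 spread
    is an assumption, used only so the same host always lands in the
    same database.
    """
    key = host.lower()
    if key in rules:
        return rules[key]
    return shared_dbs[zlib.crc32(key.encode("utf-8")) % len(shared_dbs)]


shared = ["CrawlDB1", "CrawlDB2"]            # hypothetical shared databases
rules = {"fileserver": "FileShareCrawlDB"}   # hypothetical dedicated database

print(pick_crawl_database("FileServer", shared, rules))  # FileShareCrawlDB
print(pick_crawl_database("portal", shared, rules))      # one of the shared databases
```

The point to take away is the order of evaluation: an explicit rule wins, and only hosts without a rule are balanced automatically.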

If you have multiple databases and you want to find out what hosts are in what database, you can do that in the crawl log. Details about this cool capability follow later in the chapter.

Index Partition

You just learned about crawlers, and how they create an index but don’t store the actual index. The storage is actually done by the Query component. The Query component is responsible for responding to search queries. When a user on a SharePoint site types “Cow” in the search box and hits Search, the web server hands that off to the Query Component server, more often than not just called the query server. The query server then digs through the index and property database to come up with a list of items for the search. Security trimming then takes place, and finally the web server renders those results back to the user.

If you want to add scale, you can actually divide the index into multiple partitions, or pieces (as described later in this chapter). That way, you can assign each partition to a query server. For example, if you have one million items in your index prior to partitioning, it might take one second to find your search results. If you divide that into two partitions and put each partition on its own query server, your index still has one million items in it but each query server has only 500,000 items in its partition to look through. Now your query results can be aggregated and returned to your browser in .5 seconds. That is how you scale the query servers for faster results.
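The arithmetic in that example is easy to reproduce. The sketch below assumes, as the example does, that query time grows in direct proportion to the number of items a query server has to look through; real-world results will depend on hardware, query complexity, and aggregation overhead.

```python
def items_per_partition(total_items, partitions):
    """Items each query server searches when the index is split evenly."""
    return total_items / partitions


def estimated_query_seconds(total_items, partitions, seconds_per_million=1.0):
    """Estimate query time, assuming it grows linearly with partition size."""
    per_partition = items_per_partition(total_items, partitions)
    return per_partition / 1_000_000 * seconds_per_million


print(items_per_partition(1_000_000, 2))      # 500000.0
print(estimated_query_seconds(1_000_000, 1))  # 1.0
print(estimated_query_seconds(1_000_000, 2))  # 0.5
```

The seconds_per_million value of one second comes straight from the example; measure your own environment before using numbers like these for capacity planning.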

An important threshold for an index partition is 10 million items, the maximum number supported in a partition. Also, remember that each time you want to introduce a new partition you need to introduce a new query server. Very little is gained, and more than likely you actually will decrease performance, if you have only one query server and you try to break your index up into two partitions with both living on the same query server. Unlike the crawl databases that are divided up by hosts, the index partitions try to maintain a very close balance. So each item is sent to an index partition based on a hash of its document id. This method provides better scale with query partitions.
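Here is a hedged Python sketch of both ideas: how many partitions the 10 million item threshold implies, and how hashing a document id keeps partitions balanced. The choice of MD5 and the modulo step are assumptions for illustration; the text only says that a hash of the document id is used, not which one.

```python
import hashlib
import math
from collections import Counter

MAX_ITEMS_PER_PARTITION = 10_000_000  # threshold quoted in the text


def partitions_needed(total_items):
    """Smallest partition count that keeps each partition under the limit."""
    return max(1, math.ceil(total_items / MAX_ITEMS_PER_PARTITION))


def partition_for(document_id, partition_count):
    """Pick a partition from a hash of the document id (MD5 is an assumption)."""
    digest = hashlib.md5(str(document_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


print(partitions_needed(25_000_000))  # 3

# Spread 10,000 sample ids over two partitions and count each side.
counts = Counter(partition_for(doc_id, 2) for doc_id in range(10_000))
print(counts)
```

Because the hash scatters ids without regard to where the content lives, the two counts come out close to each other, which is the balance that host-based distribution cannot promise.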

Now you have two query servers but each one has half the index (its own partition). Next you need to configure redundancy. Partitions can also have mirrors. The mirror partition can be configured to respond to queries only if the primary partition is unavailable, or it can be a fully functional mirror that responds to queries. The balancing of query traffic is handled by the Search Admin component and is automatic. Typically, your index partition will be served by only one Query component, and configured with a failover mirror.
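The difference between a failover-only mirror and a fully functional one can be modeled in a few lines of Python. This is a sketch of the behavior described above, not of the Search Admin component's actual logic, and the class and server names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class QueryComponent:
    name: str
    server: str
    failover_only: bool = False
    online: bool = True


def components_for_queries(components):
    """Return the components that should answer queries for one partition.

    Active (non-failover) components are used while any is online;
    failover-only mirrors step in only when none is.
    """
    online = [c for c in components if c.online]
    active = [c for c in online if not c.failover_only]
    return active if active else online


primary = QueryComponent("Query component 1", "Server1")
mirror = QueryComponent("Query component 1 mirror", "ServerRC", failover_only=True)

print([c.name for c in components_for_queries([primary, mirror])])
primary.online = False
print([c.name for c in components_for_queries([primary, mirror])])
```

With the primary online, only it answers queries. Once it goes offline, the failover-only mirror takes over. Clearing the failover_only flag on the mirror would instead have both components share the query load from the start.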

The final piece here is the property database. This database stores all of the metadata associated with the index partition(s) to which it is connected. An index partition is associated with only one property database, but a property database can be connected to multiple index partitions. This SQL Server database can become a bottleneck over time as it grows. If that is the case, you can either move the database to a bigger, badder SQL Server or reduce the number of partitions associated with it.

Adding a Server to the Search Topology

Consider a scenario in which the server farm is fully configured with everything, including SQL Server, running on one machine. Another server, ServerRC, has been purchased, has the same version of SharePoint Server 2010 Enterprise installed, and is added to the farm. The initial configuration wizard has been run on the new server. This started the appropriate services on this server. To add the second server to your Search topology, follow these steps:

1. Open Central Administration ⇒ Application Management ⇒ Manage service applications.

2. Find your search service application and open the Manage interface. Remember that Search topology is defined per Search service application if for some reason you have more than one.

3. Scroll down the page and click Modify (refer to Figure 14-6).

4. Click New, and from the drop-down select Crawl Component.

5. For Server, select your new server’s name. For this example, it is ServerRC.

6. For Associated Crawl Database, select the Crawl Database from which you want this crawler to work.

7. If necessary, change the Temporary Location of Index. This location will only be used for creating the index updates before pushing them out, and it should remain relatively small. It will not increase in size as your index grows. Check out Figure 14-7 for an example and then click OK.

Figure 14-7

8. You are returned to the Manage Search Topology screen, where you will see Pending creation next to your new component. Click the Apply Topology Changes button at the bottom of the screen, unless you plan to also add the Query component in the next set of steps. If so, skip this step. A processing screen will appear and process for a few minutes. Once it is complete, you are all set.

You now have configured the two servers to share the load of the one crawl database. The next logical step is to configure your new server to also be a query server. With the second Query component, you will get a second index partition, so you will want to define a mirror for each of your two partitions:

1. Return to the Search administration screen and click the Modify Search Application Topology button.

2. Click New. From the drop-down, select Index Partition and Query Component.

3. For Server, select your new server.

4. For Associated Property Database, choose the database you want this query component to use. You haven’t created any additional ones, so there should only be one item in the list.

5. Location of Index is an important consideration. This is where the physical index files will be stored on the server. Ensure that you have enough storage capacity in your chosen location. If at all possible, this should be on its own dedicated drive.

6. Leave the Set this query component as failover-only at its default setting of unchecked as illustrated in Figure 14-8.

Figure 14-8

7. After you confirm your settings, click OK. This will automatically create Query component 2.

8. Now you have the two partitions you need to set up the mirrors. Hover over Query component 1, click the drop-down, and select Add Mirror.

9. For Server, choose the server that is currently not hosting this partition.

10. Confirm that your Index location is correct. (Remember that the C: drive is a bad place.)

11. Check the box for Set the query component as failover-only.

12. Click OK.

13. Repeat steps 8–12 for Query component 2.

14. You are returned to the Manage Search Topology screen. You will see Pending creation next to your new component. Click the Apply Topology Changes button at the bottom of the screen. A processing screen will appear and process for a few minutes. Once it is complete you are all set.

Now both servers are participating in serving Search queries and helping to crawl all of the content. You also have solid redundancy. In most environments the preceding actions will be sufficient. You have the capacity to crawl a lot of content in a reasonable amount of time and your Search components are highly available. Note that this does not include SQL Server. It is up to you to implement a high-availability solution for the databases, whether that is SQL Server clustering, taking advantage of the database mirroring support, or some third-party solution.
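If you want to convince yourself that the query side of this layout really survives the loss of either machine, a quick Python sketch can check it. The partition labels are hypothetical, the server names follow the ones used in this chapter's example, and the model looks only at query components, not at SQL Server or the Admin component.

```python
# Partition -> list of (server, failover_only) query components, matching
# the topology built in the steps above. Partition labels are illustrative.
TOPOLOGY = {
    "Partition 0": [("Server1", False), ("ServerRC", True)],
    "Partition 1": [("ServerRC", False), ("Server1", True)],
}


def partitions_still_served(topology, failed_server):
    """Map each partition to the servers that can still answer for it."""
    return {
        partition: [server for server, _ in components if server != failed_server]
        for partition, components in topology.items()
    }


def survives_loss_of(topology, failed_server):
    """True if every partition keeps at least one query component."""
    return all(partitions_still_served(topology, failed_server).values())


print(survives_loss_of(TOPOLOGY, "Server1"))   # True
print(survives_loss_of(TOPOLOGY, "ServerRC"))  # True
```

Remove the mirrors from the dictionary and the same check fails, because half of the index would disappear along with the server that hosts it.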

Scaling Up with Crawl Databases

Fast forward a little bit and your SharePoint deployment demands have increased again. You now want to add the crawling of your very large file server. Because of the size and nature of the data, you expect the crawling burden to be very high, so you choose to add another crawl database running on a dedicated SQL Server. You will also make this a dedicated database.

1. Return to the Search administration screen and click the Modify Search Application Topology button.

2. Click New and select Crawl Database.

3. For Database Server, enter the SQL Server you want to host this database. It can be the same SQL Server the rest of your farm uses, or if you’re trying to add scale because of performance constraints on your current SQL Server, it may be a dedicated SQL Server.

4. Set Database Name to anything you would like.

5. Enable the checkbox for Dedicate this crawl store to hosts as specified in Host Distribution Rules, as shown in Figure 14-9.

6. Leave the other fields as is and click OK.

Figure 14-9

At the bottom of the page you selected the option to Dedicate this crawl store to hosts as specified in Host Distribution Rules. This rule tells the database to not store anything that is not specifically added by a host distribution rule, which you will create in the next section. If you do not make this crawl database a dedicated database, then Search will automatically balance the load in this database with the other crawl database. Don’t forget to click Apply Topology Changes once you are done making updates to your topology.

If you were to now go straight into adding a host distribution rule, you would not see your new crawl database listed. That’s because you have not associated your new crawl database with a crawl component, making it useless. To fix this, you need to follow the previous steps for creating a new crawl component, but this time select the new crawl database you created. Do this on Server1 and ServerRC.

Adding a Content Source and Host Distribution Rule

In these steps you will add a file share content source and then add it to the crawl database you specified earlier:

1. Go to the Search Administration page.

2. On the left side of the page, click Content Sources.

3. Click New Content Source.

4. Specify a Name.

5. For Content Source Type, choose File Shares.

6. For Start Addresses, enter the UNC path to the share(s) you want to crawl, for example \\FileServer\Share. Note that the search crawl account needs to have “read access” to the share(s) being crawled.

7. For Crawl Settings, the default is normally correct. Crawl the whole share, not just the root folder.

8. For now, leave the crawl schedule set to None. (Crawl schedules are covered later in the chapter.)

9. Content Source Priority gives you the opportunity to mark a content source as high priority. This way, if overlapping content source crawls are taking place, you can specify which should have priority.

10. Skip over Start Full Crawl. You will do that the old-fashioned way in a moment.

11. Click OK. Figure 14-10 shows a sample configuration.

Figure 14-10

Creating a Host Distribution Rule

Now your file share content source is created. Before you start that full crawl, you need to set up your host distribution rule:

1. On the left side of the screen, click Host Distribution Rules.

2. Click the button for Add Distribution Rule.

3. For Hostname, enter FileServer. (Do not use slashes, just the actual host name. For example, if you had a content source of http://portal.contoso.com, your hostname would be portal.contoso.com. FileServer is used as the hostname here to match the file share configured earlier as \\FileServer\Share.)

4. From the Distribution Configuration, select the crawl database that you created in the earlier section.

5. Click OK.

6. Click Apply Changes. This will check to determine whether any content must be moved from one crawl database to another to comply with your new rule. If so, you are warned that this takes time and that any active/pending crawls will be paused for the duration of the move. Click the Redistribute Now button when you are ready to commit to the changes.
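If you have a long list of start addresses and want to work out the bare host name for each rule, a small Python helper can do the slicing for you. This is a convenience sketch, not something SharePoint provides; the function name is made up, and it deliberately drops any port number, which is an assumption about what belongs in the Hostname field.

```python
from urllib.parse import urlsplit


def host_for_rule(start_address):
    """Return the bare host name to enter in a host distribution rule."""
    if start_address.startswith("\\\\"):
        # UNC path: \\host\share\...
        return start_address[2:].split("\\", 1)[0]
    # Web address: urlsplit isolates the host and drops scheme, port, path.
    return urlsplit(start_address).hostname


print(host_for_rule("http://portal.contoso.com"))  # portal.contoso.com
print(host_for_rule(r"\\FileServer\Share"))        # FileServer
```

Note that urlsplit lowercases host names from web addresses, while the UNC branch returns the name exactly as typed.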

Starting a Crawl

With all of that done you are now ready to do a crawl of your content sources and watch them split up across the databases:

1. Click Content Sources on the left side of the screen.

2. Hover over File Share (your content source), click the drop-down, and select Start Full Crawl.

3. Click Search Administration on the top left.

4. Now you can get a nice can of Mountain Dew, and sit back and watch the crawler go.

Perfect! Now you have your entire file share in one dedicated crawl database with two dedicated crawlers. Keep in mind that your dedicated crawlers are still on the same crawl server as the other crawlers. If you needed more scale, you could introduce more servers into the farm, create new crawl components on those servers, and then assign those crawlers to this crawl database and remove the current two. Scaling up is as flexible as Silly Putty.

Matching Crawl Databases to Hosts

For the final trick when it comes to playing with crawl databases, you need to look at the crawl logs:

1. On the left side of the Search Administration page, click Crawl Log.

2. From the top menu bar, click Host Name.

Behold! All of your crawl databases are listed, and each one shows what hosts are included in the database.

Take a gander at Figure 14-11. It doesn’t reflect the preceding steps, but rather includes some interesting things to test your knowledge.

Figure 14-11

There are three crawl databases. Search_Service_Application_CrawlStoreDB_e2375287809744a28811d81f75273870 is the original crawl database that was created using the Initial Farm Configuration Wizard. The “Initial” in its name is a good reminder of its limitations. SearchCrawlDB1 and SearchCrawlDB2 were manually created using the Modify Topology button. SearchCrawlDB2 was configured to Dedicate this crawl store to hosts as specified in Host Distribution Rules.

Looking at the hosts, you can see content distribution at work. There are six content sources. Server3 has a host distribution rule to force it into SearchCrawlDB2. The remaining five were spread across the remaining two databases. Three of the content sources begin with sp911rc, but because they are separate sources, based on the port, they are divided accordingly.

At the top of the page there is also a link that says “If you would like the system to analyze your current distribution and make recommendations for redistribution, click here.” Clicking that link on this server produces the report shown in Figure 14-12.

Figure 14-12

That’s rather impressive. Search looked at how your hosts were currently distributed versus the amount of content in each and suggested changes to better balance the databases. Keeping perfect balance is very difficult, as each host has to reside in only one crawl database; but in an environment with many hosts, this can go a long way. At the bottom is a Redistribute Now button if you want to have the changes implemented for you. If you click this button, SharePoint will automatically configure new Host Distribution rules for you and update the crawl databases as necessary. Don’t forget that all crawls are paused while this process runs.

Once the rules are created, you will be brought back to the Host Distribution Rules page. Here you will see a Redistribution status across the top of the page, with a percentage complete. The page will automatically refresh every 10 seconds while the distribution runs.

After everything is done you can return to the Auto Host Distribution page and let it check again. You will see something similar to Figure 14-13.

Figure 14-13
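
To get a feel for why perfect balance is hard when each host can live in only one crawl database, here is a small Python sketch of one possible balancing strategy. It is not SharePoint's actual algorithm, which the product does not expose; it is a simple "largest host first, emptiest database next" heuristic. The function name, host names, and item counts are all made up for illustration.

```python
import heapq


def suggest_distribution(host_items: dict, databases: list) -> dict:
    """Assign each host to exactly one crawl database.

    Illustrative greedy heuristic only; hosts are placed largest first
    into whichever database currently holds the fewest items.
    """
    # Min-heap of (items assigned so far, database name).
    heap = [(0, name) for name in databases]
    heapq.heapify(heap)
    assignment = {}
    # Largest hosts first; ties broken by host name for repeatable output.
    for host, count in sorted(host_items.items(), key=lambda kv: (-kv[1], kv[0])):
        load, name = heapq.heappop(heap)
        assignment[host] = name
        heapq.heappush(heap, (load + count, name))
    return assignment


hosts = {"portal": 500, "FileServer": 300, "mysites": 200}  # hypothetical counts
print(suggest_distribution(hosts, ["SearchCrawlDB1", "SearchCrawlDB2"]))
```

With these numbers the two databases end up with 500 items each, but swap in one very large host and no assignment can even things out.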

Adding a Property Database

Now imagine that after looking at your query performance you find that your property database has become the bottleneck. Your overabundance of metadata and SQL disk I/O have combined to slow things down. Time to add a new database:

1. Open Search Administration.

2. Scroll down the page and click the Modify button under Search Application Topology.

3. From the toolbar, click New and select Property Database.

4. The defaults here are typically good, but if you want to give the database a new name or have it hosted on a different SQL server, make those changes now. Once you are done click OK.

Now the database is created, but it is still not in use. You have to first associate it with a Query component:

1. Click Query Component1, and from the drop-down select Edit Properties.

2. For Associated Property Database, click the drop-down and select the new database you created.

3. Click OK.

Now you are still in an awkward position. When you change a Query component to be associated with a new property database, a new index partition is created as a by-product. That’s because the index partition is associated with a specific property database and cannot be changed. This means that you now need to reevaluate your index partitions. For example, the partition you just created doesn’t have a mirror. You need to add a mirror to it. And the old partition is gone but the mirror of that partition is still floating out there associated with the wrong property database. Once you get everything straightened out, be sure to apply your changes.

The Search UI

After you put so much work into configuring your topology and then working through the administration interfaces, it’s easy to assume you are done. Don’t clock out quite yet. While the UI is a wonderful thing that will “just work,” there is so much more you can get out of it with a little understanding and tweaking. Even more exciting is the fact that you can delegate this work to a site collection administrator. The following sections describe some of the ways you can tweak the UI.

The Search Box

Everyone knows how to use the Search box: You enter your search query, hit Search, and then get the results. Pretty straightforward — but as noted in the SharePoint Foundation section, you can do a handful of cool things in this box:

Wildcard searches — Wildcards enable you to broaden your search by using symbols to represent characters. For example, you can simply type Sh* to search for all words that begin with the letters Sh. Note that the wildcard works only at the end of a word: you can search for share*, but not for *point. Also, keep in mind that while a wildcard search can help you find more good results, it is also going to return more bad results. Relevancy is greatly reduced when you search with wildcards.
Boolean searches — This searching method enables you to narrow or broaden your search using terms such as AND, OR, and NOT. It is important that you capitalize the Boolean terms. Also worth noting is the use of quotation marks (“ ”) around phrases. For example, you could do a search such as (“Accounting Policy” OR “Accounting Procedures”) AND Termination. This would return all search results that contain either Accounting Policy and Termination, or Accounting Procedures and Termination.
Range refinements — You can do range refinements using the =, >, <, <=, and >= operators. The previous version of SharePoint accepted these operators to help you refine property restrictions; it just didn’t do it very well. Who knew those could be used for something more than making emoticons?
Property searches — For years we have had a property search capability but it was apparently secret. In the search box, you can type title:“Vacation policy” or author:Shane and do a search on specific properties. Any of the Managed Metadata properties can be used. They are discussed later in the chapter.
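
If you find yourself handing the same query patterns to users over and over, a few lines of Python can assemble them consistently. This is just a sketch of string building based on the syntax described in the list above; the helper names (phrase, prop, any_of, all_of, prefix) are hypothetical and nothing here talks to SharePoint.

```python
def phrase(text: str) -> str:
    """Wrap a multi-word phrase in quotation marks."""
    return f'"{text}"'


def prop(name: str, value: str) -> str:
    """Build a property restriction such as author:Shane."""
    if " " in value:
        value = phrase(value)
    return f"{name}:{value}"


def any_of(*terms: str) -> str:
    """Join terms with OR (capitalized, as the search box expects)."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with AND (capitalized, as the search box expects)."""
    return " AND ".join(terms)


def prefix(stem: str) -> str:
    """Build a wildcard term; the * is only allowed at the end of the word."""
    if "*" in stem:
        raise ValueError("put only the word stem here, without a wildcard")
    return stem + "*"


print(all_of(any_of(phrase("Accounting Policy"), phrase("Accounting Procedures")), "Termination"))
print(prop("title", "Vacation policy"))
print(prefix("Sh"))
```

Each printed line can be pasted straight into the Search box.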

Relevancy Improvements

Every iteration of a good search engine improves the magic that drives search results, and SharePoint is no exception. Although most of the updates are closely guarded secrets, there are a couple that can be shared.

Phrase matching support has been added. For example, when you search for sales presentation, results with sales and presentation together will be ranked higher than results with sales and presentation in the document but not together.

Clickthroughs count. A clickthrough is the way the search page captures your activity. When you do a search and get back results, Search continues to monitor your activity by noting which links you click. For example, if you search for policy, and after reviewing the list of files you click on the third document, SharePoint makes a note of that. Over time, if people searching for policy continue to click on the third document, SharePoint will adjust that document and return it higher in the results. This is a pretty powerful feature, driving better search results as your users simply do their normal activities.

In Chapters 16 and 17 you will learn about different ways of adding metadata to documents. One of those features is social tagging. Whether it is on pages, documents, or entire sites, tags are a help to Search. Search looks at these social tags and gives them increased weight, especially if the same content is tagged repeatedly with the same tag. Once again, Search knows your users matter and it updates its indexes to reflect their activities.
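
The real ranking model is one of those closely guarded secrets, but the clickthrough idea described above is easy to picture with a toy example. The Python below is purely illustrative: the scoring formula, the weight, and the sample data are all assumptions, and none of it reflects how SharePoint actually computes rank.

```python
import math
from collections import Counter


def rerank(results: list, clicks: Counter, weight: float = 0.1) -> list:
    """Reorder (url, base_score) pairs using a small clickthrough boost.

    Toy model only: base score plus weight * log(1 + clicks).
    """
    def boosted(item):
        url, base_score = item
        return base_score + weight * math.log1p(clicks[url])

    return [url for url, _ in sorted(results, key=boosted, reverse=True)]


results = [("policy-a.docx", 1.0), ("policy-b.docx", 0.9), ("policy-c.docx", 0.8)]
print(rerank(results, Counter()))                       # no clicks recorded yet
print(rerank(results, Counter({"policy-c.docx": 50})))  # users keep clicking the third hit
```

With no click history the original order stands; once the third document collects enough clicks, it moves to the top.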

Refiners

When you do a search, notice the list of properties on the left-hand side of the page, as shown in Figure 14-14. These are called refiners. For example, you can click on Word under Result Type and your search results will be narrowed down to only include Word documents. You could then click on a specific author to further refine your results. This list of refiners is built from the first 50 search results, meaning it is not all inclusive if you have a large set of results. A small note if you were using FAST Search — the refinement panel is based on all the search results, not just the top 50.

Figure 14-14
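
The "first 50 results" limit is worth seeing in action, because it explains why a refiner value can be missing even though matching documents exist. The following Python sketch counts refiner values from a sample of results; the field names and sample data are hypothetical, and the code only mimics the behavior described above rather than calling SharePoint.

```python
from collections import Counter
from typing import Optional


def build_refiners(results: list, fields=("result_type", "author"),
                   sample_size: Optional[int] = 50) -> dict:
    """Count refiner values over the first sample_size results.

    Pass sample_size=None to count over every result, which is how the
    text describes the FAST Search refinement panel.
    """
    sample = results if sample_size is None else results[:sample_size]
    return {
        field: Counter(r[field] for r in sample if field in r)
        for field in fields
    }


# 50 Word documents rank ahead of 10 Excel workbooks (made-up data).
results = [{"result_type": "Word", "author": "Shane"}] * 50
results += [{"result_type": "Excel", "author": "Todd"}] * 10

print(build_refiners(results)["result_type"])                    # Excel never shows up
print(build_refiners(results, sample_size=None)["result_type"])  # Excel is counted
```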

Search Alerts and RSS Feeds

Sometimes you might need to do the same search repeatedly. And while the search page is pretty cool and you enjoy checking it every day, repeating a search may not be the best use of your time. A better option would be to click the search alert icon (labeled 1 in Figure 14-15) to get search alerts. This way, every time the search results are updated for your query, SharePoint will send you an e-mail. You could also use the RSS Feed icon (labeled 2 in the figure) to subscribe to an RSS feed of your search results.

Figure 14-15

Windows 7 Desktop Search Add-on

If you perform your search from a Windows 7 machine you will see the Desktop Search icon (labeled 3 in Figure 14-15). Clicking this icon will add a search connector to your Windows 7 desktop Search. With this connector you can search your SharePoint site right from your Windows machine. (You will see your SharePoint site in Explorer under Favorites.)

View in Browser

If you have the Office Web Applications installed (see Chapter 19), the View in Browser link will appear, giving your users the option to quickly view the document in the browser without having to download it. Functionality previously only available with third-party hardware now just works out of the box with no effort on your part.

Query Federation

Query federation enables you to add search results from any OpenSearch-compliant search engine to your SharePoint site. These results appear in a separate Web Part on the right-hand side of the screen and are not intermixed with your SharePoint results. Also, this Web Part is asynchronous by default, which means it will load independently of the rest of the page, so you aren’t waiting on it to get your SharePoint results. For example, you might set up a special search page for your research group that searches your SharePoint indexes and Bing at the same time, helping the group to discover information quicker and with one search instead of two.

This federation is also very useful in scenarios where a company is geographically dispersed and has multiple SharePoint farms. Often these companies want to have search results from all farms but don’t want the hassle and expense of having SharePoint crawl across the WAN. Instead, they set each farm to crawl itself, and then use Search Federation to display results from both farms on the same page. Remember, though, these are two separate sets of results and will not be combined.
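
Under the covers, an OpenSearch-compliant engine advertises a URL template containing a {searchTerms} placeholder, and the federating side fills that placeholder with the user's query. The Python below is a minimal sketch of that substitution. The {searchTerms} parameter comes from the OpenSearch specification; the example URL and the fill_template helper are hypothetical.

```python
from urllib.parse import quote


def fill_template(template: str, search_terms: str) -> str:
    """Substitute a URL-encoded query into an OpenSearch URL template."""
    if "{searchTerms}" not in template:
        raise ValueError("template has no {searchTerms} placeholder")
    # safe="" makes sure characters such as / and & in the query are encoded.
    return template.replace("{searchTerms}", quote(search_terms, safe=""))


# Hypothetical remote farm or search engine that returns RSS results.
template = "http://search.example.com/results?q={searchTerms}&format=rss"
print(fill_template(template, "sales presentation"))
```

Because each federated location gets its own request like this one, its results come back as a separate set, which is why they are shown in their own Web Part.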

Extensible Web Parts

Extensible Web Parts sounds an awful lot like a developer topic, and for the most part it is, but as a good admin you should be familiar with some of the options.

The first option is done through the browser. By editing the page and then modifying the search results Web Parts, you can introduce custom XSLT to make search prettier. Additionally, you can modify the Config XML to control what properties are returned with the search results.

From a pure “I only use Visual Studio” developer perspective, there are two major changes to note. First, most of the search Web Parts are now public, so developers can tap into them and extend functionality. A great example of this is what the FAST team did. When you add FAST Search, you are just using the normal SharePoint Search Web Parts with FAST bolted on top of them. This reduces their development time and your administrative learning curve because the Web Parts have a very familiar feel to them. The second thing to note is that there are no more hidden query objects. In SharePoint 2007, the communication between the Web Parts was not accessible by developers, so if they wanted to add a Search Web Part to the page they would have to perform their own query for search results instead of taking advantage of the results being used by the out-of-the-box Web Parts.

Did You Mean…?

The “Did you mean…” feature offers suggestions based on what you have searched for. Figure 14-16 shows the user searched for sahrepoint, and even though there were no results, Search suggested sharepoint. If you click the link, the search will be re-run with sharepoint in place of sahrepoint. The downside of this functionality is that it isn’t configurable.

Oddly, if you are trying to find this Web Part in the list, look for Search Summary.

Figure 14-16
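
SharePoint does not document how it picks its suggestion, but the general idea of matching a misspelled query against known terms can be sketched with Python's standard difflib module. The list of known terms, the cutoff value, and the did_you_mean helper are all assumptions made for this example.

```python
import difflib
from typing import Optional


def did_you_mean(query: str, known_terms: list, cutoff: float = 0.8) -> Optional[str]:
    """Return the closest known term, or None if nothing is close enough."""
    normalized = query.strip().lower()
    matches = difflib.get_close_matches(normalized, known_terms, n=1, cutoff=cutoff)
    if matches and matches[0] != normalized:
        return matches[0]
    return None  # the query is already a known term, or nothing is similar


known_terms = ["sharepoint", "policy", "expense"]  # hypothetical index vocabulary
print(did_you_mean("sahrepoint", known_terms))     # sharepoint
```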

Search Suggestions

As shown in Figure 14-17, Search will offer suggestions as you type. It “learns” to offer these auto-complete suggestions over time by tracking the searches of users.

Figure 14-17
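
The "learning" here boils down to remembering what people have searched for and offering the most popular matches. As a rough illustration, and not a description of SharePoint's internals, the Python sketch below ranks past queries that start with what the user has typed so far. The query log and the suggest helper are hypothetical.

```python
from collections import Counter


def suggest(typed: str, query_log: list, limit: int = 5) -> list:
    """Return the most frequent past queries that begin with the typed text."""
    counts = Counter(q.strip().lower() for q in query_log)
    start = typed.strip().lower()
    matches = [(q, n) for q, n in counts.items() if q.startswith(start)]
    # Most popular first; alphabetical order breaks ties.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [q for q, _ in matches[:limit]]


query_log = ["expense report", "Expense Report", "expense policy", "vacation policy"]
print(suggest("exp", query_log))
```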

Search Administration

There are two places to administer SharePoint Search. At the site collection level, site collection administrators have a set of tools and settings that apply to just their site collection. At the service application level, you can administer settings that affect all site collections associated with the service application.

At the Site Level

When you specify someone as a site collection administrator, you give them a world of new buttons and knobs to operate. An important set of these knobs is for Search. These knobs are all located under the Site Collection Administration section on the Site Settings page.

Search Settings

The first option is Search settings. From Search settings you can specify which Search Center to use for the site collection, how the drop-down box should behave, and which search results page to use if you do not have a Search Center defined.

If you are unfamiliar with the term, a Search Center is a special SharePoint web template customized for search. It is preconfigured with a search page and a search results page, and it uses a special master page. This master page maximizes the screen space for displaying search results.

Interestingly enough, for most templates you do not get a Search Center by default, so even though you have SharePoint Server you are using the Foundation Search UI. Yucky. Let’s look at how to fix that:

1. Create a new site collection using the Team Site template at http://yourwebapp/sites/st.

2. Open the site collection as a site collection administrator.

3. Click Site Actions ⇒ New Site.

4. Choose Basic Search Center as the template.

5. Set the name to Search Center.

6. Set the URL to SearchCenter.

7. Click Create.

Now you have a Search Center ready to use; it just needs to be connected:

1. Click Site Actions ⇒ Site Settings.

2. Under Site Collection Administration, click Go to top level site settings.

3. Under Site Collection Administration, click Search settings.

4. For Site Collection Search Center, select Enable custom scopes (such as “All Sites”) by connecting this site collection with the following Search Center:, and enter /sites/st/SearchCenter in the box.

5. For Site Collection Search Dropdown Mode, select Show scopes dropdown.

6. Confirm that your settings match those in Figure 14-18 and then click OK.

Figure 14-18

7. Test it out by navigating to the root of your site collection and doing a search from the box at the top of the page. If you get search results from the Search Center you just created, you are all set.

If you try to create an Enterprise Search Center using the previous steps you will get an error message. To use this template you must first activate the site collection feature SharePoint Server Publishing Infrastructure.

Search Scopes

The next setting in the Site Collection Administration menu is for Search scopes. Scopes are covered later in the chapter in the “Queries and Results” section. This is the menu you use to determine what global search scopes you will use in your site collection or to create your own specifically for this site collection.

Search Keywords

From the Search Keywords screen you can add a keyword and then associate best bets with the keyword. This is best explained with an example. You get back from a company trip to Hawaii for the SharePoint is Awesome Conference and try to find the blank expense report. You open SharePoint and do a search for “expense report” and get about 5,000 results. Yikes. Somewhere in there is the blank report along with the HR policy covering what is acceptable for reimbursement. Good luck finding those needles in the haystack.

To avoid this, you can set up a keyword called “Expense Report.” With the keyword you can add a definition like “You have three days to submit these to accounting with your manager’s signature to get reimbursed.” Then you can associate best bets with this keyword and definition.

A best bet is a link to content that is most likely to be what the searcher is looking for. So you would have best bets to the blank expense report and the policy file. Now when you search for “Expense Report,” you will see something similar to Figure 14-19. Note that keywords and best bets are defined per site collection, which might alter your planning for their use.

Figure 14-19
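
Conceptually, a keyword is a lookup: the query comes in, and if it matches a keyword, the definition and best bets are shown above the regular results. The Python sketch below models that lookup. The exact-match behavior is an assumption of this example, the URLs are hypothetical, and the definition text is the one used above.

```python
from typing import Optional

# One keyword with its definition and best bets (URLs are made up).
KEYWORDS = {
    "expense report": {
        "definition": ("You have three days to submit these to accounting "
                       "with your manager's signature to get reimbursed."),
        "best_bets": [
            "http://portal.contoso.com/forms/BlankExpenseReport.xlsx",
            "http://portal.contoso.com/hr/ReimbursementPolicy.docx",
        ],
    }
}


def lookup_keyword(query: str, keywords: dict = KEYWORDS) -> Optional[dict]:
    """Return the keyword entry for a query, ignoring case and extra spaces."""
    normalized = " ".join(query.lower().split())
    return keywords.get(normalized)


entry = lookup_keyword("Expense Report")
print(entry["definition"])
for url in entry["best_bets"]:
    print("Best bet:", url)
```

In this sketch a query such as "expense reports" would find nothing, which is a good argument for planning your keywords around the words your users actually type.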

At the Service Application Level

The Search Administration page on the Search Service Application is your one-stop-shop for all things search-related. On this page, you’ll find the System Status section, which provides you with a report of your search status. Below the System Status report is the Crawl History. This provides you with a report of the most recent crawls, including what was crawled, what type of crawl it was, when it started and ended, how long it took, and the number of successes or errors encountered during the crawl. Below the Crawl History is the Search Topology section, which gives you an overview of the various Search components in your farm. Figure 14-20 shows the Search Administration page.

Figure 14-20

Along the left side of the Search Administration page are links for setting up the different configuration options for Search in your farm. These links are divided into four categories: Administration, Crawling, Queries and Results, and Reports. The following sections briefly cover each of these links.

Administration

The Administration category contains two links. The first link, Search Administration, as you may guess, is a link to the Search Administration page. When navigating through the Search settings, this link can take you back to the home page for administering Search. The second link, Farm Search Administration, takes you to the high-level administration page for setting up components of the farm’s Search.

Crawling

This is where you will be spending the bulk of your time as you configure the Search Service Application to crawl content in your farm, as well as check the status of previous crawls, set up crawl rules, manage your index, and configure the file types that should be crawled, among other options. The following list outlines the available Crawl settings.

Content Sources — SharePoint can’t crawl what it can’t find. Use the Content Sources link to define what SharePoint will be crawling. Lucky for you, SharePoint was nice enough to automatically create a default content source for you, which includes all your existing SharePoint web applications, as shown in Figure 14-21. (Any web applications added after Crawl is configured are also automatically added to this default source.)

Figure 14-21

You can create a new content source by clicking the New Content Source link on the toolbar. You are not limited to crawling SharePoint sites, however. SharePoint 2010 enables you to create six different types of content sources:

SharePoint sites — You can set up a separate content source for SharePoint sites other than the default content source. This can be helpful if you need to create separate crawl schedules for different web applications.

If you are using claims authentication on the SharePoint web application, the claim is stored. If you are using NTLM, the ACL is stored. The exception to this is when the ACL exceeds 64KB; in this case, Search will automatically convert it to a claim to avoid problems with an oversized ACL.

Web sites — Non-SharePoint websites can be crawled and indexed by SharePoint Search, and made part of the Search index. For instance, maybe your organization uses SharePoint to host its intranet, but the public-facing Internet site is a traditional website. Because useful information is also posted on the public site, you could set up a crawl source of that website to include in SharePoint Search results.
File shares — SharePoint Search isn’t limited to crawling only websites. You can also provide a path to a shared network drive to index the files and content there. This can be helpful for organizations that have a large amount of content on a network share. If a wholesale migration of that content into SharePoint isn’t practical or feasible, crawling the share can be a handy way to provide easier access to those files.
Exchange public folders — SharePoint knows how to talk to Exchange to index public folders. In addition, Exchange 2007 and 2010 have change logs that SharePoint can access, enabling it to perform true incremental crawls against these sources.
Line of business data — This option is similar to the Business Data content source option from SharePoint 2007. If you have an Enterprise license for SharePoint 2010, you can search external data sources you have set up within SharePoint. You can crawl all external data sources or select specific data sources to be included in the content source.
Custom repository — In SharePoint 2010 you can connect to additional content sources by creating your own custom connectors. Protocol handlers from MOSS 2007 have been deprecated and replaced with these connectors. The best part is that the connector framework is common across SharePoint. The same technology that allows the BDC to connect to external sources is used by Search.

Once you’ve specified the name of your new content source and configured the options, you are essentially ready to go. You can also create a crawl schedule when you create the content source, or set it later. Any content source can be edited later by clicking its name on the Content Sources page (or by clicking the drop-down around it and selecting Edit). You can’t change the content source’s type once it has been set, however.

From this page, you can also start the crawls of your various content sources by clicking the drop-down menu for the content source and selecting the type of crawl you want to perform. During a crawl, you can monitor the progress from this page as well.

Types of crawls — Setting up a content source is one thing, but your Search Service is just going to sit there twiddling its thumbs until it knows when it’s supposed to do something with those content sources you created. That’s where a crawl schedule comes in handy. Setting a crawl schedule tells SharePoint when and how often to crawl a content source, and what type of crawl to perform.

Two types of crawls can be scheduled: a full crawl or an incremental crawl. A full crawl is one that crawls every bit of content it can find in the content source, and keeps crawling until there is nothing left to crawl. Because full crawls cover all content in a content source, they can be fairly lengthy, especially if you have a lot of content. Conversely, an incremental crawl is generally much faster. It crawls only content that has been changed since the last crawl was performed. It does this by referencing the change log. Incremental crawls typically run much more often than full crawls.

Setting a crawl schedule — When creating or editing a content source, you can set the crawl schedule at the bottom of the page. If no crawl schedule is set, click one of the Create schedule links. You can choose from daily, weekly, or monthly (see Figure 14-22). The specific settings vary according to the option chosen. You can get pretty granular when setting up a crawl schedule. Full and incremental crawls can be run on different schedules. A general rule of thumb is that you want to set your crawl to run during a low-usage time for the sites, such as very late at night or on a weekend, when traffic to the site is typically low, especially for a full crawl. Incremental crawls can run more frequently, and it’s usually recommended to do so to keep the search results fresh. Also, it’s best to avoid running crawls during backup times to prevent unnecessary server strain.

Figure 14-22
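
The difference between a full crawl and an incremental crawl comes down to how the list of items to visit is built. Here is a minimal Python sketch of that idea, assuming a change log that is just a list of (item, timestamp) pairs. The data structures and the items_to_crawl helper are invented for illustration; the real change log is considerably richer (it also tracks deletions, for instance, which this sketch ignores).

```python
from datetime import datetime
from typing import Optional


def items_to_crawl(all_items: list, change_log: list,
                   last_crawl: Optional[datetime] = None) -> list:
    """Pick the items a crawl would visit.

    No last_crawl means a full crawl: every item is visited.
    Otherwise only items changed after last_crawl are visited.
    """
    if last_crawl is None:
        return sorted(all_items)
    changed = {item for item, changed_at in change_log if changed_at > last_crawl}
    return sorted(changed)


items = ["a.docx", "b.docx", "c.docx"]
change_log = [
    ("a.docx", datetime(2010, 5, 1)),
    ("b.docx", datetime(2010, 5, 3)),
    ("b.docx", datetime(2010, 5, 4)),
]
print(items_to_crawl(items, change_log))                        # full crawl
print(items_to_crawl(items, change_log, datetime(2010, 5, 2)))  # incremental crawl
```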

Crawl rules — By default, Search is eager to go out and crawl everything it can find. That’s awfully generous of it, but you may want to restrict some places. You can do this by setting a crawl rule to exclude content. (Crawl rules can also be used to include specific content in an area that has otherwise been excluded from search.) Once you tell SharePoint which URL it should exclude (or include) and set a few additional parameters, you will have a newly created crawl rule. You can even set up a crawl rule to crawl a specific set of content with an account other than the default search account. This can come in handy if you need to crawl a site using basic authentication — simply set up a crawl rule to use the basic authentication account to search the site.
Crawl log — The crawl log is a detailed report of the crawl activity in your farm. If you notice your search results seem a little “off,” you should head to the crawl log to see what’s going on. SharePoint keeps track of all the items it is able to reach successfully, which content it had trouble reaching, and which areas it could not reach. You can use the links at the top of the Crawl Log page to filter and drill down into your crawl results.
Server name mappings — Server name mappings are used when search results display a path to a file that may cause access issues, or when the actual location of a file pulled into Search shouldn’t be revealed to users. For instance, you may have a shared drive mapped but do not want to display the actual path to that drive for security reasons. You could set up a mapping to change how SharePoint displays the path to that file to users performing the search.
Host distribution rules — In farms with more than one Search database, you can use this page to set a specific host for a crawl database. You can use this for optimization or organization purposes. However, you won’t be able to set any rules if your SharePoint farm has only one database. These were covered in the earlier section “Search Topology.”
File types — This lists all the types of documents (by file extension) that SharePoint is set up to include in its search index (see Figure 14-23). The list is quite extensive — nearly 50 file types are included out of the box. Common file types such as the Office file types and web file types (such as HTML) are included. You can add a new file type by clicking the New file type link on this page. One commonly used file type you won’t see listed by default is the PDF file type. You need to add this file type to the list of files SharePoint should index (and it would be beneficial to install a PDF iFilter in order to allow Search to index the contents of PDF files).

Figure 14-23
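
Crawl rules are easier to reason about once you picture them as an ordered list of patterns, each marked include or exclude. The Python sketch below models that, assuming the rules are checked in order and the first match wins; that ordering, the wildcard matching via fnmatch, and the sample URLs are assumptions of this example rather than a description of SharePoint's code.

```python
from fnmatch import fnmatchcase


def should_crawl(url: str, rules: list) -> bool:
    """Decide whether a URL is crawled, given ordered (pattern, include) rules."""
    for pattern, include in rules:
        # Compare case-insensitively; * matches any run of characters.
        if fnmatchcase(url.lower(), pattern.lower()):
            return include
    return True  # no rule matched, so the content is crawled by default


rules = [
    ("http://portal.contoso.com/archive/keep/*", True),   # include exception
    ("http://portal.contoso.com/archive/*", False),       # exclude the rest
]
print(should_crawl("http://portal.contoso.com/archive/keep/plan.docx", rules))
print(should_crawl("http://portal.contoso.com/archive/old.docx", rules))
print(should_crawl("http://portal.contoso.com/sites/hr", rules))
```

Notice that the include rule has to come before the broader exclude rule for the exception to take effect in this model.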

Index reset — Generally speaking, SharePoint Search works just the way it should. However, sometimes a change is made on the SharePoint server that prevents Search from working correctly, or it just isn’t behaving the way it should. Or, maybe you’re noticing more errors than successes in your crawl logs. In these cases, you may need to reset the search index, which completely deletes everything in the index, including the search property database, until a full crawl is run. Usually you would want to use this as a last resort, especially if performing a full crawl takes a massive amount of time in your environment.

This page also gives you an option to deactivate search alerts during the reset. This prevents you from flooding your users’ in-boxes with search alerts (if they’ve signed up for them) as the site is re-indexed. You’ll need to reenable the alerts once the site has been recrawled. (This is done from the Search Administration page. In the System Status section, click the Enable link next to Search alerts status.)

Crawler impact rules — SharePoint provides administrators with a way to control the impact that Search has on the server through the use of crawler impact rules. Even on powerful hardware, the search process can heavily tax a server or even the farm, depending on how your farm is configured. Although it is best to run crawls at times when very few people are using the server, this isn’t always possible. That’s where crawler impact rules can come in handy. If you have a heavily used site, you can tell SharePoint to throttle itself back a little when crawling that site. Conversely, if you have an extremely large number of documents on which you want to perform a full crawl as fast as possible, you can use crawler impact rules to help with this too.

You have a couple of options when setting crawler impact rules. You can increase or reduce the number of documents that Search requests simultaneously. This can impact server performance, so pay attention to how SharePoint behaves when changing this setting. If you notice significant slowdown during a crawl, you may want to lower the number of documents requested. You also have the option to request one document at a time and specify an amount of time for SharePoint to wait before requesting the next. Although crawl times will be significantly increased, server impact is negligible.

Crawler impact rules can also be quite helpful when you are crawling content that is located outside of SharePoint. Because you are a savvy SharePoint administrator, you have configured your SharePoint indexing server with a lot of muscle. This can’t necessarily be said for every server in your enterprise. Many a server has been brought to its knees when a multi-threaded indexing process is unleashed on an unsuspecting web server. In this case, use crawler impact rules to limit the number and frequency of the external server requests.
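
Before you throttle a crawl, it helps to estimate what the trade-off will cost you in time. The following Python is a back-of-the-envelope sketch, not a SharePoint tool: it assumes every document takes the same time to fetch, and the function name and numbers are made up.

```python
import math


def estimated_crawl_seconds(documents: int, seconds_per_document: float,
                            simultaneous_requests: int = 1,
                            wait_seconds: float = 0.0) -> float:
    """Roughly estimate crawl duration for the two crawler impact options."""
    if simultaneous_requests < 1:
        raise ValueError("simultaneous_requests must be at least 1")
    if simultaneous_requests > 1:
        if wait_seconds:
            raise ValueError("a wait applies only when requesting one document at a time")
        # Documents are fetched in parallel batches.
        batches = math.ceil(documents / simultaneous_requests)
        return batches * seconds_per_document
    # One document at a time, pausing between requests.
    pauses = max(documents - 1, 0)
    return documents * seconds_per_document + pauses * wait_seconds


print(estimated_crawl_seconds(1000, 0.5, simultaneous_requests=8))  # 62.5
print(estimated_crawl_seconds(1000, 0.5, wait_seconds=2.0))         # 2498.0
```

In this made-up case, the gentle option takes roughly 40 times longer, which is the price you pay for leaving that unsuspecting web server standing.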

Queries and Results

The Queries and Results section in the Search Administration page’s quick launch lets you configure settings related to how Search queries are handled and how results are displayed. You can use this area to fine-tune your users’ search experience. Available settings include the following:

Authoritative pages — Administrators can specify which pages in the site are the most authoritative, or contain the most relevant information for which users are likely to be searching. You can specify as many authoritative pages as you want, and even assign pages as the most authoritative, second-most authoritative, and third-most authoritative. Likewise, you can even specify pages that should be demoted in search results. SharePoint’s search results are calculated using these pages, and the way it displays results is weighed against the pages’ content and the level of authoritativeness assigned to them.
Federated locations — A feature introduced to SharePoint 2007 with the Infrastructure Update was the capability to incorporate federated locations into SharePoint search results pages. This feature has been carried over to SharePoint 2010 and is available out of the box. Setting up federated locations enables a user’s query to be performed on multiple, alternative sources along with the standard SharePoint search index. Internet search engine results can be incorporated into the results, as well as databases and search scopes defined for SharePoint. This can give users richer results and provide them with more information than they would otherwise receive.

Search connectors can be downloaded from Microsoft and imported as a new location in the federated locations. In addition, you can specify triggers for when federated search content should display in the results, and set patterns and prefixes that narrow the results to more specific options. Numerous options and configurations are available when using federated search locations, a topic beyond the scope of this section.

Metadata properties — Nearly all content in SharePoint has some sort of metadata associated with it, which Search can use when ranking content for display on the results page. Metadata is also used when filtering search results (e.g., by document type or author). The metadata properties link in Search Administration shows the mapping properties set up for each bit of metadata. A mapped property is how SharePoint Search maps the available metadata fields to other metadata properties it knows about.
Scopes — Scopes provide a way for users to narrow their search results before even performing a query. Generally, when a search is performed, it is done against all the content in the index. While this is useful for broad searches, sometimes users may want to search only a specific site or set of data. Scopes can be set up to narrow the field of search down to a smaller subset of the entire search index. When you set up a scope, you first give it a name, then define the rules that will apply to the scope. A specific scope can use a different results page than a standard search.

When setting up the rules, you define the type of rule that will be associated with the scope. A scope can be set up to search a specific web address (or addresses), a specific property, a content source you have defined, or all content. Various configurations are associated with each rule type, and rules can be used to include or exclude content, or to return a very narrow set of results by requiring the results to match the rule exactly. In addition, more than one rule can be created for a scope. For instance, you could create a scope that queries only a specific site for documents created by a specific person. Or, perhaps you want to create a scope that searches all the content available except for one specific site. In that case, you would create one rule for that scope to include all content, and a second rule to exclude the specific site from the results. As you can see, scopes can be set up to be as specific as you need them to be.

Every 15 minutes, the scopes are updated to include the results specified. Therefore, when you create a new scope or modify an existing scope, you may not have results right away, until the update is run. You can force an immediate update from the Search Administration page in the System Status section. There is a Start update now link next to Scopes needing update.

Search result removal — If you need to remove content from the search results immediately, then you have come to the right place. There are several reasons why you might want to remove content from your search results — perhaps some sensitive documents have been uploaded to a document library with incorrect permissions, or perhaps your company is starting work on a project to which only a few select people should have access. Although setting the proper permissions for sites will take care of the majority of issues that could occur with search accessing content it’s not supposed to, it’s possible that a user could accidentally set the wrong permissions for a site or document library, exposing the content to Search.

If this happens and you need to keep the content from appearing in search results immediately, simply enter the address of the site in the field on this page and click the Remove Now button. This removes the specified URLs from the search results immediately, and adds the addresses to the Crawl Rules to exclude that content from future crawls.

Note that while removing a document or correcting permissions issues will prevent users from accessing content, it will still show up in the search results until the next scheduled crawl. This is because the search results are returned from the index compiled from the last crawl, so real-time results are not displayed. Using the Result Removal tool is much faster than waiting for the next crawl to run or even starting a new crawl to update the index. The added benefit is that it also creates the crawl rule for you.
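Returning to the scope rules described earlier in this list, the way include, require, and exclude rules combine can be sketched in a few lines of Python. This is a conceptual model only: the `ScopeRule` class, the helper functions, and the sample URLs are hypothetical names invented for this example, and the combination logic (exclude wins, every require rule must match, at least one include rule must match) is an assumption about how such rules are typically evaluated, not SharePoint's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Item = Dict[str, str]  # hypothetical indexed item, e.g. {"url": ..., "author": ...}


@dataclass
class ScopeRule:
    matches: Callable[[Item], bool]  # test applied to each indexed item
    behavior: str                    # "include", "require", or "exclude"


def url_starts_with(prefix: str) -> Callable[[Item], bool]:
    """Web address rule: match items stored under the given address."""
    return lambda item: item.get("url", "").startswith(prefix)


def property_equals(name: str, value: str) -> Callable[[Item], bool]:
    """Property rule: match items whose property has the given value."""
    return lambda item: item.get(name) == value


def all_content(item: Item) -> bool:
    """All-content rule: match every item in the index."""
    return True


def in_scope(item: Item, rules: List[ScopeRule]) -> bool:
    """Assumed semantics: exclude wins, all requires must match, any include matches."""
    includes = [r for r in rules if r.behavior == "include"]
    requires = [r for r in rules if r.behavior == "require"]
    excludes = [r for r in rules if r.behavior == "exclude"]
    if any(r.matches(item) for r in excludes):
        return False
    if not all(r.matches(item) for r in requires):
        return False
    if includes:
        return any(r.matches(item) for r in includes)
    return bool(requires)  # a scope with no rules contains nothing


def scope_urls(items: List[Item], rules: List[ScopeRule]) -> List[str]:
    return [item["url"] for item in items if in_scope(item, rules)]


docs = [
    {"url": "http://intranet/sites/hr/policy.docx", "author": "Pat"},
    {"url": "http://intranet/sites/sales/forecast.xlsx", "author": "Pat"},
    {"url": "http://intranet/sites/sales/pitch.pptx", "author": "Lee"},
]

# All content except one specific site.
everything_but_hr = [
    ScopeRule(all_content, "include"),
    ScopeRule(url_starts_with("http://intranet/sites/hr"), "exclude"),
]

# Only a specific site, and only documents created by a specific person.
pats_sales_docs = [
    ScopeRule(url_starts_with("http://intranet/sites/sales"), "include"),
    ScopeRule(property_equals("author", "Pat"), "require"),
]

print(scope_urls(docs, everything_but_hr))
print(scope_urls(docs, pats_sales_docs))
```

The two rule lists mirror the two examples given in the Scopes discussion: a scope that covers everything except one site, and a scope limited to one site and one author.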

Reports

As an administrator, you may feel that you have a perfectly configured Search service application. Your content sources and scopes have been defined, your crawl schedules have been set, and life is good. However, your users may feel otherwise if they don’t seem to be getting the results they’re expecting. How would you know this? By checking out the available reports in this section, administrators can gain great insight into how Search is being used on the site, what users are searching for, and whether that content is actually being found.

Administration reports — This is actually a link to the Administrative Report Library, which can also be accessed from the Monitoring category in Central Administration. This library has several reports available out of the box relating to Search, which can be used to track the overall performance of Search and see how long it’s taking to crawl each content source, how fast it’s crawling each result type, and how long it takes for queries to return results. Clicking the report name generates a graph of information that you can use to check the performance of Search. Data is compiled over time, so you can select a date range to filter the data accordingly. Refer to Chapter 6 for more information about using the administrative reports.
Web Analytics reports — This link takes you to the real meat for finding out how users are using Search. Web Analytics was enabled automatically for you if you used the Farm Configuration Wizard during the initial setup process. On this page, you can see the total number of queries performed on the site, as well as the average number of queries per day. Other reports appear in the quick launch area, and give you a graphical representation of the number of queries over time (see Figure 14-24), as well as the Top Queries and how often those queries were performed. You can also see which queries gave your users zero results with the No Queries Results report.

Figure 14-24

In the Ribbon on the Web Analytics Reports page, you can click the Analyze tab to refine the reports, change the dates, and even export the report to an Excel spreadsheet. These reports can be particularly useful when tracking Search trends over time. You can use these to set up best bets on the various sites to help users find the content they are looking for, or pass the reports on to site collection administrators or content owners to help them refine the content of their sites appropriately to help users find what they’re after as quickly as possible.
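If you export a report and want to slice the data yourself, the same two questions the reports answer (what are the top queries, and which queries returned nothing) are easy to compute. The Python sketch below assumes a hypothetical list of `(query, result_count)` pairs. The data, the function name, and the layout are invented for illustration and do not represent the actual format of a Web Analytics export.

```python
from collections import Counter


def summarize_queries(log, top_n=3):
    """Return (top queries with counts, sorted list of zero-result queries).

    `log` is a list of (query_text, result_count) pairs; queries are
    normalized to lowercase so "Vacation Policy" and "vacation policy"
    count as the same query.
    """
    counts = Counter(query.strip().lower() for query, _ in log)
    zero_hits = {query.strip().lower() for query, hits in log if hits == 0}
    return counts.most_common(top_n), sorted(zero_hits)


# Hypothetical query log: (what the user typed, how many results came back).
query_log = [
    ("vacation policy", 12),
    ("Vacation Policy", 12),
    ("expense form", 4),
    ("vacation policy", 12),
    ("expense form", 4),
    ("holiday calender", 0),
    ("org chart", 0),
]

top_queries, no_result_queries = summarize_queries(query_log, top_n=2)
print("Top queries:", top_queries)
print("Queries with no results:", no_result_queries)
```

Queries that show up in the zero-result list, such as the misspelled "holiday calender" in this made-up data, are natural candidates for best bets.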

As you can see, there is a lot going on with the Search Administration page, and a lot of configuration options. Although all the various options can make setting up Search properly seem a little intimidating, we hope this overview has provided a strong base from which you can further explore this robust feature.

Other Search Features

A couple more Search features worth mentioning are Mobile Search and People Search:

Mobile Search — SharePoint 2010 has made some great strides forward with enhanced mobile support. One such advance is the addition of Mobile Search. From the mobile browser, the user can do a search and even choose a scope. Search results are displayed with a simplified interface, as shown in Figure 14-25. No graphics, no previews or suggestions, just what they came for — search results. If you want to see how this works from the comfort of your desktop PC, you can. From your Search site, click Site Settings. On the right side of the page, click the Mobile Site URL. Now have all the fun of a mobile experience without wearing out your thumbs.

Figure 14-25

People Search — People Search is covered in more detail in Chapter 17, but some highlights are worth mentioning here. Phonetic and nickname matching is very powerful. For example, search for the name fillups and you will get results for “Phillips.” Similarly, if you are looking for Jeff but cannot remember if it is Geoff or Jeff, no worries: Search for Jeff and get both. Looking for Bill will get you results for William as well. This is very powerful for enhancing discoverability.
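The idea behind phonetic and nickname matching can be illustrated with a toy example. The Python sketch below is not the algorithm SharePoint uses. The `phonetic_key` normalization rules and the `NICKNAMES` table are simplified assumptions, chosen only so that the examples from the text (fillups/Phillips, Jeff/Geoff, Bill/William) match.

```python
import re

# Hypothetical nickname table mapping a nickname to its formal name.
NICKNAMES = {"bill": "william", "will": "william"}


def phonetic_key(name: str) -> str:
    """Reduce a name to a crude sound-alike key (toy rules, English only)."""
    s = re.sub(r"[^a-z]", "", name.lower())
    s = s.replace("ph", "f")          # "ph" sounds like "f"
    s = re.sub(r"^ge", "je", s)       # leading soft "g", as in Geoff
    s = re.sub(r"(.)\1+", r"\1", s)   # collapse doubled letters
    if not s:
        return ""
    # Keep the first letter, then drop vowels from the rest.
    return s[0] + re.sub(r"[aeiouy]", "", s[1:])


def names_match(query: str, candidate: str) -> bool:
    """True if the names sound alike or resolve to the same formal name."""
    q = NICKNAMES.get(query.lower(), query.lower())
    c = NICKNAMES.get(candidate.lower(), candidate.lower())
    return phonetic_key(q) == phonetic_key(c)


def find_people(query: str, directory: list) -> list:
    return [person for person in directory if names_match(query, person)]


directory = ["Phillips", "Geoff", "Jeff", "William"]

print(find_people("fillups", directory))
print(find_people("Jeff", directory))
print(find_people("Bill", directory))
```

A production matcher needs far richer phonetic rules and a much larger nickname list, but the two-step shape (normalize nicknames, then compare sound-alike keys) conveys why a misspelled or informal name can still find the right person.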

Earlier we noted some “secret” improvements were made to relevancy. That goes for People Search as well. As a matter of fact, the people relevancy is so good in SharePoint 2010 that even when you use FAST Search for SharePoint, results for people searches come from SharePoint Search. FAST Search for SharePoint only indexes content — not people. When search queries include people results, content results from FAST and People results from SharePoint search are brought together in one unified set in the query object model on the query server, as shown in Figure 14-26.

Figure 14-26

When it comes to People Search, one of the most popular queries is searching for one’s own name. Search recognizes this type of query, returning the results in a special box, as shown in Figure 14-27. It indicates how many times people did a search that led to you and what keywords they were searching for when they found you. This insight can help you tune your My Site to make you easier to find (or harder if you are the shy type).

Figure 14-27

FAST Search

FAST Search for SharePoint Sites and FAST Search for Internet Sites take all of the goodness described in the previous section and give it a giant shot of adrenaline. SharePoint Search can view results in the browser using the OWAs? FAST Search can preview PowerPoint presentations within the actual search results. SharePoint Search has refiners for the first 50 documents? FAST does it for all documents in the results set. SharePoint Search can handle some 100 million items in the index? FAST is looking at closer to one billion items. You get the drift. Everything to the extreme.

This section offers a brief look at some of the key FAST differentiators. Because FAST was a bit late to the game for this version of SharePoint, documentation for it is still limited. As FAST matures, expect entire books dedicated to it.

Thumbnails

One of the first things you’ll notice when you do a FAST Search is the thumbnails that show up in the search results. This is useful if you are doing a search and can’t remember which document you are looking for just by seeing the title. A quick thumbnail of the documents helps to determine the right one.

Scrolling Preview

Clicking on the thumbnail of a PowerPoint document opens a scrolling preview of the slides, enabling you to determine whether it has the content you might be looking for. If you consider this on a larger scale, imagine if you were looking for information in a presentation but couldn’t remember exactly which slide deck had the specific slide you were looking for. You could do a search and then open each slide deck one by one, which might take several minutes; or you could quickly take a look at the slides directly from the FAST Search results page and find the content you want in seconds.
