Chapter 7. Getting Your Content to the User: Discovery, Indexing, and Search Results

Help Users Discover Your Content

Armed with a quick search box (Figure 7-1) and the Google Chrome browser, users with Google TV are one query away from visiting your site. As you craft your site for the 10-foot experience, it’s smart to make it search friendly so you can capitalize on traffic from the 40%+ increase[15] in Internet searches in recent years.

As we’ve already touched on in previous chapters, Google TV users view search results (Figure 7-2) that are a blend of:

  • TV listings

  • Videos indexed by Google Video Search

  • Web search results (web pages, books, images, etc.)

This chapter focuses on helping you optimize your site for search, which ultimately will help users to find your web app and its content. We’ll start by reviewing some basic information about how search engines work, then we’ll cover some of the strategies and practices you can use for your own web app.

Figure 7-1. The Quick Search Box (QSB) on Google TV combines results from TV programs, videos, and web pages
Figure 7-2. In addition to performing regular web searches, Google TV users can also view TV and video specific search results; shown here are the results for videos for Google TV at the Google I/O conference

How Search Engines Work

Crawling, Indexing, Search Results

In a general sense, search engines have three main processes:

  1. Crawling (retrieving a web page)

  2. Indexing (making sense of the content of the page)

  3. Search results (ordering and displaying results in a relevant manner for the user)

To make a site optimized for TV-based searches, you should employ best practices at every stage of the search engine pipeline. These practices are similar to those used for desktop sites, but it’s worth reiterating them so that your TV web app is as search friendly as possible. We won’t delve into the technical intricacies of search engine optimization (SEO), but you can learn more about this topic with Google’s SEO resources for beginners listed at:

http://goo.gl/D8NFd

Please remember that the information we’re providing is specific to Google Search, although many of our recommendations also apply to other popular search engines.

Tip

Googlebot is Google’s crawler, an automated process that fetches web content in compliance with the robots.txt specification. See “Controlling Crawling and Indexing,” hosted on http://code.google.com, for information on preventing your content from being crawled.

http://goo.gl/hg7u4
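
To get a feel for how robots.txt rules affect a crawler, you can test them locally before publishing. The following is a minimal sketch using Python’s standard urllib.robotparser module; the rules, the /private/ path, and the example.com URLs are hypothetical and exist only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: keep Googlebot out of /private/ only,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/videos/kittens.html",
        "http://www.example.com/private/drafts.html",
    ):
        print(url, can_fetch("Googlebot", url))
```

A check like this only tells you what a well-behaved crawler is permitted to fetch; it says nothing about whether the page will actually be indexed.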

Components of an Individual Search Result

Search results, whether for videos or web pages, have similar components (e.g., title and description). For reference, here’s some of the terminology we’ll use throughout the chapter.

Figure 7-3. Several components of a web search result

Site Architecture

Site architecture is the construction of your site, such as the directory structure and/or the internal linking schema.

Design a Logical Linking Structure

Here are some important considerations to keep in mind when designing an architecture helpful to both users and search engines:

  • Check that users are able to easily navigate from the home page to individual pages and back again

  • Verify that URLs are “shareable,” so that important pages can be linked to and passed from one TV user to another.

  • Avoid hiding your content from crawlers, such as making pages only accessible via a search box. Instead, internally link to content you want crawled and indexed by search engines.

  • Avoid placing restrictions on crawlers, such as requiring a login or cookie to view public content. Crawlers find content more easily through public links that are not blocked by forms or cookies.

To verify whether the crawler (Googlebot, in this case) detected your links, check out the Webmaster Tools “Internal links” feature for your verified site (Figure 7-4).

Figure 7-4. Google Webmaster Tools “Internal links” feature

You can learn more about internal links on Webmaster Tools at: http://goo.gl/oyi7S

Tip

If you’re using Ajax-based navigation, make sure your users can still share URLs and use the back/forward buttons. Google supports the Ajax Crawling Scheme, which helps Ajax sites be crawled and indexed more effectively: http://goo.gl/ceFQT

Use Descriptive Anchor Text

Anchor text, the clickable words in a link, is a signal to search engines and users about the content of the target URL. The more search engines understand about your pages, such as the content, title, and in-bound anchor text, the more relevant information can be returned to searchers. Descriptive anchor text avoids phrases like “click here”:

To view more cute kitten videos <a href="cute-kitten-videos.html">click here</a>.

And instead contains relevant keywords such as “cute kitten videos”:

Feel free to browse our <a href="cute-kitten-videos.html">cute kitten videos</a>.

URL Structure

URL structure is important because in Google search results, the URL of a document is displayed to the user below the document’s title and snippet. URLs that contain relevant keywords give searchers more information about the result, often resulting in higher click-through. Additionally, search engines can use keywords in the URL as a ranking signal.

Include Keywords in the URL, If Possible

It’s helpful for users to see their query terms reinforced in the search result. If the user queries [google webmaster blog], the keywords “google,” “webmaster,” and “blog” in the URL signal to the user that the result is relevant.

Here are helpful URLs:

Not as helpful:

Note that keywords in the URL that match the user’s query are highlighted in the search result (Figure 7-5). Keywords are more descriptive than cryptic numbers and letters, which can go unnoticed in results (Figure 7-6).

Figure 7-5. Query terms are highlighted in the URL—helpful to searchers
Figure 7-6. Cryptic filenames are less descriptive for searchers

Select the Right URL Structure for Your TV Site

When designing for TV, there are two general options for your URL structure:

  1. Keep URL structure and site architecture the same in your TV and desktop versions. For example:

    Desktop and TV users both browse http://www.example.com/article1

  2. Create new URLs for the TV version. This can be accomplished using relevant subfolders:

    Desktop users browse http://www.example.com/article1

    TV users browse http://www.example.com/tv/article1

    Or with subdomains:

    TV users browse http://tv.example.com/article1

Google recommends the second option. Having multiple URLs for one piece of content (e.g., one URL for desktop users, one URL for TV users) will not cause duplicate content issues if rel="canonical" is implemented (see Duplicate Content: Side Effects and Options for more on the canonical attribute).

Learn the Facts About Dynamic URLs

If your site uses dynamic URLs, Google provides a few pointers:

  • Use name/value pairs such as item=car&type=sedan

  • Be careful with URL rewriting—it’s not uncommon for a developer to incorrectly implement URL rewrites, causing crawling and indexing issues for search engines

  • Verify ownership of your site in Google Webmaster Tools and utilize the URL parameter handling feature to help Google crawl your site more efficiently (Figure 7-7).

Figure 7-7. For sites with dynamic URLs, Google Webmaster Tools’ “parameter handling” allows the developer to specify to Googlebot which parameters to ignore when crawling
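
The first pointer in the list above, using name/value pairs, is easy to follow if you let a URL library build and read the query string rather than concatenating strings by hand. Here is a minimal sketch with Python’s standard urllib.parse module; the /browse path and the helper names are assumptions made for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_item_url(base: str, **params: str) -> str:
    """Append params to base as name=value pairs in a stable (sorted) order."""
    query = urlencode(sorted(params.items()))
    return f"{base}?{query}"


def read_params(url: str) -> dict:
    """Return the name/value pairs found in the query string of url."""
    return dict(parse_qsl(urlsplit(url).query))


if __name__ == "__main__":
    url = build_item_url("http://www.example.com/browse", item="car", type="sedan")
    print(url)  # -> http://www.example.com/browse?item=car&type=sedan
    print(read_params(url))
```

Sorting the parameters is a design choice of this sketch: it keeps the same content from appearing under several URLs that differ only in parameter order.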

On-Page Optimizations

In addition to site architecture and URL structure, there are on-page optimizations which can improve your performance in search. For example, the first thing a user sees in search results is likely your page’s title and a snippet. In many cases, you have some control over what is displayed. The key things to consider are:

  • Are my page titles informative?

  • Are my descriptions informative and compelling for the user?

  • If I’m showing a video result, are the thumbnail and the information about the video as accurate as possible?

Create Unique Titles Reflective of the Page’s Content

The <title> tag is used as the first line of each search result. Using descriptive words and phrases in your page’s title tag helps both users and search engines better understand the focus of the page (Figure 7-8 and Figure 7-9).

Figure 7-8. “Untitled” isn’t a descriptive title
Figure 7-9. Descriptive titles help searchers

Include Unique Meta Descriptions for Each Page

Google often displays the description meta tag as the snippet of the search result. In other words, if it’s relevant to the query, the meta description you create can be visible to the user. Similar to the <title> tag, the description meta tag is placed within the <head> tag of your HTML document. Whereas a page’s title may be a few words or a phrase, a page’s meta description may include several sentences.

Each page should have a unique description reflective of the content. Avoid “keyword stuffing” the description (e.g. <meta name="description" content="best video brad pitt tom cruise george clooney cute kitten three wolves shirt" />).

Google Webmaster Tools provides an “HTML Suggestions” section that reports titles and meta descriptions that are too short, too long, or duplicated (Figure 7-10).

Note that Google does not use the keywords meta tag (<meta name="keywords">) as a ranking signal.

Figure 7-10. Webmaster Tools’ “HTML suggestions” feature provides information on pages with sub-optimal titles and meta descriptions

Duplicate Content: Side Effects and Options

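It’s likely that to properly serve users on different devices, you’ve created multiple URLs containing the same content. For example, URLs such as these (reusing the desktop and TV examples from earlier in the chapter) may point to pages with the same (or extremely similar) main content but with a slightly different display or interaction:

    http://www.example.com/article1

    http://www.example.com/tv/article1

    http://tv.example.com/article1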

In common search engine optimization (SEO) lingo, the same content available on different URLs is known as “duplicate content,” an undesirable scenario. Although search engines already attempt to address duplicate content issues on their own, if you’d like to be more proactive, here are some steps to limit or reduce duplicate content:

  1. Choose a version from the duplicate URLs as the canonical. This is likely the cleanest, most user-friendly version.

  2. Be consistent with the canonical URL. Internal links should use this version, not any of the duplicates. Also, sitemaps submitted should only contain the canonical and exclude the duplicates.

  3. On the duplicate URL, you may wish to include rel="canonical", listing the URL you’d prefer to appear in search results (i.e., the canonical).

    More information on duplicate content and rel="canonical" can be found at:

    http://goo.gl/kvfsz

Tip

Google recommends that you do not use robots.txt to disallow crawling of the duplicate version of your content. If crawling is disallowed, Google cannot obtain a copy of the document, and the rel="canonical" hint will remain undetected.
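
As an illustration of step 3, the duplicate (TV) page can emit a rel="canonical" link that points at the preferred URL. The sketch below assumes the /tv subfolder layout used in this chapter’s examples and that the desktop URL is the canonical; the function names are hypothetical, and a site using a tv. subdomain would need a different mapping.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

TV_PREFIX = "/tv"  # assumption: TV pages live under this subfolder


def canonical_url(tv_url: str) -> str:
    """Map a TV URL such as .../tv/article1 to its desktop (canonical) URL."""
    parts = urlsplit(tv_url)
    path = parts.path
    if path == TV_PREFIX or path.startswith(TV_PREFIX + "/"):
        path = path[len(TV_PREFIX):] or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def canonical_link_tag(tv_url: str) -> str:
    """Return the <link> element to place in the <head> of the TV page."""
    href = escape(canonical_url(tv_url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


if __name__ == "__main__":
    print(canonical_link_tag("http://www.example.com/tv/article1"))
```

The generated element belongs in the <head> of the duplicate page, alongside the title and description meta tag.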

Serving the Right Version to Your Users

Regardless of whether their device is a TV, desktop, or mobile phone, you want every visitor to your site to have the best possible experience. For instance, when a Google TV user clicks this URL in search results:

http://www.example.com/article1

(which is both the canonical version and the desktop version), instead of serving this desktop URL, serve the appropriate TV-based app at:

http://www.example.com/tv/article1

or

http://tv.example.com/article1

As discussed in Chapter 4, the User-Agent string can be used to detect whether your visitor comes from a Chrome browser on Google TV.
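
On the server side, the decision described above might look roughly like the following sketch. It assumes that the Google TV browser identifies itself with a "GoogleTV" token in its User-Agent string and that TV pages live under a /tv subfolder; both are assumptions for illustration, and the sample User-Agent strings in the usage lines are illustrative rather than authoritative.

```python
from urllib.parse import urlsplit, urlunsplit

GOOGLE_TV_TOKEN = "GoogleTV"  # assumption: token present in Google TV User-Agents
TV_PREFIX = "/tv"             # assumption: TV pages live under this subfolder


def is_google_tv(user_agent: str) -> bool:
    """Return True if the User-Agent string appears to come from Google TV."""
    return GOOGLE_TV_TOKEN in (user_agent or "")


def url_to_serve(requested_url: str, user_agent: str) -> str:
    """Return the TV URL for Google TV visitors, else the requested URL."""
    if not is_google_tv(user_agent):
        return requested_url
    parts = urlsplit(requested_url)
    if parts.path == TV_PREFIX or parts.path.startswith(TV_PREFIX + "/"):
        return requested_url  # already a TV page
    return urlunsplit(
        (parts.scheme, parts.netloc, TV_PREFIX + parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    tv_ua = "Mozilla/5.0 (X11; U; Linux i686; en-US) Chrome/5.0 Large Screen GoogleTV/1"
    desktop_ua = "Mozilla/5.0 (Windows NT 6.1) Chrome/10.0"
    print(url_to_serve("http://www.example.com/article1", tv_ua))
    print(url_to_serve("http://www.example.com/article1", desktop_ua))
```

Whether you redirect to the TV URL or render the TV app in place is up to you; this sketch only computes which URL fits the visitor.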

Working with Video: King of Content for TV

Much of this chapter has presented you with a number of ideas and approaches for producing and managing your content to maximize your site for search. Video content is one of the most popular rich media formats in the world, and every day, millions of people around the world access cool and engaging videos from a variety of sources. But with all of the content that’s out there, how can you make sure that your videos are discovered by users? The first step in helping your viewers find that content is to have the content indexed.

Feeds

Crawling rich media content, such as videos, is difficult. You can complement this crawling process, ensuring that Google knows about all of your rich media content, by using a sitemap or media RSS (mRSS) feed. A Google Video Sitemap or mRSS feed enables you to provide descriptive information about your video content that can be indexed by Google’s search engine. This metadata, such as a video’s title, description, and duration, may be used in search results, thereby making it easier for users to find particular content.

Tip

Media RSS, or mRSS, is an extension to RSS that is used to syndicate various types of multimedia, including audio, video, and images.

The Google Video Sitemap is an extension of the sitemap protocol. This protocol enables you to publish and syndicate online video content (and its relevant metadata) in order to make it searchable in a content-specific index known as the Google Video index. When Google’s indexing servers become aware of a video sitemap, usually through submission via Webmaster Tools, the sitemap is used to crawl your website and identify your videos.

Tip

A Google Video Sitemap is simply a link to each video landing page, along with some additional information, such as title, description, thumbnail, and duration, which can be displayed in the search results.

Before we dive into some key elements of the video feed, it’s important to note that the video feed needs to be optimized for the search engine pipeline of crawling, indexing, and ranking. By including all of the video content on a site in a Google Video Sitemap, and subsequently submitting the video sitemap via Webmaster Tools, we can speed up the crawling process. As we add the necessary metadata for each of the videos, we’re providing information for ranking, and writing content that will tell the user about the video in the results page.

Here is an example entry of a Google Video Sitemap for a page that includes video:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <!-- Landing (play) page where users watch the video -->
    <loc>http://www.example.com/videos/some_video_landing_page.html</loc>
    <video:video>
      <!-- Metadata that may be shown in search results -->
      <video:thumbnail_loc>http://www.example.com/thumbs/123.jpg</video:thumbnail_loc>
      <video:title>Grilling steaks for summer</video:title>
      <video:description>Alkis shows you how to get perfectly done steaks every time</video:description>
      <!-- Location of the video file and/or its player -->
      <video:content_loc>http://www.example.com/video123.flv</video:content_loc>
      <video:player_loc allow_embed="yes" autoplay="ap=1">http://www.example.com/videoplayer.swf?video=123</video:player_loc>
      <!-- Optional tags -->
      <video:duration>600</video:duration>
      <video:expiration_date>2009-11-05T19:20:30+08:00</video:expiration_date>
      <video:rating>4.2</video:rating>
      <video:view_count>12345</video:view_count>
      <video:publication_date>2007-11-05T19:20:30+08:00</video:publication_date>
      <video:tag>steak</video:tag>
      <video:tag>meat</video:tag>
      <video:tag>summer</video:tag>
      <video:category>Grilling</video:category>
      <video:family_friendly>yes</video:family_friendly>
      <video:restriction relationship="allow">IE GB US CA</video:restriction>
      <video:gallery_loc title="Cooking Videos">http://cooking.example.com</video:gallery_loc>
      <video:price currency="EUR">1.99</video:price>
      <video:requires_subscription>yes</video:requires_subscription>
      <video:uploader info="http://www.example.com/users/userA">userA</video:uploader>
    </video:video>
  </url>
</urlset>

Required Tags

There are a number of key elements that should be included for each video:

Each element is listed below with its description and whether it is included in search results.

<loc>

The play page URL where users can watch the video. Included in the search results.

<video:thumbnail_loc>

URL pointing to a thumbnail image file that represents your video in search results. Most image sizes and types are accepted, but it is recommended that your thumbnails be at least 160 × 120 pixels in .jpg, .png, or .gif format. May be included in the search results.

<video:title>

Contains the title of the video and is limited to 100 characters. May be included in the search results.

<video:description>

Contains the description of the video and is limited to 2048 characters (longer descriptions will be truncated). May be included in the search results.

<video:content_loc>

The URL should point to a .mpg, .mpeg, .mp4, .m4v, .mov, .wmv, .asf, .avi, .ra, .ram, .rm, .flv, or other video file format, and can be omitted if <video:player_loc> is specified. However, because Google needs to be able to check that the Flash object is actually a player for video (as opposed to some other use of Flash, e.g., games and animations), it’s helpful to provide both. NOT included in the search results.

<video:player_loc>

A URL pointing to a Flash player for a specific video. In general, this is the information in the src attribute of an <embed> tag and should not be the same as the content of the <loc> tag. Since each video is uniquely identified by its content URL (the location of the actual video file) or, if a content URL is not present, a player URL (a URL pointing to a player for the video), you must include either the <video:player_loc> or the <video:content_loc> tag. If these tags are omitted and we can’t find this information, we’ll be unable to index your video.

The optional attribute allow_embed specifies whether Google can embed the video in search results. Allowed values are yes or no.

The optional attribute autoplay has a user-defined string (in the previous example, ap=1) that Google may append (if appropriate) to the flashvars parameter to enable autoplay of the video. For example: <embed src="http://www.example.com/videoplayer.swf?video=123" autoplay="ap=1"/>.

Examples:

  • YouTube: http://www.youtube.com/v/v65Ud3VqChY

  • Dailymotion: http://www.dailymotion.com/swf/x1o2g

NOT included in the search results.
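
If your video catalog lives in a database, you will probably generate the sitemap rather than write it by hand. The following minimal sketch builds the elements described above with Python’s standard xml.etree.ElementTree module and enforces the rule that each video needs a content location or a player location. The dictionary keys and the function name are assumptions made for this example, and no other validation (such as title length) is attempted.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"

# Serialize with the same prefixes as the hand-written example.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("video", VIDEO_NS)

VIDEO_TAGS = ("thumbnail_loc", "title", "description", "content_loc", "player_loc")


def build_video_sitemap(videos: list) -> str:
    """Return sitemap XML for a list of dicts keyed by 'loc' and VIDEO_TAGS."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for item in videos:
        # A video is identified by its content URL or, failing that, its player URL.
        if not (item.get("content_loc") or item.get("player_loc")):
            raise ValueError("each video needs content_loc or player_loc")
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = item["loc"]
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        for tag in VIDEO_TAGS:
            if item.get(tag):
                ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = item[tag]
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_video_sitemap([{
        "loc": "http://www.example.com/videos/some_video_landing_page.html",
        "thumbnail_loc": "http://www.example.com/thumbs/123.jpg",
        "title": "Grilling steaks for summer",
        "description": "Alkis shows you how to get perfectly done steaks every time",
        "content_loc": "http://www.example.com/video123.flv",
    }]))
```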

Tip

Help ensure that only Googlebot accesses your content by using a reverse DNS lookup (http://www.google.com/support/webmasters/bin/answer.py?answer=80553).
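
The reverse DNS check mentioned in this tip can be sketched as follows: look up the hostname for the requesting IP address, confirm that it falls under googlebot.com or google.com, then resolve that hostname forward and confirm it maps back to the same IP. The function names are hypothetical, and the lookup functions are passed in as parameters only so that the logic can be exercised without network access.

```python
import socket

GOOGLE_DOMAINS = (".googlebot.com", ".google.com")


def is_google_hostname(hostname: str) -> bool:
    """Return True if hostname is a subdomain of googlebot.com or google.com."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_DOMAINS)


def is_googlebot(ip, reverse_lookup=socket.gethostbyaddr,
                 forward_lookup=socket.gethostbyname_ex) -> bool:
    """Verify an IP with a reverse DNS lookup followed by a forward lookup."""
    try:
        hostname = reverse_lookup(ip)[0]          # IP -> hostname
        if not is_google_hostname(hostname):
            return False
        return ip in forward_lookup(hostname)[2]  # hostname -> list of IPs
    except OSError:
        # Covers failed DNS lookups (socket.herror and socket.gaierror).
        return False


if __name__ == "__main__":
    # Requires network access; the address is only an example value.
    print(is_googlebot("66.249.66.1"))
```

Checking the User-Agent header alone is not enough for this purpose, because any client can claim to be Googlebot.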

When you submit a Google Video Sitemap via Webmaster Tools, you start the search engine pipeline. Each video item is parsed and identified, and the corresponding content page is fetched. After each page is fetched, a validation process takes place to ensure that the data in the Google Video Sitemap feed matches that on the play page and that the page contains a video. If validation is successful, the feed data may either be inserted into the index (if this is a new page) or an existing page may be updated.

Warning

While the video sitemap may make it easier for the Googlebot to find content that it would not otherwise discover, it does not guarantee that all videos included in the sitemap will appear in the search results.

Optional Tags

In addition to optimizing the search engine pipeline process and the user experience on your site, you should pay attention to the user experience on a search engine results page. Remember that time when you did a search to find that long lost video about a dog surfing? Then when you got the search results back, you clicked on a link, only to find a “Sorry this video is not available” message.

As the content provider and submitter of the video feed, you can prevent some of that poor user experience by using some optional tags in Google Video Sitemaps.

<video:expiration_date>

If you have content that expires, you can submit this tag with the date after which the video will no longer be available, in W3C format. Acceptable values are complete date (YYYY-MM-DD) and complete date plus hours, minutes and seconds, and timezone (YYYY-MM-DDThh:mm:ss+TZD). For example, 2007-07-16T19:20:30+08:00.

It is recommended that you not supply this information if your video does not expire.

<video:publication_date>

A complementary tag that can help with Google Video Sitemap management. For example, if you publish your video sitemap periodically, and content will not be available until some time between your sitemap updates, you can use this tag to tell Google to index the video but not show it in search results until after this date.

Acceptable values are complete date (YYYY-MM-DD) and complete date plus hours, minutes and seconds, and timezone (YYYY-MM-DDThh:mm:ss+TZD). For example, 2007-07-16T19:20:30+08:00.
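
Both date tags accept the same two forms, and Python’s standard datetime module can produce them directly. This is a minimal sketch; the helper names are assumptions for this example, and the sample value is the one used in the text above.

```python
from datetime import datetime, timedelta, timezone


def w3c_date(dt: datetime) -> str:
    """Format as a complete date: YYYY-MM-DD."""
    return dt.date().isoformat()


def w3c_datetime(dt: datetime) -> str:
    """Format as complete date plus time and timezone offset."""
    if dt.tzinfo is None:
        raise ValueError("dt must carry a timezone offset")
    return dt.replace(microsecond=0).isoformat()


if __name__ == "__main__":
    when = datetime(2007, 7, 16, 19, 20, 30, tzinfo=timezone(timedelta(hours=8)))
    print(w3c_date(when))      # -> 2007-07-16
    print(w3c_datetime(when))  # -> 2007-07-16T19:20:30+08:00
```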

<video:duration>

Contains the duration of the video in seconds. This will be presented in the search results, and can be used by the user to filter results by video length.

Value must be between 0 and 28800 (8 hours). Only digit characters are allowed.
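
A quick check of these two constraints before writing the tag might look like the sketch below; the function name is hypothetical.

```python
MAX_DURATION_SECONDS = 28800  # 8 hours, the upper bound stated above


def is_valid_duration(value: str) -> bool:
    """Return True if value contains only digits and is within 0..28800."""
    return (
        value.isascii()
        and value.isdigit()
        and 0 <= int(value) <= MAX_DURATION_SECONDS
    )


if __name__ == "__main__":
    for candidate in ("600", "28800", "28801", "10.5", "-5", ""):
        print(repr(candidate), is_valid_duration(candidate))
```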

In addition to limiting access to content temporally, access may be restricted based on geographic location. For example, video content produced in the UK may only be viewable to users in the UK. Thus, you would not want someone in Japan to see the page in her search results. This can be managed using the <video:restriction> tag.

<video:restriction>

A list of countries where the video may or may not be played, in space-delimited ISO 3166 format. The required attribute "relationship" specifies whether the video is restricted or permitted for the specified countries. Allowed values are allow or deny. Only one <video:restriction> tag can appear for each video. If there is no <video:restriction> tag, it is assumed that the video can be played in all territories.
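
To make the allow/deny semantics concrete, here is a small sketch of how the tag’s value could be interpreted for a given viewer. The function name and arguments are assumptions for this example; it is not a description of how Google evaluates the tag internally.

```python
def is_playable(country: str, relationship=None, countries: str = "") -> bool:
    """Decide whether a video may be played in country.

    relationship is the tag's attribute ("allow" or "deny"), or None when
    the video has no <video:restriction> tag; countries is the tag's
    space-delimited list of ISO 3166 country codes.
    """
    if relationship is None:
        return True  # no restriction tag: playable in all territories
    codes = {code.upper() for code in countries.split()}
    if relationship == "allow":
        return country.upper() in codes
    if relationship == "deny":
        return country.upper() not in codes
    raise ValueError("relationship must be 'allow' or 'deny'")


if __name__ == "__main__":
    # Values taken from the sitemap example earlier in the chapter.
    print(is_playable("GB", "allow", "IE GB US CA"))
    print(is_playable("JP", "allow", "IE GB US CA"))
```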

Other Feeds/Options

Video Sitemap or mRSS

If you’re further wondering about the benefits of specific feeds (video sitemaps versus mRSS), we can provide some clarification. First, you should note that you can use either. Neither format gets priority or precedence over the other. However, one benefit of video sitemaps is that the format can quickly be extended to allow for more specifications, as Google maintains the format.

Tip

If you’re going to start from scratch, a video sitemap is the recommended approach.

Facebook Share and RDFa

You can expand the metadata about content on your pages with the use of markup tags in the body of the web page. Google recognizes two video markup formats: Facebook Share and Yahoo! SearchMonkey RDFa. Using either (or both) of these formats to mark up video directly in your HTML enables the Googlebot to better understand and present video content. Be sure that this additional markup appears in the HTML without the execution of JavaScript or Flash as otherwise the information will not be discoverable.

TV Show Tags

If your site contains episodic content (like television shows), you can provide Google additional information about those videos to further enhance the user’s search experience. Figure 7-11 demonstrates this experience and the results for the first season of the TV show “House.”

Figure 7-11. Screenshot example with episodic content

You can see that Google presents all of the House Season 1 episodes that have been indexed. In the navigation pane on the left, you can easily select other seasons and look at those results as well.

Adding the TV show metadata is accomplished by adding a <video:tvshow> tag along with the relevant child tags, as demonstrated below:

<video:video>
  <video:title>The Super Show, Season 1, Episode 3</video:title>
  <!-- ... other required root-level video tags ... -->
  <video:tvshow>
    <video:show_title>The Super Show</video:show_title>
    <video:video_type>full</video:video_type>
    <video:episode_title>The Best Show Ever</video:episode_title>
    <video:season_number>1</video:season_number>
    <video:episode_number>3</video:episode_number>
  </video:tvshow>
</video:video>

Full details for the TV show tags (for both Google Video Sitemaps and Bing mRSS feed) can be found at http://goo.gl/Gi2nW.

What’s Next?

Now that you have a full, end-to-end understanding of the platform and the technical skills needed to build web apps for Google TV, it’s time for you to put your knowledge into action. Think about the types of experiences that you want to deliver on the big screen and start building them.

We’ve touched on the fundamental techniques and skills needed to build web apps, but we’re counting on you to innovate and deliver compelling experiences that truly transform TV. Once you’ve built your web app, spread the word about it and help users discover what you created. Aside from ensuring that your site is easily searched and indexed, be sure to let other developers know about your web app in the Google TV web forum (http://goo.gl/pR9UB) and submit your app to the Google TV web app gallery (http://gtv-gallery.appspot.com/).

Onward and upward!
