17
Data Import and Measurement Protocol

Thus far in the book, the discussions about Google Analytics (GA) data capture have centered around the GA JavaScript tracking code, analytics.js. Whether placed on the page directly or deployed through Google Tag Manager as we have demonstrated and recommended, tracking code execution tends to be our primary mental paradigm for GA data capture. In Chapter 13, we also worked with the Android and iOS SDKs to record mobile app data into GA.

In this chapter, we look at the two options beyond analytics.js and the SDKs for getting data into GA: data import (through the GA user interface or through the API), and the Measurement Protocol (MP), which allows any programmed, networked environment to push hits into GA in the form of basic HTTP requests.

We’ll use data import in most cases to add dimensions to analytics.js and mobile SDK hits; using the Measurement Protocol, we’ll record new hits.

Data Import

The data import feature in GA may seem somewhat complicated at first, but it’s actually quite straightforward and very flexible. In each case, you’re generating a .csv (comma-separated values) data file and importing built-in or custom dimensions against a common key in GA.

Several data import scenarios are outlined next.

Importing CRM Data into Google Analytics

In Chapter 15, “Integrating Google Analytics with CRM Data,” we recorded GA campaign information directly in our CRM (Salesforce) so we could correlate our medium, source, and campaign values with the lead status that our sales team assigned in the CRM. We also recorded a common visitor ID in both GA and our CRM to enable additional joins between the two data sets within the CRM, within a separate environment, or within GA.

We can take advantage of the GA data import feature to accomplish the join within GA. (Since the imported data would need to apply retroactively, this scenario would require query-time import available in Analytics 360. The content, product, and geo imports that appear later in the chapter would be suitable as processing-time or query-time import and would therefore also be applicable to GA Standard.) In this example, we’ll take the following steps to import lead status from the CRM for further correlation with GA data. Note that you need Edit access at the property level to perform the following procedure:

  1. Create a Lead Status custom dimension in GA.

    The data that we’re pulling in from Salesforce is the qualification status that our sales team assigned to each lead. Since there is not currently a slot for this data in GA, we’ll need to create a custom dimension as in Figure 17.1 to populate as the objective of the data import.

    Screenshot shows Add Custom Dimension field in which lead status is entered as the name, user is chosen as the scope and checkbox for active is marked. Create and Cancel buttons are shown on bottom.

    Figure 17.1 Creating a custom dimension to receive the lead status value during data import.

    Let’s say that we had previously created five other custom dimensions in the GA property and that this is the sixth. When we’re importing the CRM data into GA, we’ll add ga:dimension6 as a heading to the lead status column to match it to the lead status custom dimension.

  2. Create the data set schema in GA.

    Before we can import our CRM data into GA, we need to create a data set to receive the CRM data and define a schema to map the data import. In most cases, you’ll designate a single key field that will serve as the join between the data sets and one or more target fields to populate in GA from the imported data.

    1. In the Property Admin, click Data Import.
    2. For Import Behavior, select Query Time.
    3. For Data Set Type, select User, and click Next Step.
    4. On the following screen, name the data set Lead Status, and select one or more views to import the customer relationship management (CRM) data into. (You’re advised to import first into a test view, verify import in the test view, and then import into one or more working views.)
    5. In the next step, designate the common field/dimension to be used as the join in the import and, under Imported Data, one or more target dimensions to copy the new data into, as shown in Figures 17.2 and 17.3.

      Screenshot shows Data set schema in which Visitor ID is chosen as the Key and Lead Status is chosen as the Imported data. No refinement and regex refinement options are also shown in Key section.

      Figure 17.2 For the data set schema, you’ll designate a key, which normally consists of a single dimension, such as the visitor ID that you created as a custom dimension, as well as one or more target fields to import against the key.

      Screenshot shows Data set schema in which the Key is specified as User ID and Lead Status is chosen as the Imported data. No refinement and regex refinement options are also shown in Key section.

      Figure 17.3 This schema is similar to Figure 17.2, but as the key, we’re using the User ID dimension that you can populate for cross-device tracking but that can also serve to join your CRM and GA data.

    6. For Overwrite Hit Data, you can select No, since there will be no lead status associated with any of the visitor IDs at this point. For future imports in which Lead Status may have changed for some visitor IDs, you can select Yes for Overwrite Hit Data.
    7. Click Get Schema to confirm the heading names that we’ll need to add to our exported CRM data in step 3. In the lead status example, the schema headings will be either ga:dimension4 and ga:dimension6, or ga:userId and ga:dimension6, depending on where you stored your visitor ID from the CRM (and assuming that you created the Lead Status custom dimension in the sixth custom dimension slot as in the example). If you previously stored the CRM visitor ID as both the User ID (for cross-device) and your Visitor ID custom dimension for CRM integration, you can use either as the key in your schema.

  3. Export CRM data.

    In Salesforce, as an example, we can create a report that contains one of two field sets, following the schema in either Figure 17.2 or 17.3.

    • Visitor ID and lead status. Again, if the common key that we stored in both GA and Salesforce was a client-generated ID (such as the client ID value of the GA _ga cookie), you need to export the visitor ID custom field that you populated into the Salesforce lead through a hidden field on the memory chip lead form.
    • Lead ID/Customer ID and lead status. If, instead of passing a client-generated visitor ID to your CRM when a lead or purchase form was initially submitted, you retrieved the lead or customer ID from your new CRM record and recorded it in GA as the visitor ID custom dimension or the cross-device user ID, export this lead/customer ID and lead status.

      When you export your CRM report as comma-separated values (CSV), you’ll generate a file similar to Listing 17.1. For our example, we’ll call this file sf-export-lead-status.csv.

      Note that you must add the heading row yourself. This schema example corresponds to Figure 17.2.

  4. Upload the .csv file.
    1. In the property admin column, click Data Import.
    2. For the Lead Status data set listed in Figure 17.4 that we defined previously, click Manage uploads.
    3. Click Upload file, select sf-export-lead-status.csv, and click Upload.

Screenshot shows a row containing name as lead status, corresponding type as user data and action as manage uploads. NEW DATA SET button and a search field are shown on the top.

Figure 17.4 You import the CRM data into the data set defined in step 2.

Once the import is completed, you’ll see a confirmation, as in Figure 17.5.

Screenshot shows a row containing upload date, filename, status of completion and checkbox for Delete. Back to Data Import, Upload file and Refresh options are shown on the top.

Figure 17.5 Data import confirmation for the lead status data set.
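
The export in step 3 still needs the heading row that matches the schema before it can be uploaded. The following is a minimal sketch of that preparation, assuming the CRM report was exported with two columns (visitor ID and lead status) and that the schema headings are ga:dimension4 and ga:dimension6 as in Figure 17.2. The function name and the sample rows are hypothetical; confirm the actual headings with Get Schema.

```python
import csv
import io


def add_schema_header(crm_rows, key_header="ga:dimension4", target_header="ga:dimension6"):
    """Return CSV text: the GA schema heading row followed by the CRM rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([key_header, target_header])
    for visitor_id, lead_status in crm_rows:
        # A row without a key cannot be joined to any GA user, so skip it.
        if not visitor_id:
            continue
        writer.writerow([visitor_id, lead_status])
    return buffer.getvalue()


# Hypothetical export rows: (visitor ID, lead status)
sample_rows = [
    ("1234567890.1467300000", "Qualified"),
    ("", "Unqualified"),
    ("9876543210.1467400000", "Disqualified"),
]
csv_text = add_schema_header(sample_rows)
```

Writing csv_text to sf-export-lead-status.csv would produce a file ready for the upload in step 4. If your key is the User ID as in Figure 17.3, pass key_header="ga:userId" instead.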

Uploading through the Management API

Note that you can also import data through the GA Management API instead of through the GA user interface. If you upload data to GA on an ad hoc or infrequent basis, the interface method works well. If the upload is frequent or a hands-off approach is desired, the automation that the Management API makes possible becomes much more attractive.
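
As a rough sketch of what an automated upload might look like, the function below wraps the Management API v3 uploadData call. It assumes that you have already built an authorized client with the google-api-python-client library and that you know the ID of the data set (the API refers to it as customDataSourceId). The function name and all IDs in the comments are placeholders.

```python
def upload_data_set_file(service, account_id, property_id, data_set_id, media_body):
    """Upload one import file to an existing GA data set and return the API response.

    Assumptions (not taken from this chapter):
      - service is a Management API v3 client, for example
        googleapiclient.discovery.build("analytics", "v3", credentials=...)
      - media_body is a googleapiclient.http.MediaFileUpload for the .csv file,
        for example MediaFileUpload("sf-export-lead-status.csv",
                                    mimetype="application/octet-stream",
                                    resumable=False)
    """
    request = service.management().uploads().uploadData(
        accountId=account_id,            # placeholder such as "123456"
        webPropertyId=property_id,       # placeholder such as "UA-123456-1"
        customDataSourceId=data_set_id,  # ID of the data set defined in GA
        media_body=media_body,
    )
    return request.execute()
```

Keeping the client and the file object as parameters makes the function easy to schedule from a nightly job and easy to exercise with a stand-in client before you point it at a working view.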

Using Imported Data in Google Analytics Reports

In many cases, such as the lead status example outlined in the previous section, the new data is imported into one or more custom dimensions. While custom dimensions do not appear by default in any of the built-in GA reports, you can apply custom dimensions as secondary dimensions in the built-in reports and use custom dimensions to define a custom segment or custom report as discussed in Chapter 10, “Segments,” and Chapter 11, “Dashboards, Custom Reports, and Intelligence Alerts.”

We could apply our imported Lead Status data as a custom dimension in the Campaigns report as displayed in Figure 17.6. This report displays some of the same data as Table 15.4, but here the merge between CRM and GA data has occurred within GA and not the CRM or a separate environment.

Screenshot shows lead status as the secondary dimension and default as sort type. It shows a table listing lead status and number of sessions of eight campaign sources.

Figure 17.6 Lead status applied as a secondary dimension in the Campaigns report.

Importing Content Data into Google Analytics

It was in Chapter 12, “Implementation Customizations,” that we first discussed the classic examples of content dimensions that GA can’t record without help from you: author and category on blog or article pages. GA has no way to capture this data by default, so we populated the data layer with author and category variables from your content management system (CMS) since they did not already appear on the page, and we recorded author and category as custom dimensions with the pageview hit.

If, for any reason, your developers were not in a position to write that back-end data to the data layer so you could record it as a custom dimension with each hit, you could, instead, perform a data import for author and category against the page.

As mentioned above, GA Standard supports only processing-time imports other than Ecommerce refunds and cost data, so it would be beneficial to import this data as early as possible if you’re using GA Standard. (With Analytics 360, you can take advantage of query-time import to add data to hits that have already been processed.)

The procedure for importing content data is similar to importing user data as seen in the previous CRM example, but instead of matching on a visitor ID custom dimension or user ID, we’ll match our content on the Page dimension, or a portion of it.

To perform the author and category import, take the following steps:

  1. Create two new custom dimensions, one for author and one for category, with the scope set to Hit as shown in Figure 17.7.
  2. Analyze the CMS key value relative to the GA Page dimension.

    Before we create a data set and associated schema in GA, let’s consider the CMS export, especially the key, and how the key is represented in the page URL.

    Your CMS is essentially a database that injects content into the page. Each page on your website exists as a record in your CMS. When a Web visitor requests a page on your website, the CMS (1) reads the page key from the URL, (2) pulls the corresponding fields from the CMS into a page template, and (3) sends the resulting HTML for the page to the requesting browser.

    As a simple example, let’s say that three URLs in the news section of your meteorology website appear in the following format:

    /news/article.php?articleId=3293
    /news/article.php?articleId=4588
    /news/article.php?articleId=5214

    The corresponding row in your CMS might be structured as in Table 17.1, with the articleId parameter in the URL serving as the unique key in the CMS.

  3. Define the data set and associated schema in GA.
    1. In the Property column of the Admin screen, click Data Import.
    2. For Data set type, select Content Data.
    3. Select one or more views to import the data into.
    4. For Key, select Page. In many cases, the key in the imported data set will correspond only to a portion of the Page value rather than the full Page value, as detailed in Table 17.2.

    5. Under Imported Data, set the newly defined Author and Category custom dimensions as the import targets. In this case, Author and Category were the fourth and fifth custom dimensions that you defined in the property, so you’ll specify ga:dimension4 and ga:dimension5 in the header row of the .csv import file.
    6. Keep the overwrite option set to No unless you’re performing an additional import and any previous author or category values may have changed in the interim.

      Since the key values previously listed in Table 17.1 correspond only with the value of the articleId query parameter in the Page dimension, we can specify articleId as the query refinement. This refinement applies only when your CMS key appears as the value in a name=value pair. The key in the data set configured in Figure 17.8 uses a query refinement on articleId.

      If, instead, your CMS records were identified with a text key such as typhoon, and that text was incorporated into the Page value but not in the name=value format, you could use a regex refinement to isolate the key within the URL pattern. In the regex refinement example in Table 17.2, /news/ identifies the static portion of the Page value, and ([^/]+) represents the dynamic portion of the Page value that corresponds to the CMS key.

      You could also use a regex refinement to match a single CMS key to multiple page variations, such as /news/article.php?articleId=5214 and /news/article.php?articleId=5214&sessionId=12374, but, ideally, you should instead make every effort to consolidate your Page values by using the Exclude URL Query Parameters view setting as discussed in Chapter 9, “View Settings, View Filters, and Access Rights.”

  4. Export the article ID, author, and category columns from your CMS, and add a header row to match the schema.
  5. Upload the .csv file shown in Listing 17.2 as in the lead status example above.

This data import populates author and category just as if we were recording custom dimensions with each pageview on the meteorology website, and we can now create a custom report to display performance by author and category.
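
To make the two key refinements concrete, the following sketch mimics how a key might be extracted from a Page value and matched to the CMS rows in Table 17.1. It is an illustration only, not GA’s internal matching logic; the function names are hypothetical, and it assumes that a regex refinement uses the first capture group as the key.

```python
import re
from urllib.parse import urlsplit, parse_qs


def query_refinement(page, parameter):
    """Return the value of one query parameter in a Page value, or None."""
    values = parse_qs(urlsplit(page).query).get(parameter)
    return values[0] if values else None


def regex_refinement(page, pattern):
    """Return the first capture group of pattern found in the Page value, or None."""
    match = re.search(pattern, page)
    return match.group(1) if match else None


# Author and category from Table 17.1, keyed by article ID.
cms_rows = {
    "3293": ("Andrew Cullen", "Forecasts"),
    "4588": ("Stacy Hamida", "Hydrology"),
    "5214": ("Andrew Cullen", "Cyclones"),
}


def author_and_category(page):
    """Look up the imported values for a Page value by its articleId."""
    return cms_rows.get(query_refinement(page, "articleId"))
```

With these helpers, both /news/article.php?articleId=5214 and /news/article.php?articleId=5214&sessionId=12374 resolve to the same CMS key, which is why the query refinement tolerates extra parameters that a full Page match would not.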

Figure 17.7 When you create the Author and Category custom dimensions, specify the scope as Hit.

Table 17.1 Sample Content Management System (CMS) Records

Article ID | Title | Meta Description | Main Content | Author | Category
3293 | Winter Outlook 2017 | Long-range forecast for winter 2017 | It appears that winter in the northern hemisphere … | Andrew Cullen | Forecasts
4588 | Worldwide Water Update | Comprehensive water study | As we analyze hydrological data from around the world … | Stacy Hamida | Hydrology
5214 | Typhoon Watch | Latest tracking for Pacific cyclones | This typhoon season in the Pacific is proving to be very active … | Andrew Cullen | Cyclones

Table 17.2 Key Refinements for Page Dimension

Page dimension in GA | Key in CMS | Refinement
/news/article.php?articleId=5214 | 5214 | query refinement: articleId
/news/typhoon | typhoon | regex refinement: /news/([^/]+)

Importing Campaign Data into Google Analytics

In Chapter 7, “Acquisition Reports,” we discussed the extreme importance of using the utm_medium, utm_source, and utm_campaign URL parameters to more accurately record our traffic sources and to populate the campaigns report with accurate, structured campaign data as in Listing 17.3.

There is, however, another option for populating campaign parameters into GA: you can pass a single utm_id parameter in the URL, as shown in Listing 17.4, and then import the campaign parameters (and, optionally, custom dimensions) against this ID as the key.

Why might you use the simplified campaign parameter format in Listing 17.4? For one thing, it can be a bit compromising to explicitly display campaign parameters for your website visitors to plainly see. In most instances, direct indications of marketing descriptions can only be a distraction from the user and brand experience that you aim to provide. In other cases, you might be dealing with advertising platforms that allow a single campaign ID only.
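
The difference between the two tagging formats is easy to see when the URLs are generated side by side. The sketch below assumes a hypothetical landing page on www.example.com and borrows the parameter values from the first row of Table 17.3; the helper name is also an assumption.

```python
from urllib.parse import urlencode


def tag_url(base_url, **params):
    """Append campaign parameters to a landing page URL that has no query string yet."""
    return f"{base_url}?{urlencode(params)}"


base = "https://www.example.com/spring-sale"  # hypothetical landing page

# Full format: medium, source, and campaign are visible in the URL.
full_url = tag_url(
    base,
    utm_medium="email",
    utm_source="asia-list",
    utm_campaign="20160701-newsletter",
)

# Simplified format: only the ID is visible; the rest is imported against it.
short_url = tag_url(base, utm_id="198")
```

The simplified URL carries no readable marketing description, so the medium, source, and campaign values must reach GA through the import instead.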

To import campaign data:

  1. Create a new data set and select Campaign Data as the Data set type.
  2. Define the schema as in Figure 17.9. Note that we have created a Campaign Group custom dimension, which you can use (in a custom report or segment, or as a secondary dimension) to distinguish between product campaigns and informational resource campaigns. Also, as mentioned in Chapter 7, campaign name is not actually mandatory for campaign tracking, but it’s always recommended as best practice.

  3. Populate a .csv file as in Table 17.3.

  4. Import the .csv file as in the lead status and author/category examples earlier in the chapter.

Table 17.3 Representation of the Key, Campaign Parameters, and a Custom Dimension to Match the Schema in Figure 17.9

ga:campaignCode | ga:medium | ga:source | ga:campaign | ga:dimension8
198 | email | asia-list | 20160701-newsletter | info-resource
199 | qa-code | catalog | 20160705-spring-discount-code-qa | product
200 | email | europe-list | 20160708-newsletter | info-resource
201 | social | facebook | 20160709-spring-discount-code-fb | product
202 | social | linkedin | 20160710-spring-discount-code-li | product

Screenshot shows Data set schema in which page is chosen as the Key, query refinement button is clicked, article ID is specified as the query parameter and author and category are chosen as the Imported data.

Figure 17.8 In the schema for the CMS data import, a query refinement is applied to match only the articleId value in the Page dimension.

Screenshot shows Data set schema in which campaign code is chosen as the Key, medium, source, campaign and campaign group are chosen as the Imported data.

Figure 17.9 With utm_id populated as the ga:campaignCode key dimension, you can import campaign medium, source, and name, and also custom dimensions such as campaign group.
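
The campaign import is again a join on a key. As an illustration only (GA performs this match internally), the sketch below reads two rows in the layout of Table 17.3 and resolves the utm_id found in a landing page URL to its imported campaign values. The function names and the example.com URL are hypothetical.

```python
import csv
import io
from urllib.parse import urlsplit, parse_qs

# Two rows from Table 17.3 in .csv form, with the schema headings from Figure 17.9.
IMPORT_CSV = """ga:campaignCode,ga:medium,ga:source,ga:campaign,ga:dimension8
198,email,asia-list,20160701-newsletter,info-resource
201,social,facebook,20160709-spring-discount-code-fb,product
"""


def load_campaign_rows(csv_text):
    """Index the import rows by the ga:campaignCode key."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["ga:campaignCode"]: row for row in reader}


def campaign_for_landing_page(url, rows):
    """Return the imported row whose key equals the URL's utm_id, or None."""
    codes = parse_qs(urlsplit(url).query).get("utm_id")
    return rows.get(codes[0]) if codes else None


rows = load_campaign_rows(IMPORT_CSV)
match = campaign_for_landing_page("https://www.example.com/spring-sale?utm_id=201", rows)
```

A utm_id with no matching row returns None here, which mirrors the practical point that any campaign ID you distribute needs a corresponding row in the import file.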

Importing Cost Data into Google Analytics

By linking your Google AdWords and GA and enabling AdWords auto-tagging as described in Chapter 14, “Google Analytics Integrations—The Power of Together,” you can view AdWords cost data within GA. While GA does not offer direct integrations with other paid advertising, you can manually import cost data from other platforms such as Bing or Facebook.

Importing cost data is fairly similar to the data imports we reviewed previously in this chapter, but a few special considerations are detailed in the following procedure.

  1. Create a new data set and select Cost Data as the Data set type.
  2. Define the schema as in Figure 17.10.
    • Note the key is automatically defined as the combination of medium and source dimensions. In most cases, however, campaign will serve as a more specific de facto key as described in the figure caption.
    • You must import at least one of the following three metrics: impressions, clicks, or cost. In many cases, such as Bing Ads, it makes sense to import all three. You can use this schema, however, even if the data you’re importing does not contain all three values.
  3. Export a .csv file from the ad platform as shown for Bing in Figure 17.11.
    • Although the date does not appear in the schema, you’ll need to make sure that ga:date appears as the first column in the .csv before import. Each date must appear in YYYYMMDD format.
    • You can manually add the ga:medium and ga:source columns to the export, with respective values cpc and bing for each row.
    • The Search Term column in the Bing export corresponds to the Matched Search Query dimension in GA.
  4. Import the .csv file as in the previous examples.

Screenshot shows Data set details, Data set schema in which source and medium are chosen as the key and campaign, keyword and matched search query are chosen as the Imported data.

Figure 17.10 Schema for cost data import. Campaign is among the nonrequired import values, but for many cost data imports, it will serve as the de facto key.

Screenshot shows unit of time as day and format as CSV in general settings and available columns and selected columns in Attributes tab of Choose your columns section.

Figure 17.11 Campaign cost data export from Bing Ads.
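
The adjustments described in step 3 (date format, ga:date as the first column, and the added medium and source columns) can be scripted. The sketch below is one possible approach under stated assumptions: the export column names (Date, Campaign, Keyword, Search term, Impressions, Clicks, Spend), the MM/DD/YYYY input date format, and the sample row are all hypothetical, and the GA headings should be confirmed with Get Schema for your own data set.

```python
import csv
import io
from datetime import datetime

# Hypothetical export column -> GA heading (confirm the headings with Get Schema).
COLUMN_MAP = {
    "Campaign": "ga:campaign",
    "Keyword": "ga:keyword",
    "Search term": "ga:adMatchedQuery",
    "Impressions": "ga:impressions",
    "Clicks": "ga:adClicks",
    "Spend": "ga:adCost",
}


def to_cost_import(export_text, medium="cpc", source="bing", date_format="%m/%d/%Y"):
    """Convert an ad platform export into CSV text laid out for cost data import."""
    reader = csv.DictReader(io.StringIO(export_text))
    headers = ["ga:date", "ga:medium", "ga:source"] + list(COLUMN_MAP.values())
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        converted = {
            # GA expects each date as YYYYMMDD.
            "ga:date": datetime.strptime(row["Date"], date_format).strftime("%Y%m%d"),
            "ga:medium": medium,
            "ga:source": source,
        }
        for export_name, ga_name in COLUMN_MAP.items():
            converted[ga_name] = row[export_name]
        writer.writerow(converted)
    return out.getvalue()


# Hypothetical one-row export for illustration.
sample_export = (
    "Date,Campaign,Keyword,Search term,Impressions,Clicks,Spend\n"
    "07/01/2016,dining-room,oak table,solid oak dining table,120,8,6.40\n"
)
cost_csv = to_cost_import(sample_export)
```

Because campaign acts as the de facto key, it is worth checking that the campaign values written here exactly match the utm_campaign values already recorded in GA before you upload the file.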

Comparing Campaign Cost and Performance

Once you import campaign cost data, you’ll be able to access the Campaigns > Cost Analysis report, as shown in Figure 17.12.

Screenshot shows a table listing source/medium, number of sessions, cost and cost per conversions of dining room, carpet liquidation, upholstered chairs, no-stain guarantee and bamboo coffee table.

Figure 17.12 Cost Analysis report for imported paid campaign cost data.

Notice, however, two types of data that the Cost Analysis report does not include:

  • Postclick performance data, such as bounces and conversions, does not appear.
  • While AdWords campaign data does also appear in the general All Campaigns report, AdWords cost data does not appear in the Cost Analysis report.

To incorporate cost and performance data for AdWords and non-AdWords campaigns into a single report, we can easily configure a custom report as shown in Figure 17.13.

Screenshot shows title as Integrated Campaign Performance in General information section, name, type, metric groups and dimension drilldowns in Report content section and an optional filter.

Figure 17.13 This custom report configuration integrates performance and cost data for AdWords and non-AdWords campaigns.

Importing Product Data into Google Analytics

Product data import is quite straightforward and allows you to add or overwrite many Ecommerce and Enhanced Ecommerce dimensions, as well as the price metric, using product SKU as the key.

Product SKU or name is, in fact, the only product detail that you are required to provide when you’re coding your actual Enhanced Ecommerce interactions on your website; you could potentially provide SKU only at transaction time and match to imported Ecommerce dimensions.

In addition to the dedicated Ecommerce dimensions, you can import custom dimensions—such as size, color, warranty level, or any other product or service descriptor—per SKU, as the custom report in Figure 17.14 illustrates. Keep in mind that you could also populate any of these built-in or custom dimensions while the user is interacting with your Web pages or mobile app; the product data import is a different option for associating this extra data with each SKU.

Screenshot shows a five-row table listing product, color, transactions and product revenue.

Figure 17.14 This custom report breaks down performance by the built-in Product dimension and the imported Color dimension.

Importing Geo Data into Google Analytics

In addition to the geographic hierarchy GA offers by default, you can create your own geographic divisions by importing against city, region (which usually corresponds to province or state), country, or subcontinent.

For instance, you could create a custom dimension named US Regional and import a value such as Northeast, Southeast, and so on against the built-in Region ID dimension value for each U.S. state and then report performance by these imported dimension values, as in Figure 17.15.

Screenshot shows a table listing number of sessions, bounce rate and lead submit-goal 1 completions of mountain, west coast, Midwest, southeast and northeast US regions.

Figure 17.15 Custom report showing goal conversion by imported US Regional custom dimension.

Measurement Protocol

When Google Universal Analytics was first released and began to be adopted, we as GA users observed that most of GA’s functionality remained the same. There were a few additional admin settings, and the syntax for native tracking had changed to a simpler format, but not much else had changed in terms of day-to-day reporting.

Several of the new capabilities are what make Universal universal: cross-device tracking, custom metrics and additional custom dimensions, and, perhaps more than any other feature, the MP. The MP is arguably the most universal part of Universal Analytics in that it allows you to send Hypertext Transfer Protocol (HTTP) requests from any programmed, networked device or environment to record data into GA.

Measurement Protocol is a much more specialized usage of GA, but it’s important to know that this option exists. Scenarios for MP usage include mobile apps designed for Windows, Blackberry, or other mobile operating systems for which GA SDKs are not available. In the following sections, Hazem Mahsoub provides key insights on MP, and Matt Stannard walks us through two innovative and outcomes-focused examples of the MP in action.
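
To give a sense of how basic these HTTP requests are, the sketch below builds the payload for a single pageview hit and defines a helper that would post it to the MP collection endpoint. The tracking ID, client ID, and page path are placeholders, and the function names are assumptions for illustration; the parameter names (v, tid, cid, t, dp, dt) are those defined by the Measurement Protocol for Universal Analytics.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

MP_ENDPOINT = "https://www.google-analytics.com/collect"


def build_pageview_hit(tracking_id, client_id, page, title=None):
    """Return the URL-encoded payload for one pageview hit."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # GA property ID
        "cid": client_id,    # anonymous client ID
        "t": "pageview",     # hit type
        "dp": page,          # document path
    }
    if title:
        params["dt"] = title  # document title
    return urlencode(params)


def send_hit(payload):
    """POST the payload to the collection endpoint and return the HTTP status."""
    request = Request(MP_ENDPOINT, data=payload.encode("utf-8"), method="POST")
    with urlopen(request) as response:
        return response.status


# Placeholder IDs; nothing is sent until send_hit(payload) is called.
payload = build_pageview_hit(
    "UA-XXXXX-Y",
    "35009a79-1a05-49d7-b876-2b884d0f825b",
    "/kiosk/home",
)
```

Separating payload construction from sending keeps the hit easy to inspect and log before any request leaves the device, which is useful in environments such as kiosks where connectivity may be intermittent.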


Key Takeaways

Data import is based on keys and targets. You import based on a common key in GA and the uploaded .csv, and populate dimensions and metrics in GA based on the key match.

Import custom dimensions (and metrics). In addition to importing built-in dimensions (such as product category), you can import custom dimensions (such as product size or article author). The import of custom dimensions was formerly referred to as dimension widening, since the process adds dimensions to user, hit, and product data that has already been recorded.

Data import retroactivity. If you’re using Analytics 360, you have the option of query-time import for retroactivity. In GA Standard, data import is performed at processing time, that is, on a go-forward basis only. Cost-data import does apply retroactively, even in GA Standard.

Key refinements for content imports. The Page dimension usually serves as the key for content imports, but in many cases, the key value in the .csv file matches only a portion of the Page dimension value. In this case, you can apply a regex or query refinement to extract only a part of the Page value as a match against the key value in the import file.

Import campaign parameters based on campaign ID. If you prefer not to display the utm_medium, utm_source, and utm_campaign parameters directly in your URLs, or if you’re working with an advertising platform that allows only a single parameter, you could add a single, discreet utm_id parameter, which will populate the ga:campaignCode dimension that you can use as a key for importing medium, source, and campaign.

Import cost data from advertising channels. You can import cost data from other advertising channels such as Bing or Facebook and compare metrics such as cost per conversion or Ecommerce transaction for each of these channels alongside AdWords, whose cost data flows into GA through account linking and autotagging without any manual importing.

While the combination of medium and source values serves as the key in the schema for cost data import, the campaign value acts as the de facto key in many cases, so you must ensure that any campaign value that you’re importing from the .csv exactly matches a campaign value that you’ve already populated in GA.

Measurement Protocol. You can use the Measurement Protocol to send data through HTTP requests from an environment where the GA tracking code or SDKs cannot run, such as a Windows or Blackberry app or a kiosk.

Actions and Exercises

  1. Review the data import scenarios in this chapter. Make a plan for any of the imports that would help your analysis. Because it’s immediately relevant for many people working with GA, cost data import may be the first import to perform.
  2. Review the custom dimensions that you identified in Chapter 12, “Implementation Customizations.” Some of the custom dimensions that you might need (such as author and category) may be easier for you to populate into GA through an import rather than going through your development team to populate data layer variables that you read into a GA tag in Google Tag Manager. If the import would, in fact, offer a simpler and faster option, even in just the near term, plan accordingly.