
7. Building the Model and Reporting

Roger Cornejo, Durham, NC, USA

In the preceding chapters, I introduced the individual steps of the DOPA process. It is finally time to put it all together, build the model, and report the results. As I have repeated throughout the book, the DOPA process is dynamic: you essentially create a new, unique predictive model with each execution of the code, altering the model inputs as you refine your tuning efforts. It is also versatile, because the data can be subset in ways that simplify the analysis and clearly surface the metrics that point to the cause of the performance problem.

In this chapter, I’ll discuss the model-building process, including the variables used as input, how to choose their values, and the output views.

Bringing Together the DOPA Process Components

In prior chapters, we detailed each of the steps/components of the DOPA process:
  • Selecting the metrics data sources

  • Normalizing the data: unpivoting to form key-value pair structure

  • Removing outliers for normal range calculations

  • Establishing normal ranges for metric values

  • Integrating metrics with taxonomies

  • Flagging metrics that are outside of the normal ranges

The DOPA process I’ve developed uses a single SQL statement to accomplish these steps, the result being a distillation of the set of all metrics within the retention period down to only the metrics with unusual values (mostly high values) for a given problem interval. This chapter will detail the various methods used for this distillation, which is essentially a process of subsetting the data. Before discussing the methods, however, I want to explain the subsetting itself in more detail.

Figure 7-1 provides a graphical representation of the subsetting that occurs as part of the DOPA process.
Figure 7-1

A graphical representation of the subset refinement within the DOPA process

The large outer box represents the set of all metrics available within the retention period. Since I’m currently only instrumenting AWR metrics, it is more specifically the set of all AWR metrics within the retention period.

From that all-encompassing starting point, a subset of metrics is normalized and included in the DOPA process. (The details of this data preparation step are covered in Chapter 3—Data Preparation. The taxonomies are built from the normalized metrics as well and are covered in detail in Chapter 6—Taxonomy.)

When I am ready to build a model and report, I further subset the normalized metrics as follows.

First, I subset on the number of days back. The reason for subsetting the normalized metrics this way is that in some cases there is a month or more of AWR metrics retained, and I usually don’t need to look at all of it.

A further subsetting involves choosing a date range from which to calculate the normal ranges; this is a smaller subset that focuses in on a period that is “nonproblematic.” (The details regarding normal ranges and outlier removal are in Chapter 4—Statistical Analysis, and the thought process that guides your selection of a normal range data set is covered in a subsequent section of this chapter [see section “Variables for Building the Model”].) Outliers excluded from the normal calculation are shown as a smaller box inside this set.

The problem interval represents the other subset of the chosen date range. The DOPA process analysis relies on a comparison of the problem interval to the norms established using the “nonproblematic” interval. Metrics found to be outside of normal range based on this comparison will be “flagged” and are represented as a smaller subset within this interval. The metric taxonomy is merged into the analysis by this point as well.

When reporting the results of the model, I choose to view the flagged metrics using one of two views: either the Category Count View, which yields a high-level picture, or the Metrics Aggregate View, which gives a metric-by-metric detailed view of the model.

For drill-down analysis purposes, I may also use a third view, the Metrics Time-Series View (usually for a single metric), to see how a particular metric has behaved over time. This view will allow me to see all the data for a particular metric, not only the flagged values, for a specified time range. I often graph this in Excel so that I can visually see the metric trend over time.

The three views are detailed in subsequent sections of this chapter.

Some readers may find it easier to understand the DOPA process as a data flow, so I developed Figure 7-2 to depict it that way.
Figure 7-2

The DOPA process depicted as a data flow

The data flow figure can be understood as follows:
  • Metrics are normalized from the set of metrics to be instrumented.

  • These normalized metrics are unioned together for analysis.

  • The taxonomies are built from the normalized metrics prior to running the DOPA process.

  • From the unioned normalized metrics, outliers are identified and normal ranges are calculated.

  • The main DOPA feature selection process builds the model by merging in the unioned metrics, the normal ranges, and the taxonomy, then does some basic flagging, and exposes the results of the model in one of the three views: Category Count View, Metrics Aggregate View, and Metrics Time-Series View.
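As a concrete illustration of the normalization and union steps, here is a minimal sketch showing how two AWR sources might be mapped into a common key-value shape and unioned together. The AWR views and columns are real, but the mapping is illustrative and not the exact code used in the DOPA scripts.

-- Minimal sketch: normalize two AWR sources into a common key-value shape
-- and union them for downstream analysis (illustrative, not the DOPA scripts).
SELECT 'dba_hist_sysmetric_summary' AS metric_source,
       s.snap_id,
       s.begin_time,
       s.metric_name,
       s.average                    AS metric_value
  FROM dba_hist_sysmetric_summary s
UNION ALL
SELECT 'dba_hist_osstat'            AS metric_source,
       o.snap_id,
       CAST(sn.begin_interval_time AS DATE) AS begin_time,
       o.stat_name                  AS metric_name,
       o.value                      AS metric_value   -- cumulative counter; real code
                                                       -- would take deltas between snapshots
  FROM dba_hist_osstat o
  JOIN dba_hist_snapshot sn
    ON sn.snap_id         = o.snap_id
   AND sn.dbid            = o.dbid
   AND sn.instance_number = o.instance_number;

Each additional metric source becomes another UNION ALL branch mapped into the same five columns.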

Now that you have seen an overview of the DOPA process as both a series of refined subsets of metrics and as a data flow, I will discuss the SQL code I developed and then how to actually begin building metrics-based models of specific Oracle performance problems.

Implementing the DOPA Process: The SQL Code

There are a few relevant SQL statements involved with the DOPA process that I developed. This work is offered as a professional courtesy, “as-is,” and without warranty. I disclaim any liability for damages that may result from using this code. I developed this code mostly under Oracle versions 11g and 12c; it will not work as-is in Oracle version 10g, since it relies on AWR content added in the more recent versions. It is a simple task to make the modifications needed to run it under 10g [DBA_HIST_IOSTAT_FUNCTION does not exist in 10g and thus needs to be removed from the instrumentation]. Further, I use this code from Toad, which understands the substitution variables prefixed with a “:”. Since I’ve not run this from other environments, I don’t know how it will behave there, although I suspect it will run as-is in, say, SQL Developer or other similar tools.

Precondition to Running the DOPA Process

When I encounter a database where I’ve not yet used the DOPA process, I first create the taxonomy table and populate it with data using the create table statement in the file:
  • AWR - flag - all metrics - taxonomy.sql

The result of running this SQL statement is a populated table called METRIC_TAXONOMY.
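I do not reproduce that script here, but the following hedged sketch shows the general shape such a table might take. The column names, lengths, and the sample row are my own assumptions for illustration; the authoritative definition is the script itself and Chapter 6.

-- Illustrative shape only; the real table is created and populated by
-- "AWR - flag - all metrics - taxonomy.sql" (see Chapter 6).
CREATE TABLE metric_taxonomy
( taxonomy_type  VARCHAR2(30)    -- e.g., 'Infrastructure' or 'Oracle'
, category       VARCHAR2(64)    -- high-level category, e.g., 'IO', 'CPU', 'REDO'
, sub_category   VARCHAR2(64)    -- finer grouping within the category
, metric_source  VARCHAR2(64)    -- AWR view the metric comes from
, metric_name    VARCHAR2(128)   -- metric name as it appears in that view
);

-- Hypothetical example row:
INSERT INTO metric_taxonomy VALUES
('Infrastructure', 'IO', 'IO Latency',
 'dba_hist_sysmetric_summary', 'Average Synchronous Single-Block Read Latency');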

Running the DOPA Process

When I’m ready to build a model and run the dynamic analysis, I use one or more of the following SQL statements based on which view is of interest at the time. The SQL code in these files is essentially the same except for the particular “view” they select from [the “view” is actually a named subquery in the WITH clause part of the SQL statement].
  • AWR - flag - all metrics – Category Count View.sql

  • AWR - flag - all metrics – Metrics Aggregate View.sql

  • AWR - flag - all metrics – Metrics Time-Series View.sql
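Structurally, each of these scripts is a single SELECT whose WITH clause defines the named subqueries. The skeleton below is a deliberately simplified sketch of that shape: the subquery and column names are my own placeholders, only one metric source is shown, the join to METRIC_TAXONOMY is omitted, and the normal-range rule (mean plus or minus two standard deviations) stands in for the statistics described in Chapter 4.

-- Simplified skeleton of the query shape; not the actual DOPA scripts.
WITH unioned_metrics AS (
       SELECT begin_time, metric_name, average AS metric_value
         FROM dba_hist_sysmetric_summary
     ),
     normal_ranges AS (
       SELECT metric_name,
              AVG(metric_value) - 2 * STDDEV(metric_value) AS lower_bound,
              AVG(metric_value) + 2 * STDDEV(metric_value) AS upper_bound
         FROM unioned_metrics
        WHERE begin_time BETWEEN :normal_start AND :normal_end
        GROUP BY metric_name
     ),
     flagged AS (
       SELECT m.begin_time, m.metric_name, m.metric_value,
              CASE WHEN m.metric_value NOT BETWEEN n.lower_bound AND n.upper_bound
                   THEN 1 ELSE 0 END AS flag
         FROM unioned_metrics m
         JOIN normal_ranges  n ON n.metric_name = m.metric_name
        WHERE m.begin_time BETWEEN :problem_start AND :problem_end
     ),
     metrics_aggregate_view AS (
       SELECT metric_name,
              SUM(flag)         AS flag_count,
              AVG(metric_value) AS avg_value
         FROM flagged
        GROUP BY metric_name
     )
SELECT *
  FROM metrics_aggregate_view   -- the three scripts differ mainly in which
 ORDER BY flag_count DESC;      -- named subquery ("view") this final SELECT reads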

The output from these views is described in detail in the following section.

Documenting Model Parameters

If I care to document the parameters I’ve used for a particular model, I run the following SQL statement:
  • AWR - flag - all metrics - Model Parameters.sql

Running this SQL produces a listing of the parameter values used for the model run.
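The parameter-documentation script itself is not reproduced in this chapter; a minimal stand-in that simply echoes the substitution variables for a run might look like the following sketch. The variable names are my own placeholders, not necessarily those used in the actual script.

-- Hypothetical stand-in: echo the inputs used for this model run so they can
-- be saved alongside the results. Variable names are illustrative only.
SELECT :days_back          AS days_back
     , :problem_start_date AS problem_start_date
     , :problem_end_date   AS problem_end_date
     , :normal_start_date  AS normal_start_date
     , :normal_end_date    AS normal_end_date
     , :taxonomy_type      AS taxonomy_type
     , :metric_source_like AS metric_source_filter
     , :metric_name_like   AS metric_name_filter
     , :iqr_factor         AS iqr_factor
     , :flagged_only       AS flagged_only
     , :flag_ratio_cutoff  AS flag_ratio_cutoff
  FROM dual;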

Variables for Building the Model

As I have emphasized, the DOPA process is a dynamic process. It involves running the code against a set of persisted time-based metrics and refining the set of all metrics within the retention period down to the metrics which have unusual values and are likely to be key influencers impacting the performance issue.

By altering variables to meet the needs of that run, a unique model is built. Each run of the code builds a unique model and yields a unique output in the form of a view. These views are easily interpreted and useful for predicting the cause of performance issues. By iterating through this model-building process, a tuner is able to gain a good picture of how the database is performing and where performance issues are occurring.

The variables/inputs chosen will determine the output; therefore, choosing them carefully will yield the best results. In the next section, I describe the set of variables I have included in my code. When writing your own code, you may choose to use more or fewer variables as per your needs. Later in the chapter, I will discuss in greater detail the thought process behind building the model and how and why you might want to alter these variable inputs to maximize the predictive value of the model-building process.

Here is a list of the input parameters that can be modified:

Date ranges impacting the model : As described in the DOPA process subsetting in the preceding text, there are three sets of date ranges of interest:
  1. The total time period that you want to examine

  2. The time period during which the problem manifested

  3. The time period from which normal ranges will be calculated
These date ranges are essential to the model-building process as they define the set of time-series metrics which will be included in the various parts of the DOPA process. The following descriptions provide the essential details:
  1. The total time period that you want to examine (a.k.a. Date Range of Metric Data to Examine): This will be the beginning and end date/time of the interval from which you pull the metric values to examine. Since Oracle persists AWR metrics for a set retention period (typically 1–4 weeks and even up to 120 days in environments I’ve worked in) and you may not need to examine all the data in that period, this should be subset to a reasonable number of days. I currently specify this as a number of days back and typically set it to about one week of historic data. You can just as easily choose a few hours or several days’ worth of metric collection.

  2. The time period during which the problem manifested (a.k.a. Date Range for Problem Interval): This will be the beginning and end date/time of the interval during which the problem was reported to occur. It will necessarily be within the Date Range of Metric Data to Examine.

  3. The time period from which to calculate the normal ranges (a.k.a. Date Range for Establishing Normals): This will be the beginning and end date/time of the interval that will provide the metric values for establishing the metric normal ranges. This must also be a subset of the larger Date Range of Metric Data to Examine. It usually makes most sense for the Date Range for Establishing Normals to be outside of the Date Range for Problem Interval, since we want it to represent nonproblematic operation of the database.
Taxonomies of interest : I have two taxonomies currently implemented (infrastructure and Oracle), so I have made this a variable in order to easily use one or the other. If you were to implement other taxonomies, they would be selectable here as well. To leverage the most out of the taxonomies, I allow choosing a taxonomy via subsetting on the following taxonomy columns [described fully in Chapter 6—Taxonomy]:
  • Taxonomy type: the name of the taxonomy

  • Category: the high-level category within that taxonomy

  • Subcategory: the subcategory within the higher-level category

Remember that in order to make use of the taxonomy for a given database, the taxonomy will need to be built prior to the run (refer to Chapter 6 for instructions on how to do this).
  • Metric sources : This variable allows you to choose the metric sources from which you will pull data. You can choose any or all of the metric sources that have been implemented.

  • Metric name : By setting this as a variable, I can subset on a single metric name or set of metric names using the wildcard character, “%”. This is useful for deep-dive analysis once you have narrowed in on the problem area.

  • Outlier sensitivity settings : For identifying outliers, my default is to treat as an outlier any value more than 1.5 times the interquartile range (IQR) below Q1 or above Q3. I have coded the “IQR factor” as a variable, thus allowing me to tweak the outlier removal sensitivity if/when there is a need to do so.

  • Flagged values : The DOPA process can be used to report all metric values, not just the flagged metric values. To accomplish this in my code, I indicate whether I want to include only the flagged metric values in my report (Y) or all the metric values (N).

  • Flag ratio : This is a calculated number representing how many standard deviations from normal the metric value is (i.e., how far “out of whack” the metric value is). I implemented this as a variable so that I can modify the flag ratio to suit my needs. Using this variable, I can subset data based on the flag ratio I choose. The flag ratio is explained in detail in Chapter 4 on statistics.
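To make these inputs concrete, here is a hedged sketch of how they might appear as predicates in the reporting query. The bind-variable names and the flagged_metrics subquery are my own placeholders; the actual scripts organize the filters differently, and the Date Range for Establishing Normals is applied earlier, when the normal ranges are computed.

-- Illustrative predicates only; names are placeholders, not the DOPA scripts.
SELECT f.*
  FROM flagged_metrics f                                           -- hypothetical result of the flagging step
 WHERE f.begin_time    >= TRUNC(SYSDATE) - :days_back              -- total period to examine
   AND f.begin_time BETWEEN :problem_start AND :problem_end        -- problem interval
   AND f.taxonomy_type  = NVL(:taxonomy_type, f.taxonomy_type)     -- optional taxonomy filters
   AND f.category       = NVL(:category, f.category)
   AND f.metric_source LIKE NVL(:metric_source, '%')               -- metric source(s)
   AND f.metric_name   LIKE NVL(:metric_name, '%')                 -- single metric or wildcard
   AND (:flagged_only = 'N' OR f.flag = 1)                         -- flagged values only?
   AND (:flagged_only = 'N' OR f.flag_ratio >= :flag_ratio_cutoff) -- minimum flag ratio
 ORDER BY f.flag_ratio DESC NULLS LAST;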

Reporting

Running the code once you have set your variables will produce a predictive model in tabular format. You will choose which “view” to display [I use the term “view” in a generic sense—not meaning an Oracle object known as a view]. The DOPA process output can be presented in one of the following three views:
  1. Category Count View

  2. Metrics Aggregate View

  3. Metrics Time-Series View

Each of the views is described below. Following the description of each view, there is a table which displays a complete list of the column names in that view along with a short description of each. You may choose to include more or fewer columns in the views you develop for this process. Because I’m using Toad to format the output, the columns included in the view can be moved around and/or removed, and this is useful for analysis too. It allows me to see more or less information based on the need without having to change the code.

The Category Count View displays the results according to taxonomic groupings. This is a helpful way to view the data to gain a bird’s-eye view of what is going on in the database and whether there is one area of trouble or several. Figure 7-3 describes the columns of this view.
Figure 7-3

Column names and descriptions for Report: Metrics Category Count View

The Metrics Aggregate View displays the number of flags for each metric, number of intervals flagged, average flag value, flag ratio, and average all. I use this view to provide an overall picture of what is going on with respect to unusual performance metrics during the problem interval. It is concise and easy to interpret because the metric information has been consolidated—there is only one line in the table for each metric. Figure 7-4 describes the columns of this view.
Figure 7-4

Column names and descriptions for Report: Metrics Aggregate View

The Metrics Time-Series View displays metric data for each time instance whether or not it was flagged. This view is very useful for deep-dive analysis once a problem area has been identified. It gives the data for every instance during the interval. Figure 7-5 describes the columns of this view.
Figure 7-5

Column names and descriptions for Report: Metrics Time-Series View

As I said before, you might decide to add other views of interest, but the abovementioned views are the ones I have coded in the DOPA process for my own use. At the end of this chapter, you will see examples of these views and how they are used for the analysis.

General Instructions for Building and Tweaking the Model

Since building a model is running the DOPA process for a set of input conditions, I am going to discuss the thought pattern I use for selecting the subset conditions to address various problems. Later in this chapter, I show an example of how the model is used to determine next steps based on the results of the first model. Chapter 8 presents a good number of real problems, along with descriptions of how the DOPA process was used to solve them. These case studies provide a more detailed picture of the DOPA process in action.

Upon learning of a performance problem on a particular database, I collect as much relevant information as I can from the user. You can refer back to Chapter 2 on gathering problem information for more specifics on the information I try to obtain. One important thing to consider is that it can be very helpful for the normal range calculations to get a date/time range not only for the problem interval but also for when the performance was normal or acceptable. Having gathered as much relevant information as possible, I set the variables for building the model. The decision-making process for which inputs are selected is discussed in the next section.

How to Choose Inputs/Set Variables

  • Date Range of Metric Data to Examine: The date range used for collecting metric data will depend to some extent on how well-defined the problem interval is. The objective is to choose a date range that will encompass not only the problem but also a period of operation during which the problem did not exist. This date range will include, but be larger than, the next two intervals you will set. I often use a date range going as far back as I have history, or approximately a week if more than that is available, so that I can get a good handle on what was “normal” before the problem presented itself. I don’t usually go further back than ten days because that gives me more data than I need, although there may be times when going further back is desirable (see the case studies in Chapter 8 for an example). Another reason for subsetting the normalized metrics by the number of days back is that in some cases there is a month or more of AWR metrics and I usually don’t need to look at all of them.

  • Date Range for Problem Interval: I will choose the date/time interval where the performance problem occurred only as wide as necessary in order to capture the problem. I don’t want to cast my net too far and introduce unnecessary data “noise” into my analysis. Typically, I will use the date/time range provided by the client reporting the performance issue, unless a time-series analysis shows that the interval was likely bigger or smaller; in these cases, I’d rebuild the model using the adjusted date/time ranges.

  • If I’m not given an exact date/time range for the problem, I usually select what I consider a slightly bigger time range and then narrow it down. For example, if I’m pretty sure the problem occurred on a specific day, I might use the entire day as the problem interval. I’ll run the code using defaults for the parameters and a Metrics Aggregate View. Then I’ll pick the metric from the Metrics Aggregate View that had the highest number of flagged intervals and the highest flag ratio and rerun the DOPA process using a Metrics Time-Series View to see the trend of when the metric went high. I can then use this information to narrow the date/time interval for the problem and take another pass at building the model. I usually wait to hear back from the client before making any judgments, but my analysis is usually spot-on.

  • Date Range for Establishing Normals: The date/time interval you select for establishing normal ranges will be a subset of the time interval you set for metric data collection (i.e., a subset of Date Range of Metric Data to Examine). It will likely throw off the model if you randomly select a normal range interval, so the best interval to use is a date/time interval when the system was behaving normally. The bottom line is you want the normal range time interval to represent a time when the database was operating without problems so that the metrics will be truly representative of “normal.” It is easier to determine a good interval when the problem has a clear starting time. If I am told it happened yesterday, for example, I would include all the metric data except for yesterday. If I am unsure of the time during which the problem occurred, I may use data from the weekend to establish normal values. Another way to discover a normal range interval is to look at the data from one or more time-series views and select a date/time range that appears normal from the data. For example, if Average Active Sessions were high during the problem interval, you could use an interval of low Average Active Sessions for the normal range interval. In cases where the normal range interval is not easy to determine, I might build the model a few times adjusting the normal range date/time interval until I obtained a set of flagged metrics that I could use for further analysis.

  • Taxonomy: As mentioned in the preceding text, I have two taxonomies currently implemented (infrastructure and Oracle). I use the infrastructure taxonomy as my default because this is the one I have spent the most time building out and refining. The taxonomy is only as useful as it is complete and accurate. If many of the metrics are categorized as “other,” it will be less useful than if all of the metrics are properly categorized. Use of the taxonomies allows you to bundle the metrics according to the taxonomy categories and thus determine the general category in which the performance problem is occurring (e.g., IO vs. CPU). This is particularly useful as a starting point for the analysis process. Once the type of problem is identified, it is possible to drill deeper with confidence that you are not missing important information or other problems. When the taxonomy is sufficiently developed, it is possible to gain an even better understanding of the problem by looking not only at the taxonomic groupings but at the subcategories as well.

  • For example, if I’m seeing a lot of REDO metrics being flagged with high flag_ratios (i.e., further outside of normal range), then I might want to focus only on metrics that are in the Oracle taxonomy related to REDO and rerun the model subsetting on category = REDO as a refinement. Similarly, if the analysis suggests focusing on a possible CPU-related problem, I can eliminate the “noise” created by the other metrics and look only at metrics related to the “CPU” category within the infrastructure taxonomy. Thus, the model will show the metrics that are most relevant to the problem area; this enables me to focus in on an area with a high degree of specificity.

  • Metric sources: I choose to look at all the metric sources on my first pass through a problem. Sometimes I exclude the dba_hist_latch source to get a faster model-building run. Also, with over 8,000 [often ill-documented] metrics in dba_hist_latch, I find latches to be more difficult to interpret. Normally, I will do several iterations without the latch because it runs faster. I usually put latches back into the analysis once I have homed in on the problem for a final build of the model.

  • The beauty of the DOPA process is that it allows you to use the full arsenal of metrics available from Oracle. This is extremely useful for discovering the area of performance problems. But once you have identified the kind of problem, the DOPA process also enables you to limit the metric sources in order to focus in and gain a very detailed picture of the problem area.

  • As your familiarity with the different metrics increases, you may choose to focus on particular sources. For example, dba_hist_sysmetric_summary has a lot of workload metrics (i.e., <metric A> per transaction and <metric A> per second) which can be useful for deep-dive analysis. Similarly, dba_hist_osstat has metrics that derive mostly from the operating system, so in cases where operating system issues appear most pressing, one could focus attention on these metrics.

  • Metric name: My default on first pass through a problem is to look at all metrics because I want my first pass to give me a broad understanding of which metrics are at play and what general area is having problems. As stated previously, I can subset on a single metric name or set of metric names using the wildcard character, “%”. I would typically subset on a single metric name once I have narrowed in on the problem area and want to do a deep-dive analysis and get metric trending patterns using the Metrics Time-Series View.

  • Outlier sensitivity settings: I use my default computation for outliers unless I have a reason to change it in subsequent runs. (My default is that any value more than 1.5 times the IQR above Q3 or below Q1 is identified as an outlier and removed—this is covered in detail in Chapter 4; a small sketch of the arithmetic appears after this list.) Sometimes the default IQR factor of 1.5 causes too many outliers to be removed, so I will bump the IQR factor up in order to limit the number of values that are removed. When I don’t want any outliers removed, I can bump the IQR factor up to 1000. When fewer outliers are removed, the normal ranges will be broader and fewer values will be flagged as abnormal. Conversely, if more outliers are removed, the normal ranges will be narrower and more values will be flagged outside of normal range. I will consider constraining the removal of outliers when the model produced swamps me with flagged metrics; when there are too many flagged metrics, the interpretation of the results is difficult.

  • One situation where I would want more outliers removed would be when I knew the typical absolute value ranges for a metric and the calculated normal ranges seemed unusually high. For example, I would expect to see ranges between 0 and 10 as normal for “Asynchronous Single-Block Read Latency” from dba_hist_sysmetric_summary. If I saw calculated values much higher than this for an upper limit on a particular run, I would want to either eliminate more outliers or find a different time interval that better represented a normal operation period. Note: If you are confident that the date/time interval used for the normal range calculations is representative of normal operation of the system, you won’t need to tweak this parameter.

  • Flagged values: My default here is to choose to show flagged values only. One instance, however, where it is very helpful to see all values is when I want to look at a particular metric’s trending over time. For this purpose, I will use the Metric Time-Series View and have all values displayed, not just the flagged metric values.

  • Flag ratio: Once again, the flag ratio is a measure of how many standard deviations from normal the metric value is. I start out using my default of zero for this variable so that I will see all metrics whose flag ratios are greater than zero, but there may be times I want to change this.

  • When the flag ratio is zero or close to zero, this means that the average value for the metric in the problem interval is just a little bit outside of normal range. The bigger the flag ratio, the further away from normal range the average metric value is. I would consider using a higher flag ratio if the model returns too many flagged metrics to easily analyze. By bumping the flag ratio up, I’ll see fewer flagged metrics, but this is desirable since the metrics with the higher flag ratios are most likely to be the key influencers on the performance issue(s) being examined.
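Here is the small sketch of the outlier-bound arithmetic referenced above. It computes IQR-based bounds for one metric from one source; the metric name, bind-variable names, and use of a single AWR view are illustrative assumptions rather than the DOPA scripts themselves.

-- Outlier bounds for one metric using the IQR factor: values outside
-- [q1 - factor*iqr, q3 + factor*iqr] are excluded from the normal-range calculation.
SELECT metric_name,
       q1 - :iqr_factor * (q3 - q1) AS outlier_lower_bound,
       q3 + :iqr_factor * (q3 - q1) AS outlier_upper_bound
  FROM (SELECT metric_name,
               PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY average) AS q1,
               PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY average) AS q3
          FROM dba_hist_sysmetric_summary
         WHERE metric_name = 'Average Synchronous Single-Block Read Latency'   -- illustrative metric
           AND begin_time BETWEEN :normal_start AND :normal_end
         GROUP BY metric_name);

With :iqr_factor at the default of 1.5, this reproduces the standard rule; raising it toward 1000 effectively disables outlier removal, as described above.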

Once the subset variables are selected and you run the DOPA process, you will have a predictive model to guide you in your next steps. Because each performance case is unique, there are many factors that will influence your model-building choices. Again, you may iterate through the model-building process several times to arrive at a model that seems to be predictive/concise enough to guide further analysis. Following is a simple example of running the DOPA process including the results of the first run and how that predictive model directed the next steps of the analysis process.

Example: Building the Model and Reporting

Here I will show a simple example of how the model is used to determine next steps based on the results of the first model. In Chapter 8 on case studies, there are a good number of real examples that give a more detailed picture of this process in action.

When beginning to investigate a performance issue, you can choose to start with a Category Count View which will help you understand the general areas where problems are occurring, or you can begin with Metrics Aggregate View which will give you the individual metrics that are most problematic. For this example, I chose to run the DOPA process and report using the Metrics Aggregate view. In this view, the table displays the metrics with the most flagged instances and highest flag ratio at the top of the list. Once I have that list, I go back and grab more detailed information for each of the top metrics and report that for the entire time interval. Figure 7-6 shows a portion of the Metrics Aggregate View which identifies dba_hist_osstat ‘segment prealloc bytes’ as the metric with the most flagged intervals and highest flag ratio. It was outside normal range for 10 of the 24 intervals.
Figure 7-6

Metrics Aggregate View example

Now to take a closer look at that metric, I will run the code that produces the time-series view.

For this iteration of the model, I want to subset for just that metric and choose to see all values, not just the flagged values. An example of the output is in Figure 7-7.
Figure 7-7

Metrics Time-Series View is useful for deep-dive analysis of a particular area
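A stripped-down query along the following lines can reproduce the raw time-series for a single metric. It uses one AWR source for brevity and omits the normal-range and flag columns that the actual Metrics Time-Series View script includes; the bind-variable names are my own.

-- Minimal time-series pull for one metric (illustrative; not the DOPA script).
SELECT begin_time,
       metric_name,
       average AS metric_value
  FROM dba_hist_sysmetric_summary
 WHERE metric_name LIKE :metric_name           -- the metric singled out by the Aggregate View
   AND begin_time BETWEEN :examine_start AND :examine_end
 ORDER BY begin_time;

Output like this is what I paste into Excel to produce trend charts such as the one in Figure 7-8.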

In Figure 7-8, I have graphed the same time-series data as shown in the table above so that the trend is more obvious.
Figure 7-8

Metrics time-series graph example

Often with the metrics time-series graphs, the periodicity of the metrics becomes readily apparent. For example, it is easy to see usage patterns of daytime workloads vs. nighttime workloads and/or usage patterns of weekend workloads vs. weekday workloads. Another example of periodicity would be periodic high workloads due to MView refreshes or other batch procedures. This is an important observation because many metric anomalies can’t be detected with constant [absolute value] thresholds. DOPA allows for dynamic thresholds for metric values, since it compares values to computed normal ranges throughout the analysis.

Since I began with a Metrics Aggregate View for this analysis, I like to come back and confirm my analysis by looking at the Category Count View. As I said, it is possible to start the analysis with either of these views, but since I began with the one, I usually use the other to confirm the analysis. The Category Count View shows the taxonomic categories in which flags are occurring most frequently for a particular data set. To obtain a Category Count View, I run the model using the category count version of the code and subset again as I did for the Metrics Aggregate View. Figure 7-9 shows the Category Count View for the same data set used in the Metrics Aggregate View example shown in Figure 7-6.
Figure 7-9

Category Count View, example table

Even though you can see from this table that IO has the most flags, a graphical representation is effective in bringing home how great the difference is between IO and other areas of the infrastructure. I graphed the same category count results in a bar chart and that graph is shown in Figure 7-10. It is very easy to see from this graph that most of the flagged metrics are in the IO category.
Figure 7-10

Category Count View, example graph

In the next example, a Category Count View was again used, but this time it was organized according to the Oracle taxonomy. The results are shown in Figure 7-11. The REDO area is the most flagged category with 64 flagged metrics. [NB: The larger number of metrics flagged in category ALL is an indicator that more work is needed to refine this taxonomy.]
Figure 7-11

Category Count View using Oracle taxonomy

Summary

In this chapter, I have tried to give a plan of attack for how to implement the DOPA process in a general way. The starting point is usually either a Category Count View or a Metrics Aggregate View. The results of that first model will guide you in making choices for the inputs and view for further iterations of the model-building process. The end goal of these iterations is to allow you to focus in on problem areas and gain greater clarity regarding the performance issues. The tuning professional’s skill comes into play in this process in that he/she will need to use good judgment to subset and interpret the data in order to discern a root cause and targeted solutions.

The next chapter provides multiple real examples with many nuanced uses of the DOPA process in action.
