Chapter 8. Dealing with the Ever Constant Tides of Change

It WAS Working!

Number 1 claim when things stop working: Nothing Changed!

Number 1 reason when things stop working: Something Changed.

Change within a Streamed environment often has a far-reaching impact. Changes come in two types: expected and unexpected. Even the simplest of changes can bring the distributed environment to a halt, and unexpected changes can be the most detrimental. In the first section of this chapter, we will look at the impact of expected changes to your existing Streamed environment, and how to deal with them. The focus will be on planning expected changes and knowing what to do to minimize their impact. The second section will address what to look for when things "stop working" due to unexpected changes. It will take you through techniques for identifying and troubleshooting issues with Streams that could result from such changes. Finally, the third section will give you some out-of-the-mainstream tips and tricks, and a brief introduction to additional tools.

In this chapter, we paddle through:

  • Change Planning
  • Troubleshooting unexpected changes and errors
  • Helpful troubleshooting Tools
  • And a couple of tips and tricks

Effecting expected change effectively

The most consistent thing in life is change. Eventually, you will need to modify your Streamed environment or something in it. Big or small, the modification must be planned (back to that old design thing again) and the ramifications identified. Some changes may be external to the database but still impact your Streamed environment (for instance, IT notifies you that the allocated networking bandwidth between your databases will be reduced by 25% to accommodate a new application being added to the environment). Other changes may be required to support new and improved features for a distributed application (adding or modifying database object structures such as tables, view definitions, PL/SQL code, and so on), or an additional site (to accommodate a new region). And then, of course, there are always the Oracle upgrades. As you become more familiar with the inner workings of Streams, you develop the ability to foresee how certain changes will impact the overall Streamed environment as well as each component within it. And, most importantly, how these changes impact the components that do not change.

Changing States: Starting and stopping processes

One rule of thumb for effecting changes to the Streams processes is to configure, start, and stop the processes in a specific order. You want to make sure your receiving processes are up and ready before you start the process that sends to them; a comparison that could be drawn is not throwing the ball to home plate before the catcher gets there. This will help avoid the possible loss of events mentioned in Chapter 4, Single-Source Configuration.

Start the processes in the following order:

  • Apply
  • Propagation
  • Capture

Stop the processes in the follow order:

  • Capture
  • Propagation
  • Apply
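As a minimal sketch, using the process names from the examples later in this chapter (HR_CAPT, HR_PROP_SS3, and HR_MS1_APPLY), the ordering translates to the following calls, each run at the site that owns the process:

  -- Starting: Apply first, then Propagation, then Capture
  EXEC DBMS_APPLY_ADM.START_APPLY(apply_name => 'HR_MS1_APPLY');                  -- at SS3
  EXEC DBMS_PROPAGATION_ADM.START_PROPAGATION(propagation_name => 'HR_PROP_SS3'); -- at MS1
  EXEC DBMS_CAPTURE_ADM.START_CAPTURE(capture_name => 'HR_CAPT');                 -- at MS1

  -- Stopping: the reverse
  EXEC DBMS_CAPTURE_ADM.STOP_CAPTURE(capture_name => 'HR_CAPT');
  EXEC DBMS_PROPAGATION_ADM.STOP_PROPAGATION(propagation_name => 'HR_PROP_SS3');
  EXEC DBMS_APPLY_ADM.STOP_APPLY(apply_name => 'HR_MS1_APPLY');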

Another rule of thumb is to avoid collecting and sending changes for an object that is in the process of being configured for replication. This too can result in a loss of events. If you are adding a rule to an existing process, stop the process cleanly first. Once the rule has been successfully added, start the process and it will pick up where it left off. For Capture processes, this does require that all redo/archive logs from the Capture process's REQUIRED_CHECKPOINT_SCN (or FIRST_SCN, if REQUIRED_CHECKPOINT_SCN is 0) onward be available to LogMiner.
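To see how far back LogMiner must be able to read before you restart a Capture process, a quick look at DBA_CAPTURE (as a privileged user) is enough; this is a sketch of the check, not an exhaustive procedure:

  SELECT capture_name, required_checkpoint_scn, first_scn
    FROM dba_capture;

  -- Compare the returned SCN against DBA_REGISTERED_ARCHIVED_LOG (or V$ARCHIVED_LOG)
  -- to confirm that the redo/archive logs covering it are still available.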

Database changes

The key to effecting database changes is to ensure that everything is kept synchronized between the Streamed sites. The Streams Capture, Propagation, and Apply processes can be configured to handle DDL propagation as well as DML propagation, so DDL changes can be made at a primary master site and propagated via normal Streams processing. However, it may be necessary to coordinate the changes via other means.

Structure changes to existing objects

If DDL changes are necessary, make sure the Streams processes are configured to include DDL changes. If they are not, the DDL change must be manually made at each site at which the object is streamed.

If you intend to make object structure changes only at certain sites in your Streamed environment and not others, you must remember that by not applying a DDL change to all Streamed sites, the object behavior in the system will differ between the sites. Make sure this is well documented, so that when troubleshooting you are expecting the correct behavior from the object at any one site.

For tables, if a DDL change is not effected at all Streamed sites hosting the table, the Apply processes at the sites where the DDL change is not made will break, and potentially so will existing conflict resolution. You must make sure you put a rule-based transformation in place to accommodate a structure change not made at that site, as well as update any conflict resolution that may be affected by the structure change.
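For example, a declarative rule-based transformation can strip a column from the LCRs when the column exists at the source but not at the destination. The sketch below is hypothetical: it assumes a NEW_COL column was added to HR.EMPLOYEES at the source only, and that the Apply rule at the unchanged site is named EMPLOYEES20 (rule names are system generated; look yours up in DBA_STREAMS_RULES):

  BEGIN
    DBMS_STREAMS_ADM.DELETE_COLUMN(
      rule_name   => 'strmadmin.employees20',  -- hypothetical apply rule name
      table_name  => 'hr.employees',
      column_name => 'new_col',   -- the column this site does not have
      value_type  => '*',         -- strip it from both old and new LCR values
      step_number => 0,
      operation   => 'ADD');      -- 'ADD' creates the transformation; 'REMOVE' drops it
  END;
  /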

Data changes — beware the bulk load!

Bulk loads can generate massive amounts of redo that, in a distributed environment, may bring the system to its knees. To avoid this, many choose to bypass Streams and run the bulk load at each site. To do this, make sure FORCE LOGGING is disabled on the databases to be loaded. You can then use the UNRECOVERABLE option in SQL*Loader, or specify NOLOGGING for your load operation. This keeps the load from recording entries in the redo; and if the entries are not in the redo, the Capture processes will not pick up the changes. Note that if you have synchronous capture configured, either remove the synchronous capture, or disable the capture triggers. Once the load is complete, remember to re-enable FORCE LOGGING on the database, and rebuild/re-enable your synchronous capture triggers (a short sketch of the logging toggle follows the list below). These methods are fine as long as the following precautions are taken:

  • No changes are made at any site in the Streamed environment that must receive the load once the loading begins
  • The bulk load is accomplished in the exact same manner at all master sites
  • Any errors that occur during the bulk load at each site are rectified before Streaming is re-enabled
  • All data is identical between master tables once the loading is complete; the number of records, and data values for each record are the same, and so on.
  • In the event that some secondary master sites may have only subsets of the data loaded at a primary master, ensure that the secondary site data subset is complete when compared to the primary master site(s)

Otherwise, the data will not be synchronized and could result in erred or lost apply transactions (a.k.a. the dreaded data divergence).
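A minimal sketch of that logging toggle, run as a DBA at each site to be loaded (the load mechanics themselves are assumed to be already in place):

  -- before the load: confirm the current setting and disable FORCE LOGGING
  SELECT force_logging FROM v$database;
  ALTER DATABASE NO FORCE LOGGING;

  -- run the load with redo suppressed, for example a direct path SQL*Loader
  -- load with UNRECOVERABLE in the control file

  -- after the load: re-enable FORCE LOGGING
  ALTER DATABASE FORCE LOGGING;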

Expanding your Streamed environment

It often becomes necessary to expand an existing Streamed environment to accommodate new sites or database objects. You pretty much use the same approach to expand, or add to, a Streamed environment as you would for its initial creation. You want to make sure that you do not unintentionally overwrite existing objects (or that you do, where that is the intent), and in some instances you may need to manually prepare objects at the capture site for reinstantiation.

As with your initial design, it helps to break down the changes needed into "single-master to single-destination views", using the Streams Site Matrix.

Let's look at two examples of expanding a Streamed environment. First, we will expand a Single-Source environment to include an additional Master site. Second, we will add a table to a Replicated Schema.

Example: Adding a Master Site

We start with our Streams Site Matrix that shows us what our current Streams environment looks like.

[Figure: Streams Site Matrix for the current Single-Source environment]

Our current environment is Single-Source where MS1 is the master source site and SS3 is a secondary destination site. At MS1 we have a Capture process HR_CAPT, and a Propagation process HR_PROP_SS3 that propagates from the HR_CAPT queue to the HR_MS1_APPLY queue at SS3. At SS3, we have an Apply process HR_MS1_APPLY that has Site Priority conflict resolution defined where MS1 changes have priority over SS3 changes (just in case someone makes a change on SS3 that they really weren't supposed to make).

We want to add a master site MS2 that will have a Master-to-Master relationship with MS1, and a Master-to-Secondary relationship with SS3. We are going to choose to use the existing queues on MS1 and SS3 for MS2 stream processes (however, you do have the option to create a separate set of queues if you wish — just remember to add queue creation to your steps).

We are going to choose not to propagate DDL changes from MS2. This means that DDL changes can only be sent from MS1. By only propagating DDL changes from one master site, we are mimicking the old Master Definition Site architecture of Advanced Replication. If you choose to allow multiple masters to propagate DDL changes, you must implement conflict resolution to handle situations where DDL changes between masters conflict.

First, we will look at the "single source to single destination view" for MS1 to MS2.

MS1 already has a Capture process and queue for the HR schema, so no additions are needed there. One thing we do need to consider is conflict resolution, since changes can be made at either MS1 or MS2. In this case, we will use Latest Timestamp (the latest change is applied if there is a conflict). This means that each of our tables in the HR schema must have a column that records the time the change was made (this opens up a whole new can of worms! BUT, because we sat down and worked out our design with the Streams Site Matrix, we are forewarned and thus forearmed). Time zone differences need to be taken into account here. We need to either make sure the database time zone (DBTIMEZONE) for all databases in the environment is the same, or convert the time to equivalent time zone values when the record is created/modified, or during conflict resolution. We will also need to make sure that any application inserts/updates to the HR schema tables update this new column. We can do this by creating a BEFORE trigger that sets the value of the column on insert/update, making the change transparent to any applications with code that manipulates data in the tables. Otherwise, application code must be modified to include the column in the DML operation. To configure Latest Timestamp resolution, we use the MAXIMUM resolution method referencing the time column.
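As a minimal sketch, assume HR.EMPLOYEES is one such table and that the new column is named MOD_TIME (both names are illustrative). The trigger keeps the column transparent to application code, and the MAXIMUM conflict handler implements Latest Timestamp resolution:

  CREATE OR REPLACE TRIGGER hr.trg_employees_mod_time
    BEFORE INSERT OR UPDATE ON hr.employees
    FOR EACH ROW
  BEGIN
    -- record the change time in one agreed-upon time zone (UTC here)
    :NEW.mod_time := SYSTIMESTAMP AT TIME ZONE 'UTC';
  END;
  /

  DECLARE
    cols DBMS_UTILITY.NAME_ARRAY;
  BEGIN
    cols(1) := 'mod_time';
    cols(2) := 'salary';  -- list every column the handler should resolve
    DBMS_APPLY_ADM.SET_UPDATE_CONFLICT_HANDLER(
      object_name       => 'hr.employees',
      method_name       => 'MAXIMUM',     -- latest timestamp wins
      resolution_column => 'mod_time',
      column_list       => cols);
  END;
  /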

To configure replication from MS1 to MS2, we will need to:

  • Create the database on MS2 and configure for replication following the steps listed in Chapter 3
  • Add timestamp columns and triggers to all HR tables at MS1
    • DDL propagation from MS1 to SS3 will automatically push these new columns to SS3
    • By adding the column and trigger to the tables before instantiating MS2, they are automatically picked up via the instantiation
  • Capture the MS2 SCN at this point, to be used on MS1 and SS3 in later steps to avoid change loss should changes be made on MS2 before you complete the configuration
  • Configure an Apply process at MS2 (HR_MS1_APPLY)
  • Add conflict resolution at MS2 for the Apply process using the timestamp column as the resolution column
  • Configure a Propagation process from MS1 to MS2
  • Prepare the HR Schema on MS1 for instantiation
  • Add/instantiate the HR Schema on MS2, including supplemental logging for the conflict resolution columns, privileges for the apply user, and so on
  • Set the MS1 instantiation SCN at MS2 if it was not accomplished via the above instantiation method (see the SCN sketch after this list)
  • Start the Apply process on MS2
  • Start the Propagation process from MS1 to MS2
  • Start the Capture process on MS1 (if stopped)
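A sketch of the SCN bookkeeping in the steps above, with illustrative global names (MS1.WORLD, MS2.WORLD): DBMS_FLASHBACK captures the current SCN at a source, and DBMS_APPLY_ADM records a source's instantiation SCN at a destination:

  -- at the source site: capture the current SCN
  SET SERVEROUTPUT ON
  BEGIN
    DBMS_OUTPUT.PUT_LINE('SCN: ' || DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER);
  END;
  /

  -- at MS2: set the MS1 instantiation SCN if the instantiation method did not
  BEGIN
    DBMS_APPLY_ADM.SET_SCHEMA_INSTANTIATION_SCN(
      source_schema_name   => 'HR',
      source_database_name => 'MS1.WORLD',  -- assumed global name
      instantiation_scn    => &ms1_scn,     -- the SCN captured at MS1
      recursive            => TRUE);        -- also sets the tables' SCNs
  END;                                      -- (needs a DB link to the source)
  /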

At this point our Streams Site Matrix looks like this:

[Figure: Streams Site Matrix after configuring MS1 to MS2]

Next we look at the "single source to single destination view" from MS2 to MS1 (the opposite path).

Because we have created the HR schema on MS2 as part of our set up from MS1 to MS2, we just need to handle changes going from MS2 to MS1.

To configure replication from MS2 to MS1 we will need to:

  • Configure the Capture process on MS2 (it should not be started)
  • Configure an Apply process on MS1 for MS2
  • Add conflict resolution at MS1 for the Apply process using the timestamp column as the resolution column (for DDL LCRs, you can use the DDL_HANDLER parameter of the Apply process to define a "conflict handler" for DDL changes; a skeleton follows this list)
  • Configure a Propagation process from MS2 to MS1
  • Prepare the HR Schema on MS2 for instantiation
  • Instantiate the MS2 SCN on MS1
  • Start the Apply process on MS1
  • Start the Propagation process from MS2 to MS1
  • Do not start the Capture process on MS2 yet (we still need to configure SS3)
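A skeleton of wiring up that DDL handler, under the assumption that you supply a procedure (here, a hypothetical STRMADMIN.HANDLE_DDL) that receives each DDL LCR as an ANYDATA and decides what to do with it:

  CREATE OR REPLACE PROCEDURE strmadmin.handle_ddl(in_any IN ANYDATA)
  IS
    lcr SYS.LCR$_DDL_RECORD;
    rc  PLS_INTEGER;
  BEGIN
    rc := in_any.GETOBJECT(lcr);
    -- your DDL "conflict" logic goes here; the simplest handler just executes
    lcr.EXECUTE;
  END;
  /

  BEGIN
    DBMS_APPLY_ADM.ALTER_APPLY(
      apply_name  => 'HR_MS2_APPLY',   -- the Apply process on MS1 for MS2 changes
      ddl_handler => 'strmadmin.handle_ddl');
  END;
  /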

We have now completed the design for the replication between MS1 and MS2. Our Streams Site Matrix now shows:

[Figure: Streams Site Matrix after configuring MS2 to MS1]

Let us turn our attention now to the "single source to single destination view" for MS2 to SS3.

SS3 only receives changes; it does not send them. This means we only need to create an Apply process for the changes coming from MS2. This is the easy part. Coordinating changes from two master sites is a little trickier.

If changes are made directly on SS3 in the as-is configuration, they are overwritten by changes from MS1 if a conflict occurs; easy enough. However, we now have to receive changes from MS2 as well. The expectation would be that if a change is made directly on SS3, it would be overwritten by an MS2 change if a conflict between the two occurred. However, what happens if the conflict arises as a result of a change on SS3 coming from MS1 that did not get to MS2 before the same row was changed at MS2 by another user and sent to SS3 (did that make your head hurt)? Should the MS2 change win on SS3, or should the older MS1 change win? In our master-to-master relationship between MS1 and MS2, we determined that the most recent change wins. So, not only do we need to implement Site Priority conflict resolution to handle conflicts between direct SS3 changes and each master, we also need to evaluate whether the conflict is the result of a change from the other master.

Take heart! There is actually a way to do this. User-defined conflict resolution allows us to combine Site Priority and Latest Timestamp evaluations to yield multiple levels of resolution. The user-defined conflict resolution function first evaluates the site that created the change on SS3. The site value can be passed with the LCR via a tag or a column. If the originating site was SS3, then the record is overwritten by the master change. If the originating site was one of the masters, the change is evaluated for Latest Timestamp: the older change is discarded and the newer change is applied/kept. Keep in mind that time zone differences need to be taken into consideration at this point. If the timestamps are for different time zones, they will need to be converted to equivalent times.
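A skeleton of such user-defined resolution, set up as a DML handler on the MS2 Apply process at SS3. Everything here is illustrative (the procedure name, the table, and where the originating-site value comes from); the real comparison logic goes where the comments indicate:

  CREATE OR REPLACE PROCEDURE strmadmin.resolve_hr_conflict(in_any IN ANYDATA)
  IS
    lcr SYS.LCR$_ROW_RECORD;
    rc  PLS_INTEGER;
  BEGIN
    rc := in_any.GETOBJECT(lcr);
    -- Level 1, Site Priority: if the current SS3 row originated at SS3 itself
    -- (read from a hypothetical originating-site column, or from an LCR tag),
    -- simply let the incoming master change overwrite it.
    -- Level 2, Latest Timestamp: if the current row came from the other master,
    -- compare the (time zone normalized) timestamp columns and keep the newer.
    lcr.EXECUTE(TRUE);  -- TRUE = execute with conflict resolution enabled
  END;
  /

  BEGIN
    DBMS_APPLY_ADM.SET_DML_HANDLER(
      object_name    => 'hr.employees',   -- illustrative table
      object_type    => 'TABLE',
      operation_name => 'UPDATE',
      error_handler  => FALSE,            -- a DML handler; fires for every UPDATE LCR
      user_procedure => 'strmadmin.resolve_hr_conflict',
      apply_name     => 'HR_MS2_APPLY');
  END;
  /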

Believe it or not, this approach is pretty straightforward. But think of what it would be if you had more than two master sites sending changes and you had to apply multiple levels of conflict evaluation. The more masters, the more complicated and unmanageable conflict resolution becomes. That is why Oracle (and any architect who wishes to maintain his/her, and others', sanity) recommends keeping the number of master sites in a distributed environment to a minimum. It also helps to keep the number of conflict resolution evaluation levels used between master sites to a minimum as well.

So, to complete our addition of MS2 to our environment, we need to do the following on SS3:

  • Configure an Apply process for MS2 changes
  • Add user defined conflict resolution to the MS2 Apply process that evaluates first by Site Priority, and then by Latest Timestamp
  • Redefine the conflict resolution for the MS1 Apply process so that it evaluates first by Site Priority, and then by Latest Timestamp
  • Configure a Propagation process from MS2 to SS3 (this one is configured at MS2)
  • Instantiate the MS2 SCN on SS3
  • Start the MS2 Apply process on SS3
  • Start the Propagation process from MS2 to SS3
  • Start the Capture process on MS2

We now have a completed Streams Site Matrix from which we have gained great insight into the more effective and efficient way to effect the change. You will notice that, even though we did not set out to change the replication from MS1 to SS3 (insert the "but nothing changed between MS1 and SS3, so why did it break?" question here), we found out that we do indeed need to change components of the MS1 to SS3 replication to accommodate the new MS2 site. Had we focused only on adding the new master site, the addition would have ended up causing a cascade of apply errors at SS3 that we did not see with just the MS1 master.

[Figure: completed Streams Site Matrix with MS2 added]

Time-based values used for comparison and time zones

One point came out of this example that bears additional discussion: time zones. It is this author's overwhelming desire to call a special meeting of the UN and have us all agree that life and the universe would be so much easier to deal with if we all just got on the same time. 8 o'clock am is 8 o'clock am regardless of where you are in the world. Think of all the brain power and sleep that is lost over keeping track of 24 different time zones that all change, at different intervals, on different dates depending on the year. Look at how many patches Oracle has had to put out just to keep up with them all. After all, if those wonderful planetary inhabitants South of the equator can celebrate Christmas in the middle of summer, surely those of us in the Western continents can handle going to work at noon (actually, many of us already do). So why not unite! One planet, one time, one people!

However, until this happens, we must accommodate the ever-changing time zones and their impact on time-based conflict resolution. As mentioned above, you will need to make sure the time values used for conflict resolution are equivalent by time zone. Your most efficient practice is going to be to store all of the time-based columns in the database at the same time zone. If needed, convert the time value via a trigger or code prior to inserting or updating the record. This way you only have to convert once, when you store, not every time you retrieve. In a distributed environment, it is highly recommended that you set the database time zone (DBTIMEZONE) on all databases to the same time zone; UTC/GMT is recommended (it has no Daylight Saving Time change). Then, incoming time-based values can be converted from their known time zones to the database time zone. For example, use the FROM_TZ/AT TIME ZONE expression:

FROM_TZ(CAST(<my_date_type_data> AS TIMESTAMP), '<my_time_zone_name>') AT TIME ZONE DBTIMEZONE

Note: You must convert your time-based data type to a simple TIMESTAMP if it is not already one. This is done with CAST(... AS TIMESTAMP).
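For instance, assuming a DATE column HIRE_DATE known to hold US/Eastern times (both names illustrative):

  SELECT FROM_TZ(CAST(hire_date AS TIMESTAMP), 'US/Eastern')
           AT TIME ZONE DBTIMEZONE AS hire_date_db_tz
    FROM hr.employees;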

Note

For more information on working with time-based values and time zone conversions, please reference the Datetime Data Types and Time Zone Support Chapter in the Oracle Database Globalization Support Guide.

Example: Adding a table to a replicated schema

Our next challenge is to add a table to a replicated schema. Since our sites are not going to change, we can refer to our Streams Site Matrix to help us keep track of all the sites where we need to add the table and configure it for replication. We won't be making any changes to the matrix. In our previous example, we extended our streamed environment to include a second master site. We will use this extended environment as our base for adding a table to our HR schema.

The first step is to understand which commands can be issued once and propagated throughout the environment, and which commands must be issued at each site.

We know from our Streams Site Matrix that the HR Schema is hosted at MS1, MS2, and SS3.

We know that the CREATE TABLE command is DDL and, when issued at MS1, will be propagated to MS2 and SS3. However, it would not be propagated if issued at MS2.

We know that the DBMS_STREAMS_ADM procedures needed to configure the table replication are not DDL and therefore are not propagated.

The table must exist on the local database to be referenced in a DBMS_STREAMS_ADM procedure call.

Since this is a new table, we do not need to worry about data synchronization for this table between sites. However, we should take precautions to make sure no one adds data to the table at any site until we have added the capture rule to the Capture process for the table at each site.

From the above analysis of what we know, we can conclude:

  • The DDL to create the new table (and associated indexes and triggers) can be issued at MS1 and propagated to MS2 and SS3
  • Once the propagated DDL has been applied at MS2 and SS3, DBMS_STREAMS_ADM procedures must be issued at each site to configure the appropriate capture and Apply process rules for the table at each site
  • To ensure the table is empty of data when the capture rule is added to the HR_CAPT Capture process, we can issue a TRUNCATE TABLE command on the table just prior to adding the capture rule
  • As soon as the capture rule is added to the HR_CAPT process, we want to capture the SCN at the capture site, so that we do not lose changes made after the capture rule is added but before the table's instantiation SCN is set at the apply sites

To accomplish adding the table to the replicated Schema, we can do the following:

  • Issue the DDL to create the table and any associated indexes and triggers at MS1
  • Allow the DDL to be propagated and applied at MS2 and SS3
  • Stop HR_CAPT processes at MS1 and MS2
  • Stop all Propagation processes
  • Stop the HR_MS1_APPLY process at MS2 and SS3
  • Stop the HR_MS2_APPLY process at MS1 and SS3
  • Add apply table rule to all Apply processes (see the sketch after this list)
  • Add the appropriate apply rule conflict resolution for the table to all Apply processes
  • Add propagation table rule to all Propagation processes
  • Add capture table rule to all Capture processes
  • Prepare the table for instantiation at MS1 and MS2 (this is done automatically if you used the DBMS_STREAMS_ADM package to add the rule to the Capture process)
  • Instantiate MS1 SCN at MS2 for the table
  • Instantiate MS1 SCN at SS3 for the table
  • Instantiate MS2 SCN at MS1 for the table
  • Instantiate MS2 SCN at SS3 for the table
  • Start all Apply processes
  • Start all Propagation processes
  • Start all Capture processes
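A sketch of the key calls for these steps, assuming the new table is HR.NEW_TABLE and that the queue and global names are as shown (all illustrative); repeat at each site with the appropriate process names from the Site Matrix:

  -- at SS3: add the table rule to the HR_MS1_APPLY Apply process
  BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
      table_name      => 'hr.new_table',
      streams_type    => 'apply',
      streams_name    => 'HR_MS1_APPLY',
      queue_name      => 'strmadmin.hr_ms1_apply_q',  -- assumed queue name
      include_dml     => TRUE,
      include_ddl     => TRUE,
      source_database => 'MS1.WORLD');                -- assumed global name
  END;
  /

  -- at MS1: add the table rule to the HR_CAPT Capture process
  -- (adding a capture rule via DBMS_STREAMS_ADM also prepares the table
  -- for instantiation automatically)
  BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
      table_name   => 'hr.new_table',
      streams_type => 'capture',
      streams_name => 'HR_CAPT',
      queue_name   => 'strmadmin.hr_capt_q',          -- assumed queue name
      include_dml  => TRUE,
      include_ddl  => TRUE);
  END;
  /

  -- at MS2 and SS3: record MS1's instantiation SCN for the table
  BEGIN
    DBMS_APPLY_ADM.SET_TABLE_INSTANTIATION_SCN(
      source_object_name   => 'hr.new_table',
      source_database_name => 'MS1.WORLD',
      instantiation_scn    => &ms1_scn);  -- the SCN captured at MS1
  END;
  /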

Shrinking the Streamed environment

Shrinking a Streamed environment is much less complicated than extending it. It is accomplished by simply removing the rules/rulesets that govern the capture, propagation, and Apply processes for the replication level and site.

Removing table, schema, and tablespace level replication from Streams

Use DBMS_STREAMS_ADM.REMOVE_RULE or DBMS_RULE_ADM.DROP_RULE.

As mentioned in Chapter 6, remember to use the same package to remove the rules as was used to add them. Using a standard procedure for creating Streams rules will help with this. For example, always use DBMS_STREAMS_ADM to configure replication capture, apply, and propagation rules, rather than DBMS_RULE_ADM.
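A sketch with a hypothetical system-generated rule name (find yours in DBA_STREAMS_RULES):

  -- identify the rule governing the object at this site
  SELECT rule_name, streams_type, streams_name
    FROM dba_streams_rules
   WHERE object_name = 'NEW_TABLE';

  -- remove it with the same package family that created it
  BEGIN
    DBMS_STREAMS_ADM.REMOVE_RULE(
      rule_name    => 'NEW_TABLE45',   -- hypothetical generated name
      streams_type => 'apply',
      streams_name => 'HR_MS1_APPLY');
  END;
  /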

Removing a site from a Streamed environment

There may be times when a site becomes so badly desynchronized within a Streamed environment that it is more efficient to remove the site entirely and rebuild it. Or, the site may simply no longer be needed in the environment.

Use DBMS_PROPAGATION_ADM.DROP_PROPAGATION to remove Propagation processes that send to the database to be rebuilt/removed. It is more efficient to drop and recreate a Propagation process than it is to attempt to resynchronize it with a capture and/or apply queue and process that has been dropped and recreated.

Use DBMS_STREAMS_ADM.REMOVE_STREAMS_CONFIGURATION locally at the database that is to be removed. It leaves the database and the Streams users intact, but removes the objects created for Streams processing (queues, processes, rules, and so on).
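A sketch, reusing the propagation name from the earlier examples:

  -- at each site that propagates to the departing database:
  BEGIN
    DBMS_PROPAGATION_ADM.DROP_PROPAGATION(
      propagation_name      => 'HR_PROP_SS3',
      drop_unused_rule_sets => TRUE);
  END;
  /

  -- at the departing database itself:
  EXEC DBMS_STREAMS_ADM.REMOVE_STREAMS_CONFIGURATION;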
