Subsetting — the micro side of replication

In the previous chapters, we looked at replicating whole table via table level and schema level replication. There may be times when you only what to replicate a subset of rows in a table to a site. Say for instance, your corporate headquarter's database has HR table data for all departments, but you only want to send HR data specific to location_id to that location's database. You can do this with subsetting rules. Subsetting is considered a table level replication, but includes a filter parameter that is applied to the process to include only those LCRs that meet that filter criteria. Along with the ADD GLOBAL, SCHEMA, and TABLE rules of the DBMS_STREAMS_ADM package, the ADD_SUBSET_RULE can be defined for the Capture, Propagation, and/or Apply process.

Subsetting will generate rather complex rule_conditions depending on the dml operation it is associated with. Because of this, it is recommended that Subset rules only be assigned to positive rulesets. Assigning Subset rules to negative rule sets could yield unexpected results.

An example use of the ADD_SUBSET_RULE is described below.

The HR.EMPLOYEES table has a department column DEPARTMENT_ID that references the HR.DEPARTMENTS table. The HR.DEPARTMENTS table in turn has a column that references the HR.LOCATIONS table.

Depending on how you have configured your Streams environment and your business rules, you may find that it is more advantageous to put the subset rule on the Propagation process, rather than the Apply process. Or you may find it more advantageous to put the subset rule on the Capture process. It all depends on what is most efficient for your environment. For our example, we are going to assume that we have created a capture queue (1500_CAPT_Q) that captures changes only for a particular location let's pick San Francisco, whose location ID is 1500. For the HR.EMPLOYEE table, we only want to capture HR records that are associated with LOCATION_ID 1500 for our capture queue 1500_CAPT_Q.

Subsetting — the micro side of replication

The main thing to keep in mind here is that the dml_condition works best with simpler evaluations formatted as they would be for the where clause of a SQL statement, but just not actually including the where key word. You also want the evaluation field(s) to be in the table you are sub-setting. Also make sure that supplemental logging has been enabled for the fields referenced in the dml_condition. We will see why when we look at the actual rule condition created. Now, if the table we were sub-setting were the HR.DEPARTMENTS table, we could simply supply the dml_condition as such:

dml_condition => 'location_id = 1500'

But not so fast (in a couple of ways)! The table we want to sub-set is the HR.EMPLOYEES table. It does not have a location_id. It does have a department_id from which we can derive the location_id, but it is a table "down the line". So, for a select statement to yield a result to determine if an employee record belongs to location_id 1500, we would need something like this in the where clause:

department_id in (select department_id
from hr.departments
where location_id = 1500)

Unfortunately the dml_condition does not handle complex filters such as this very well at all. Thus, we have to simplify the call for the dml_condition and allow it to return an "easy" value to evaluate. Sounds like a job for Super-Function! We can create a function that returns Y if the employee_id is related to the location id we are looking for, and N if it is not. We can then pass in the simplified function call with desired return value to the dml_condition to include in the rule_condition just as we might in the where clause of a SQL statement. Please note, even though the dml_condition is in a simplified format, you may still experience a noticeable slow-down of change capture (the other part of the "not so fast"—it really can be "not so fast"). It is highly recommended that you test the performance of a complex capture such as this and determine if performance impacts are acceptable. If not, look at using Tags (discussed later) and and_conditions as alternatives.

The complete code to set up a Single-Source Subsetting on the HR.EMPLOYEE table can be found in the SubSetting.sql code file. In the following discussion, we are only going to delve into the creation of the Subset Capture process creation to get a good understanding of how the dml_condition can be used for complex conditions. So, synch up your life-vest; here we go.

First we want to make sure the capture user STRM_ADMIN has the necessary privileges to select the data in question via PL/SQL. This means explicitly granting all privileges on HR.EMPLOYEES, HR.DEPARTMETNS, AND HR.LOCATIONS to STRM_ADMIN. If STRM_ADMIN were only to be the Capture user, then the SELECT privileges on the above tables would suffice.

grant all on hr.employees to strm_admin;
grant all on hr.departments to strm_admin;
grant all on hr.locations to strm_admin;

Next, we create a function under the STRM_ADMIN schema that performs the complex evaluation for us and returns either a Y if the employee_id belongs to location_id 1500, and N if it does not. The function will also return N in the case of error. To make the function reusable, we will also pass in the location_id value we want to evaluate against so that it is not limited to just 1500. If you choose to create the function under a different schema, be sure to explicitly grant EXECUTE on the function to the STRM_ADMIN user. Using a PL/SQL function allows us to code the complex SQL needed to return the location_id associated with the employee_id in question. The employee_id is a column in the HR.EMPLOYEE table, so we use it in the dml_condition to pass to the function. We could use the department_id instead, but let's stick with the employee_id for this example:

create or replace function chk_emp_loc (hrid in number,
schkloc in number)
return varchar2 is get_loc number;
begin
select location_id into get_loc from hr.departments
where department_id = (select department_id from hr.employees
where employee_id = hrid);
if get_loc = chkloc then
return 'Y';
else
return 'N';
end if;
exception
when others then
return 'N';
end;
/

Now, all we have to do is use the evaluation of the function return value in the dml_condition. We create our Subset Rule as follows:

BEGIN
DBMS_STREAMS_ADM.ADD_SUBSET_RULES(
table_name => 'hr.employees',
dml_condition => 'strm_admin.chk_emp_loc(employee_id, 1500) = ''Y''',
streams_type => 'capture',
streams_name => 'capture_1500',
queue_name => 'STRM_ADMIN.HR_CAPTURE_Q',
include_tagged_lcr => false,
source_database => 'STRM1'),
END;
/

You may have noticed that the ADD_SUBSET_RULES procedure has three out parameters at the end of the procedure. These allow the procedure to report back the DML rule names created should you wish to view them after the rules are created. The procedure itself is an overloaded procedure (as shown below). If you do not need this information, you can simply leave them out of the parameter list:

DBMS_STREAMS_ADM.ADD_SUBSET_RULES(
table_name IN VARCHAR2,
dml_condition IN VARCHAR2,
streams_type IN VARCHAR2 DEFAULT 'apply',
streams_name IN VARCHAR2 DEFAULT NULL,
queue_name IN VARCHAR2 DEFAULT 'streams_queue',
include_tagged_lcr IN BOOLEAN DEFAULT FALSE,
source_database IN VARCHAR2 DEFAULT NULL,
insert_rule_name OUT VARCHAR2,
update_rule_name OUT VARCHAR2,
delete_rule_name OUT VARCHAR2);
DBMS_STREAMS_ADM.ADD_SUBSET_RULES(
table_name IN VARCHAR2,
dml_condition IN VARCHAR2,
streams_type IN VARCHAR2 DEFAULT 'apply',
streams_name IN VARCHAR2 DEFAULT NULL,
queue_name IN VARCHAR2 DEFAULT 'streams_queue',
include_tagged_lcr IN BOOLEAN DEFAULT FALSE,
source_database IN VARCHAR2 DEFAULT NULL);

Once you have created the Sub-Setting Rule for the table, you can view the sub-setting condition as well as the rule_condition generated in the DBA_STREAMS_TABLE_RULES view:

set long 1000
column rule_condition format a75;
select streams_name, table_owner || '.' || table_name tablename,
subsetting_operation action, rule_name, dml_condition,
rule_condition
from dba_streams_table_rules
where streams_name = 'CAPTURE_1500'
order by rule_name;
STREAMS_NAME TABLENAME ACTION RULE_NAME
--------------- --------------- ------ ---------------
DML_CONDITION||CHR(13)||CHR(10)
------------------------------------------------------------
RULE_CONDITION
------------------------------------------------------------
CAPTURE_1500 HR.EMPLOYEES INSERT EMPLOYEES127
strm_admin.chk_emp_loc(employee_id, 1500) = 'Y'
:dml.get_object_owner()='HR' AND :dml.get_object_name()=
'EMPLOYEES' AND :dml.is_null_tag()='Y' AND :dml.get_source_datab
ase_name()='STRM1' AND :dml.get_command_type() IN ('UPDATE',
'INSERT') AND (:dml.get_value('NEW','"EMPLOYEE_ID"') IS NOT
NULL) AND ("STRM_ADMIN"."CHK_EMP_LOC"(:dml.get_value('NEW','
"EMPLOYEE_ID"').AccessNumber(),1500)='Y') AND (:dml.get_comm
and_type()='INSERT' OR ((:dml.get_value('OLD','"EMPLOYEE_ID"
') IS NOT NULL) AND NOT EXISTS (SELECT 1 FROM SYS.DUAL WHERE
("STRM_ADMIN"."CHK_EMP_LOC"(:dml.get_value('OLD','"EMPLOYEE
_ID"').AccessNumber(),1500)='Y'))))
CAPTURE_1500 HR.EMPLOYEES UPDATE EMPLOYEES128
strm_admin.chk_emp_loc(employee_id, 1500) = 'Y'
:dml.get_object_owner()='HR' AND :dml.get_object_name()=
'EMPLOYEES' AND :dml.is_null_tag()='Y' AND :dml.get_source_datab
ase_name()='STRM1' AND :dml.get_command_type()='UPDATE' AND
(:dml.get_value('NEW','"EMPLOYEE_ID"') IS NOT NULL) AND (:dm
l.get_value('OLD','"EMPLOYEE_ID"') IS NOT NULL) AND ("STRM_A
DMIN"."CHK_EMP_LOC"(:dml.get_value('OLD','"EMPLOYEE_ID"').Ac
cessNumber(),1500)='Y') AND ("STRM_ADMIN"."CHK_EMP_LOC"(:dml
.get_value('NEW','"EMPLOYEE_ID"').AccessNumber(),1500)='Y')
CAPTURE_1500 HR.EMPLOYEES DELETE EMPLOYEES129
strm_admin.chk_emp_loc(employee_id, 1500) = 'Y'
:dml.get_object_owner()='HR' AND :dml.get_object_name()=
'EMPLOYEES' AND :dml.is_null_tag()='Y' AND :dml.get_source_datab
ase_name()='STRM1' AND :dml.get_command_type() IN ('UPDATE',
'DELETE') AND (:dml.get_value('OLD','"EMPLOYEE_ID"') IS NOT
NULL) AND ("STRM_ADMIN"."CHK_EMP_LOC"(:dml.get_value('OLD','
"EMPLOYEE_ID"').AccessNumber(),1500)='Y') AND (:dml.get_comm
and_type()='DELETE' OR ((:dml.get_value('NEW','"EMPLOYEE_ID"
') IS NOT NULL) AND NOT EXISTS (SELECT 1 FROM SYS.DUAL WHERE
("STRM_ADMIN"."CHK_EMP_LOC"(:dml.get_value('NEW','"EMPLOYEE
_ID"').AccessNumber(),1500)='Y'))))

Let's examine the UPDATE action rule_condition (we will get into rules in more depth later in the chapter): each condition must evaluate to TRUE for the rule to capture the change (Positive Rule Set). Each condition acts very much like the conditions in a SQL statement WHERE clause, and could be considered synonymous with the WHERE clause.

The first conditions are pretty straight forward. The change has to be an UPDATE for the HR.EMPLOYEE table, with a null tag, and source database being STRM1.

:dml.get_object_owner()='HR' AND
:dml.get_object_name()='EMPLOYEES' AND
:dml.is_null_tag()='Y' AND
:dml.get_source_database_name()='STRM1' AND :dml.get_command_type()='UPDATE'

The next set of evaluations, as stated in the following, check to make sure the primary key column has a value:

AND
(:dml.get_value('NEW','"EMPLOYEE_ID"') IS NOT NULL) AND
(:dml.get_value('OLD','"EMPLOYEE_ID"') IS NOT NULL)

And finally, our dml_condition evaluation. Notice the column employee_id has been extended to use the LCR$_ROW_RECORD Type nomenclature.

AND
("STRM_ADMIN"."CHK_EMP_LOC"(:dml.get_value('OLD','"EMPLOYEE_ID"').AccessNumber(),1500)='Y') AND
("STRM_ADMIN"."CHK_EMP_LOC"(:dml.get_value('NEW','"EMPLOYEE_ID"').AccessNumber(),1500)='Y')

Note

For more information on LRC Types and Member functions, please refer to the Logical Change Record Types chapter in the Oracle Database PL/SQL Packages and Types Reference Manual.

You will see more verbose checking on the primary key columns with the INSERT and DELETE statements to address additional existing record checks. It is possible for an UPDATE statement to be converted into an INSERT or DELETE statement depending on the situation.

Note

For more information on Subset rule evaluation for different DML operations, please refer to the DBMS_STREAMS_ADM.ADD_SUBSET_RULES usage notes in the Oracle Database PL/SQL Packages and Types Reference Manual.

The resulting Capture should capture only those changes for location_id 1500 that can then be propagated to the Location 1500 database and applied.

Subsetting — the micro side of replication

Because Sub-Setting is at the table level, you will want to do this for every table in the HR schema for which you wish to subset data by the location_id. The previous example is just one such example.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset