Propagate change details to the Data Warehouse, and using CDC_get_changed_columns()

This Use Case is applicable when a Data Warehouse (DW) table mirrors, to a significant degree, the source table S, and contains some subset of the columns in S. In this case, the change values to send to DW are driven by detailed examination of each Change table record in a transaction. This use case also shows how the SQL CDC function, CDC_get_changed_columns() can be used to understand and process change Data.

The change data values are accessed in 2 steps
  1. To get the source table number
  2. To get the changes from the Change table.

Step 1 is the same as the example in Get change data values for a source table.

For example, if the Change table contains 12 fields being tracked from the source table, with names, in addition to some key Identifying fields:
  • fld1, fld2, …, fldn
  • ID_f1, ID_f2, ID_f3 – identifying fields.

Step 2 gets Change data value from the Change table, in the change sequence order. Changes must be send to the Data Warehouse in the change sequence order so that the Data Warehouse data reflects the corresponding rows in the source table with accuracy.

The following query uses the SQL CDC function, CDC_get_changed_columns(), to get a list of the columns changed on an UPDATE operation. INSERT and DELETE operations always have Change data for every change-tracked field, as well as for Identifying Fields.

select fld1, fld2, fld3, fld4, fld5, fld6, fld7, fld9, fld10, fld11, fld12,
       ID_f1, ID_f2, ID_f3,
       "_Operation",
       CDC_get_changed_columns (pub.CDC_Customer, _Change-FieldMap) as gcc
from   pub."_Cdc-Change-Tracking" ct
inner join
       pub.CDC_Customer c               -- assumed Change table name.
          on ct."_Change-Sequence" = c."_Change-Sequence"
where  ct."_ Source-Table-Number"    = ?          --from 1st query.
and    c."_Operation"    != 3   -- optional - exclude Before Update images.

order by ct."_ Source-Table-Number", ct."_Change-Sequence";
-- Order By keys map to index _Sequence-Id
Note:
  • Sometimes the data sent to DW contains a small amount of inconsistency due to incomplete transactions. No matter how Change data is selected, time-bounded or not, it is always possible that the selected changes contain incomplete transactions.
  • Transaction completion is not written to the Change data.
  • As all Change table data is committed, data values are always valid, even if inconsistent with respect to transactions.
  • This query retrieves all the tracked fields in the Change table. The fields that do not contain changes, and do not contain Identifying fields, will have the Null value in the Result Set.
  • The application learns the names of the columns with changes from an UPDATE from the value of the function CDC_get_changed_columns().
  • Insert and Delete operations write all change-tracked column values to the Change table.
  • Some ETL applications might want to get only change for certain operations, such as INSERT or DELETE. This can be easily accomplished by adding a Where clause predicate which chooses the desired values of “_Operation” in the Change table.

This query could be embedded within application logic as shown in the following example:

do
      query #2            -- this is query above, in Step #2.
      exitloop if no data
      Let RS = result set from query #1
      Let stmt = 
      switch _Operation 
          case Create – Build Insert stmt for fields 1..12, 
                            plus Identifying fields, 
                            with values from parameters
                        Set parameter values for Insert from RS data
          case Delete – Build Delete stmt with 
                            Identifying fields = parameters
                        Set parameter values from RS ID_f* fields
          case Update – Build partial Update for DW table
                 for each field f in gcc:          -- changed columns
                     Build set f = parameter, remember parameter
                 Build Where clause for ID_F* fields = parameters
                 Set parameter values for “set” from RS data
                 Set parameter values from RS ID_f* fields
      end switch


      exec sql Prepare of generate stmt
      -- set all parameters for generated stmt
      switch _Operation 
          case Create – Set parameter values for Insert from RS data
          case Delete – Set parameter values from RS ID_f* fields
          case Update – Set parameter values for “set” from RS data
                        Set parameter values from RS ID_f* fields
      end switch


      -- Now execute prepared stmt to send changes to DW
      execute generated stmt

enddo

The UPDATE statement that is generated for a particular result set row might look like the following example:

UPDATE  pub.my_target_tbl
SET     fld1 = ?,  
        fld2 = ?
WHERE ID_f1 = ? AND ID_f2 = ? and ID_f3 = ?