You have gone to the effort of implementing the right tool for the job: a DataStore, not a staging cube. Generating true delta records in the data model has improved the nightly load window. Thanks to the fundamental feature of a DataStore, the change log now contains a simplified ‘action list’ of real changes, eliminating all redundant records from the data flow.
You can throw as many full loads at the DataStore as you want and it will still only output a simple list of true changes. This is the primary advantage we want to leverage from using a DataStore in the staging layer and transformation layer of the data model.
It does not matter how often we redundantly re-process the data in the transformation layer. The majority of data models everywhere can be improved by utilising a DataStore for this primary advantage.
“a rogue record is created when an extractor fails to identify deleted records”
There is a worst-case scenario where a DataStore has two transformations that continually flip the value of an InfoObject back and forth. For Example: One transformation says it has a value while the other says it is an initial value (blank). Say ‘Hi’ to the world of time-dependent attributes (mostly made of sparse DataSets) and master data ETL that combines many DataSources into a single DataStore.
The objective is a ‘single table of truth’ for all attributes of a characteristic. The 0EMPLOYEE, 0PERSON and 0ORGUNIT characteristics each have multiple DataSources that need to be merged into a single, comprehensive master data characteristic with attributes.
In this worst-case scenario, where value flipping is occurring, the change log delivers no efficiency benefit to the data flow at all. So please take the time to ensure the transformation field mappings update each target InfoObject from only one DataProvider.
The primary advantage really shows itself when the DataSource is unable to deliver true delta records. For Example: A budget, plan or forecast extractor. The DataSource can happily re-load the full DataSet into the DataStore DataTarget, over and over again, leveraging the DataStore’s ability to detect changed records and ignore records that are still the same. There is only one change this ETL strategy cannot detect: deletions, records that have been completely removed from the source system. The record was there yesterday and is now in BW, but it no longer exists in the source system. These are ‘Rogue Records’.
There are essentially two main ways to handle a rogue record:
- Delete the record;
- Flag the record.
Deleting a rogue record is simple enough. Once detected, change the 0RECORDMODE value to ‘D’ and then activate that record into a downstream DataStore. The delta handling mechanism of all downstream DataTargets will clean up that record across the entire data model.
This is clean and efficient, and it removes all impact of those records’ key figures from the reporting queries. The only time this could be an issue requiring further attention is when the DataSet is exported to an external system via flat files, Open Hub destinations, BusinessObjects tools, databases, custom ABAP code or a DataMart interface into another BW system.
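As a minimal sketch of the detect-and-delete approach, assume a classic DataStore ZSALES_DS with active table /BIC/AZSALES_DS00, a single key field DOC_NUMBER, one key figure AMOUNT and 0RECORDMODE mapped to the target field RECORDMODE. These object names, the generated target structure type _ty_s_TG_1 and the single-package assumption are illustrative only; the generated end routine in your system will look different. Inside the end routine of the transformation receiving the full load, the rogue records can be picked up from the active table and re-injected with record mode ‘D’:

* Sketch only: assumes the full load arrives in one data package and
* that ZSALES_DS, DOC_NUMBER and AMOUNT are the real object names.
DATA: lt_active TYPE STANDARD TABLE OF _ty_s_TG_1,
      ls_rogue  TYPE _ty_s_TG_1.
FIELD-SYMBOLS: <ls_active> TYPE _ty_s_TG_1.

* Read the records currently active in the DataStore.
SELECT doc_number amount
  FROM /bic/azsales_ds00
  INTO CORRESPONDING FIELDS OF TABLE lt_active.

* Any active record that is missing from the incoming full load has
* been deleted in the source system: a rogue record.
LOOP AT lt_active ASSIGNING <ls_active>.
  READ TABLE RESULT_PACKAGE
    WITH KEY doc_number = <ls_active>-doc_number
    TRANSPORTING NO FIELDS.
  IF sy-subrc <> 0.
    CLEAR ls_rogue.
    MOVE-CORRESPONDING <ls_active> TO ls_rogue.
    ls_rogue-recordmode = 'D'.   "activation and downstream delta handling remove it
    APPEND ls_rogue TO RESULT_PACKAGE.
  ENDIF.
ENDLOOP.

In a real implementation you would also handle multiple data packages and restrict the SELECT to the fields you need, but the core idea stands: the rogue record is re-created in the data flow with record mode ‘D’ so the delta mechanism physically removes it everywhere downstream.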
Flagging a rogue record can be a little more complicated as there are several ways to handle the record:
- Flag the record as deleted;
- Flag the record as deleted and set all key figure values to zero;
- Flag the record as reversed;
- Flag the record as reversed and negate (* -1) the key figure values;
- Flag the record as cleared.
Notice the two distinct actions identified.
First: The flag is aligned with a business or technical status. This will impact the characteristics and therefore the query navigation and filtering.
Second: The key figure values are left untouched, zeroed or reversed. The appropriate technique to implement in the Transformation should be driven by the reporting query impact required; a sketch of the flag-and-zero variant follows the list below:
- Statutory reporting: track changes but never re-write history;
- Management reporting: just show me the final impact on the results;
- Operational reporting: show the details of the current results;
- Audit and reconciliation reporting: show me every change, in sequence;
- Stock reporting: show the difference between ‘Never Processed’ and ‘Had movement but is truly zero’.
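As a hedged illustration of the ‘flag the record as deleted and set all key figure values to zero’ option, the fragment below assumes the rogue records identified in the earlier sketch are collected into an internal table lt_rogue instead of being appended with record mode ‘D’, and that a hypothetical deletion flag InfoObject ZDELFLAG has been added to the DataStore. The field names are assumptions for this sketch only; the reversal variant is shown as a commented alternative.

* Sketch only: lt_rogue is assumed to hold the active-table rows that
* were missing from the current full load (collected instead of the
* recordmode 'D' append); ZDELFLAG and AMOUNT are assumed names.
DATA: lt_rogue TYPE STANDARD TABLE OF _ty_s_TG_1.
FIELD-SYMBOLS: <ls_rogue> TYPE _ty_s_TG_1.

LOOP AT lt_rogue ASSIGNING <ls_rogue>.
* First action: align the flag with a business or technical status.
  <ls_rogue>-/bic/zdelflag = 'X'.

* Second action: neutralise the key figure so current query totals stay correct.
  <ls_rogue>-amount = 0.
* Reversal variant: negate the value instead of clearing it.
* <ls_rogue>-amount = 0 - <ls_rogue>-amount.

  APPEND <ls_rogue> TO RESULT_PACKAGE.
ENDLOOP.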
While there are a few options to choose from, it is better to implement one that guarantees the current query results report the correct totals for the key figures. This is in line with the best practice of keeping the DataSet safe for all possible navigation slices.
Flagging a record as deleted without doing something to the key figures places the burden of ownership on the query designer, who must then remember to filter on the flag in every query. This is not recommended.
When the key figures are also set to zero, the DataSet will at least ensure all current queries report the accurate, current total value. If your query really needs access to the original value prior to the record being deleted, it is advisable to add a new, specific key figure to the data model that holds the deleted, reversed or cleared values in parallel to the original key figure.
This technique is also known as key figure modeling, as opposed to account-based modeling. For Example: 0AMOUNT gets a new sibling key figure with the same definition, ‘Amount Reversed’ (ZAMOUNT_R). This can be useful in finance audit reporting when reviewing the operational postings of ‘Nominal Accounts’.
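A minimal sketch of that parallel key figure, re-using the illustrative lt_rogue loop from the sketch above and assuming a hypothetical sibling key figure ZAMOUNT_R (field /BIC/ZAMOUNT_R) has been added to the DataStore:

* Inside the same lt_rogue loop: preserve the original value in the
* reversed key figure before clearing the original. /BIC/ZAMOUNT_R is
* an assumed sibling of the AMOUNT field.
  <ls_rogue>-/bic/zamount_r = <ls_rogue>-amount.   "keep the value for audit queries
  <ls_rogue>-amount         = 0.                   "current totals stay accurate

An audit query can then report ‘Amount Reversed’ next to the original amount, while every existing query keeps totalling the correct current value.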
A clear understanding of the above choices is required before implementing the technique of detecting rogue records. Find a full load in your BW system and identify the data modeling requirements:
- Is the DataStore’s 16 key field limit reached? If so, concatenated or surrogate key characteristics are required;
- Will rogue records be deleted or flagged?
- Flagged rogue records will use which characteristic and value?
- Flagged rogue records will use which key figure technique, and what mathematical adjustment will be applied to the value?