Surrogate Characteristics to Trim Records

Using a surrogate characteristic in the staging and transformation layers naturally leads to the idea that the original InfoObject, which is stored redundantly in the DataStore data field list, could be deleted from the transaction data record. The space saving is even greater when the surrogate represents a combination of several InfoObjects.

You could then use the attributes of the surrogate characteristic to store the real master data values, with the original InfoObjects that are deleted from the transaction data record serving as those attributes.
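
To make the idea concrete, here is a minimal Python sketch rather than BW or ABAP code. The field names (`plant`, `material`, `amount`, `surrogate`) are hypothetical, and the surrogate's attributes are modelled as a plain dictionary; in a real system these would be InfoObjects and master data attributes.

```python
from typing import Dict, List, Tuple

# Hypothetical field names; in BW these would be InfoObjects.
KEY_FIELDS = ("plant", "material")


def build_surrogates(records: List[dict]) -> Dict[Tuple[str, ...], int]:
    """Assign one integer surrogate per distinct combination of KEY_FIELDS."""
    surrogates: Dict[Tuple[str, ...], int] = {}
    for rec in records:
        combo = tuple(rec[f] for f in KEY_FIELDS)
        if combo not in surrogates:
            surrogates[combo] = len(surrogates) + 1
    return surrogates


def trim(records: List[dict], surrogates: Dict[Tuple[str, ...], int]) -> List[dict]:
    """Replace the original fields with the surrogate in each record."""
    trimmed = []
    for rec in records:
        combo = tuple(rec[f] for f in KEY_FIELDS)
        slim = {k: v for k, v in rec.items() if k not in KEY_FIELDS}
        slim["surrogate"] = surrogates[combo]
        trimmed.append(slim)
    return trimmed


def surrogate_attributes(surrogates: Dict[Tuple[str, ...], int]) -> Dict[int, dict]:
    """The surrogate's attributes hold the real master data values."""
    return {sid: dict(zip(KEY_FIELDS, combo)) for combo, sid in surrogates.items()}


records = [
    {"plant": "1000", "material": "M-01", "amount": 250.0},
    {"plant": "1000", "material": "M-02", "amount": 100.0},
    {"plant": "1000", "material": "M-01", "amount": 75.0},
]
surrogates = build_surrogates(records)
trimmed = trim(records, surrogates)
attributes = surrogate_attributes(surrogates)
```

Each trimmed record carries one surrogate in place of two fields, and the original values remain recoverable through `attributes`.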

“Surrogate characteristics have a direct financial
impact on a production system’s long-term costs”

When space is not at a premium, it is recommended to keep the original InfoObjects behind all concatenated and surrogate characteristics stored redundantly in the DataStore data field list. A few reasons why:

  • Easier reconciliation of data issues, because the real, readable master data value is available in the record;
  • Faster attribute lookups, because you are not forced to use transitive attribute relationships;
  • Cleaner ABAP code, in all aspects, due to the use of real master data values;
  • DIMs and SIDs have occasionally been corrupted by program bugs. A redundant data field makes this easier to resolve.
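
The lookup point can be illustrated with a small Python sketch. The dictionaries and names (`surrogate_attrs`, `material_attrs`, `material_group`) are hypothetical stand-ins for master data tables. The only point is that a trimmed record needs two hops where a record with the redundant field needs one.

```python
# Hypothetical master data: surrogate -> original value, original value -> attribute.
surrogate_attrs = {1: {"material": "M-01"}, 2: {"material": "M-02"}}
material_attrs = {
    "M-01": {"material_group": "RAW"},
    "M-02": {"material_group": "FIN"},
}


def group_direct(record: dict) -> str:
    """Record keeps the original field: a single lookup."""
    return material_attrs[record["material"]]["material_group"]


def group_transitive(record: dict) -> str:
    """Record keeps only the surrogate: resolve it first, then look up the attribute."""
    material = surrogate_attrs[record["surrogate"]]["material"]
    return material_attrs[material]["material_group"]


redundant_record = {"surrogate": 1, "material": "M-01", "amount": 250.0}
trimmed_record = {"surrogate": 1, "amount": 250.0}
```

Both functions return the same material group; the transitive version simply pays for an extra resolution step on every record.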

Should you commit to trimming a transaction data record and rely more on the surrogate characteristic with no redundant copy of the original InfoObjects in the DataStore data fields, please make sure you do not do this to the very first staging layer in your data model. The first staging DataStore within BW has a special purpose:

  • Reconciles raw data with the source system using the real field values;
  • Allows the entire downstream data model to be rebuilt without going back to the source system;
  • Holds unconverted data before the transformation layer enhances it.

Even when space is at a premium and concatenated and surrogate characteristics are being used in the first staging DataStore, please make sure you keep the redundant original InfoObject in this first staging DataStore. By all means, the original InfoObjects in the DataStore data field list can be dropped from the second staging DataStore in the downstream data flow. The above benefits cannot be overstated.

Look at it this way. Have you ever experienced the frustration of not being able to re-run a source system extractor because it is month-end and the source system administrator will not let you execute during business hours because of the impact on business-as-usual processing? This is despite a board meeting starting first thing tomorrow morning and an extractor that still needs 6 hours to re-initialise. Guess who is not going home on time tonight? The BW Administrator. Guess who is losing sleep and will have to come in early to do a reconciliation? The BW Administrator and the Business Analyst.

Now take a step back and ask a few obvious questions:

  • Is all of that overnight effort necessary?
  • Are we 100% sure that the re-initialisation is truly required?
  • Can we prove the reporting issue is a BW data model or source system issue?
  • Can we prove the problem is missing data in the source system?

All of these questions and more lead to a common BW Administrator question: in which system did the data problem first appear?

The answer to this question is always found in the first staging DataStore within BW. With the redundant copy of the original InfoObjects stored in the DataStore data fields, a Business Analyst can use a ‘Simple’ reconciliation query to prove to the BW Administrator that the identified reporting issue does or does not involve the extraction of data across to BW.
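
As a rough illustration of what such a reconciliation query does, the Python sketch below compares totals from the source system with totals from the first staging DataStore. The field names (`company_code`, `fiscal_period`, `amount`), the sample rows, and the tolerance are assumptions made for the example; a real check would be a BW query against the actual staging DataStore.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]  # hypothetical: (company_code, fiscal_period)


def totals(rows: Iterable[dict]) -> Dict[Key, float]:
    """Sum the amount per (company_code, fiscal_period)."""
    result: Dict[Key, float] = defaultdict(float)
    for row in rows:
        result[(row["company_code"], row["fiscal_period"])] += row["amount"]
    return dict(result)


def reconcile(source_rows, staging_rows, tolerance: float = 0.005):
    """Return the keys whose totals differ between source and first staging layer."""
    src, stg = totals(source_rows), totals(staging_rows)
    diffs = {}
    for key in sorted(set(src) | set(stg)):
        s, t = src.get(key, 0.0), stg.get(key, 0.0)
        if abs(s - t) > tolerance:
            diffs[key] = (s, t)
    return diffs


# Illustrative data only: one source posting never reached staging.
source = [
    {"company_code": "1000", "fiscal_period": "2024012", "amount": 500.0},
    {"company_code": "1000", "fiscal_period": "2024012", "amount": 120.0},
    {"company_code": "2000", "fiscal_period": "2024012", "amount": 80.0},
]
staging = [
    {"company_code": "1000", "fiscal_period": "2024012", "amount": 500.0},
    {"company_code": "2000", "fiscal_period": "2024012", "amount": 80.0},
]
differences = reconcile(source, staging)
```

An empty result suggests the extraction is not the cause, so the issue sits either in the downstream BW data model or in the source data itself. A non-empty result points at the extraction and names the keys to investigate.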

Approaching the reporting issue this way is advisable because it allows the Business Analyst to get involved. Since they should understand the data better than the BW Administrator, it makes sense that the onus should be on them to prove that the reporting issue is not a ‘missing source system data’ issue. For example: Who posted that manual journal at 4pm, right before closing? Who maintained the time-dependent master data attribute and backdated it 3 periods? These are all good examples of cases where a BW Administrator is asked to prove why the BW/BO report is wrong when the problem has nothing to do with the Data Warehouse.

The accurate placement of ownership and accountability can be directly traced back to the availability of the raw field values in the first staging DataStore. When enhanced with simple reconciliation queries and a dedicated Business Analyst, the BW Administrator can be left alone and only contacted once the business can prove it is not a source data issue.

  • Does your data model utilise a clean staging DataStore layer for source system reconciliation? With reconciliation queries?
  • Where could you be utilising surrogate characteristics?
  • Is database table space at a premium?