I’ve posted two articles on the challenges of systems integration: the first on the advantages and disadvantages of Best-of-Breed and Fully Integrated Solutions, and the second on how to approach integration if you’ve chosen Best-of-Breed.
See Systems Integration and Reducing the Risks.
This post is concerned with how software should be developed if integration is anticipated.
You can think of it in this way:
Systems respond to EVENTS by executing software PROCESSES that update DATA.
In a single ‘fully integrated’ system the ‘complete’ response to an EVENT is contained entirely in one system and is reflected in one set of DATA. ‘Fully integrated’ systems can thus ensure either a complete response to an EVENT or no response at all, and are rarely left in an incomplete state resulting from partial processing. Typically, databases roll back incomplete PROCESSES, and all the consequent DATA changes, if there is a failure at any stage. It’s all or nothing, if you want it to be.
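Here’s a minimal sketch of that behaviour, in Python with SQLite (the tables and values are invented for illustration):

```python
import sqlite3

# A minimal sketch of 'all or nothing' in one database. Table names
# (ledger, stock) are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ledger (account TEXT, amount REAL);
    CREATE TABLE stock  (item TEXT, value REAL);
""")
conn.execute("INSERT INTO stock VALUES ('widget', 500.0)")
conn.commit()

try:
    with conn:  # one transaction: commit on success, roll back on any error
        conn.execute("INSERT INTO ledger VALUES ('transport_cost', 100.0)")
        conn.execute("UPDATE stock SET value = value + 100.0 WHERE item = 'widget'")
        # had anything above raised, BOTH changes would be undone together
except sqlite3.Error:
    pass  # on failure the database reflects no response to the EVENT at all

conn.close()
```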
Integrating best-of-breed systems, working with more than one database, makes it more difficult to achieve this kind of integrity. The failure of a PROCESS in one system can’t easily trigger the roll-back of a PROCESS in the other.
Integrating separate systems therefore requires that you can easily identify the success or failure of each part of a PROCESS, and that you can identify and control each set of DATA changes that each PROCESS effects.
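Now contrast a sketch of the same PROCESS split across two databases (again with invented tables): the first half commits, the second half fails, and nothing unwinds the first commit.

```python
import sqlite3

# Two independent databases standing in for the two systems.
finance = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")
finance.execute("CREATE TABLE ledger (account TEXT, amount REAL)")
warehouse.execute("CREATE TABLE stock (item TEXT, value REAL)")

# Part one of the PROCESS succeeds and commits in the finance system...
with finance:
    finance.execute("INSERT INTO ledger VALUES ('transport_cost', 100.0)")

# ...but part two fails in the warehouse system and rolls back.
try:
    with warehouse:
        warehouse.execute("INSERT INTO stock VALUES ('widget', 100.0)")
        raise RuntimeError("simulated mid-PROCESS failure")
except RuntimeError:
    pass

# Nothing unwinds the finance commit automatically: taken together,
# the two systems now reflect a partial response to the EVENT.
```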
For example, suppose you’re a company that distributes imported goods and you’re obsessive about calculating the margin you obtain on each product (this may be for good statutory tax reasons, or simply because you are unnaturally pedantic). The cost of the goods you sell is determined by the price you pay for them, the duties you pay on importing them and the cost of transporting them to your warehouse. You only know the true cost of each product once you’ve received the invoices that make up all of these costs. But, in the meantime, you may have shipped them.
Let’s consider an EVENT such as the arrival of an invoice for transportation. If the calculation of ‘actual’ costs is automated then your system must carry out the following PROCESSES:
- Invoice Registration: The invoice is booked against supplier, tax and cost accounts.
- Accrual Reversal: If an accrual for this cost has been made earlier it must be reversed.
- Stock Value Correction: The difference between accrued and actual transportation cost must be calculated and allocated against all the stock items it is related to.
- Cost of Goods Sold Correction: If some of the items to which this difference in cost relates have already been sold, then the appropriate portion of the difference must be allocated to a cost of goods sold account rather than to an inventory account (a worked sketch of the last two steps follows this list).
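To make the last two steps concrete, here is a small worked sketch with invented numbers, assuming a simple proportional allocation by quantity (real cost-allocation policies can be more elaborate):

```python
# A worked sketch of steps 3 and 4, with invented numbers and a simple
# proportional allocation by quantity (real allocation policies vary).
accrued_transport = 900.0    # estimate booked when the goods arrived
actual_transport = 1000.0    # amount on the invoice that has just arrived
difference = actual_transport - accrued_transport   # 100.0 to allocate

received_qty = 50   # items the shipment covered
sold_qty = 20       # of those, already shipped to customers
in_stock_qty = received_qty - sold_qty

per_item = difference / received_qty                # 2.00 per item
stock_value_correction = per_item * in_stock_qty    # 60.00 -> inventory account
cogs_correction = per_item * sold_qty               # 40.00 -> cost of goods sold account
```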
Each step involves updating transaction tables, and if all these PROCESSES are occurring in a single integrated system then either they all succeed or they all fail, and the database either reflects a complete response to the EVENT or no response at all.
But if your finance system is separate from your manufacturing system then some of these PROCESSES must be carried out in one system and some in the other (and parts of some in both). It’s also likely that they will take place at different times (if one system updates the other in batch rather than immediately). It doesn’t matter for the moment which PROCESSES run where. The consequence is that, taken together, the systems may reflect only a partial response to an EVENT.
When developing systems that must be capable of handling partial failure in response to an EVENT the following approaches are wise:
- Ideally there should be an explicit database table that identifies, for each uniquely identifiable EVENT, the PROCESSES that have been completed successfully and are therefore fully reflected in the system’s database. Ideally such tables are present in both the sending system and the receiving system. In reality it is the state of the data that implies whether a PROCESS is complete, and it is rare to find an explicit table that records this separately, but it would certainly be useful (see the schema sketch after this list). However the success of each PROCESS is recorded, the changes in data associated with each PROCESS should be easy to identify.
- Transaction data that have been processed by an interface program must be marked as such.
- All modifications to data that imply additional interface processes should be handled with a full audit trail. Ideally a modification to a transaction involves the insertion of both a reversal of the earlier transaction and a new transaction.
- All transactions should have a field that identifies the order of events. Ideally this should be a date/time stamp. This enables execution of interfaces in the correct sequence, as well as the unwinding of PROCESSES if required.
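As a sketch of what these safeguards might look like in practice (the table and column names are my own invention, not a prescription):

```python
import sqlite3

# A sketch of schema elements supporting the four safeguards above.
# All table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 1. Explicit record of which PROCESSES completed for each EVENT.
    CREATE TABLE event_process_status (
        event_id     TEXT NOT NULL,    -- uniquely identifiable EVENT
        process      TEXT NOT NULL,    -- e.g. 'invoice_registration'
        completed_at TEXT,             -- NULL until the PROCESS succeeds
        PRIMARY KEY (event_id, process)
    );

    -- 2-4. Transactions carry an 'interfaced' flag, an audit trail via
    -- reversal rows, and a timestamp fixing the order of events.
    CREATE TABLE transactions (
        txn_id     INTEGER PRIMARY KEY,
        event_id   TEXT NOT NULL,
        amount     REAL NOT NULL,
        reverses   INTEGER REFERENCES transactions(txn_id),  -- set on reversal rows
        interfaced INTEGER NOT NULL DEFAULT 0,  -- marked once an interface has processed it
        created_at TEXT NOT NULL                -- date/time stamp for sequencing
    );
""")
conn.close()
```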
If these safeguards are observed, reconciliation and correction should not be too difficult.
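With status tables like those sketched above, a first-pass reconciliation can be as simple as comparing the two sides:

```python
# A sketch of reconciliation against the hypothetical status tables above.
# 'sender' and 'receiver' are assumed to be open database connections.
def unreconciled_events(sender, receiver):
    query = ("SELECT DISTINCT event_id FROM event_process_status "
             "WHERE completed_at IS NOT NULL")
    sent = {row[0] for row in sender.execute(query)}
    received = {row[0] for row in receiver.execute(query)}
    return sent - received  # EVENTs to correct or re-send
```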
System integration isn’t easy and there is no reason to believe it will get easier. There are still no standards for the description of data to which all system developers and integrators can refer. This is no surprise. Data are not objective objects which can be explored and described without reference to purpose, context and culture, and they never will be. They are not like the objects of material science; they are human artefacts.
Over the past few decades I have witnessed a number of false dawns with regard to ‘easier integration’. These usually turn out to be useful advances in technology, but they rarely make functional integration any easier.
Two weeks ago I read an article on Data-centre Software in The Economist (Sept 19th 2015). That’s what made me think about integration issues. The article mentions Docker, a successful startup that’s making it easier and more efficient to move applications from one cloud to another. Laudable, clever technology, but, as far as I can tell, it represents no progress in the more difficult task of integrating business processes. Those of us who do such work can be assured of work for many decades to come.