Guest blog post by Daniel Mayo, practice
leader, financial services technology, Ovum
Whilst the specifics of what went wrong with RBS last week are still
unknown, we have a good indication of the nature of the fault – a relatively
minor update to the batch scheduling software failed. On its own, this would
have been easy to fix, but the problem was unfortunately exacerbated by an
employee deleting the scheduling queue while attempting to fix it.
Rebuilding the scheduling queue is a much lengthier and more complex
process, one complicated even further by UK banks' relatively heavy reliance on
legacy systems. RBS needs a team that has a detailed understanding of the
scheduling order, the core system’s processing quirks, and knowledge of older
IBM assembly languages.
This scenario illustrates two of the IT infrastructure
pressures that banks face today:
First, the shortage of skilled staff experienced in older systems
is a growing operational risk that is difficult for banks to address. Senior staff with the knowledge necessary
to perform complicated operations inevitably retire, and new IT professionals
(unsurprisingly) concentrate on newer technologies. With most banks under heavy
cost pressures, relatively junior staff are often given responsibility for
systems of which they have little experience beyond the routine, and least of all in a
stressful situation (as with RBS) where things go outside normal operations. The problem can become particularly acute
where maintenance of systems is outsourced or
offshored, as documentation on these systems and the processes they
support is hard to come by, if it exists at all.
Second, the growth of mobile banking will increase pressure to shorten the
batch window and to handle higher transaction volumes, further reducing room for error.
Batch processing largely
worked well in the old world of restricted-hour, branch-based banking, where
branches closed at 15:30 and at weekends. This gave IT a large “batch window”
in which to complete processing, with time to roll back and re-run if necessary. However,
in an age of online banking, and with growing uptake of mobile banking, IT is
increasingly under pressure to reduce system offline time, and is being asked
to run batches within a relatively tight window. This leaves less room for
error if things do go wrong.
In the short term, the main
response by banks will be to focus on processes and governance, to ensure that
disaster handling policies are understood across all support staff. This is
appropriate. However, this glitch should be a catalyst for banks to take a
longer look at their core system strategies. While legacy systems may be mature
and stable, at some point old age will get the better of them.