By David Lyon
It hasn't been a good week for Stephen Hester.
Not only has RBS had their credit rating downgraded by Moody's, but he has now presided over one of the biggest failures in both modern-day banking and IT.
For anyone unfamiliar with recent events, RBS have suffered a system fault that has resulted in hundreds of thousands of customers across RBS, NatWest and Ulster Bank being unable to access recent deposits in their accounts for an unprecedented 7 days.
The itchy glitch
In recent announcements, Hester has referred to a "glitch" in their system that has ultimately caused the issue(s), giving no indication of when things will return to normal, but assuring us that things will be restored as quickly as possible.
Not since Dick Jones' demonstration of ED-209 in Robocop has the word "glitch" been so inappropriately used.
In plainer terms, what RBS have experienced is a catastrophic systems failure, which has in turn caused a cascading systems failure: a single unrecoverable error occurs, an initial critical system fails, and that failure then has an equally show-stopping effect on every system that depends on it.
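The mechanics of such a cascade can be sketched in a few lines of Python. The system names and dependency graph below are purely illustrative, not RBS's actual architecture; the point is only how one root failure propagates to everything downstream.

```python
# Hypothetical dependency graph: each system lists what it depends on.
# These names are invented for illustration, not taken from RBS.
DEPENDENCIES = {
    "batch_scheduler": [],                       # the system that fails first
    "overnight_processing": ["batch_scheduler"],
    "account_balances": ["overnight_processing"],
    "online_banking": ["account_balances"],
    "atm_network": ["account_balances"],
}

def failed_systems(root_failure: str) -> set:
    """Return every system knocked out, directly or indirectly, by one failure."""
    down = {root_failure}
    changed = True
    while changed:  # keep propagating until no new system is affected
        changed = False
        for system, deps in DEPENDENCIES.items():
            if system not in down and any(d in down for d in deps):
                down.add(system)
                changed = True
    return down

print(sorted(failed_systems("batch_scheduler")))
# One failure at the bottom of the stack takes down all five systems.
```

A failure higher up the graph, by contrast, stays contained: knocking out only `online_banking` affects nothing else, which is why a single low-level fault doing this much damage points to the foundations.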
To put it plainly, an organisation as large as RBS is now performing disaster recovery. They will recover, since situations in IT very rarely reach the permanently unrecoverable stage... but for the recovery to take longer than a week on customer-impacting frontline services is an absolutely huge failure on their part.
Needless to say, it should not have happened.
The initial problem should have required so many failures in so many redundant backups and secondary systems that the probability of it happening becomes astronomical. The subsequent reversal of the change that caused the problem should have taken hours, or a day at maximum.
However... they managed it. They have created the biggest failure of a "modern" financial system in UK history, spanning 7 full days and affecting customers across the country without any clear plan or set expectation of when service would be restored. In the meantime... house purchases have fallen through, flights have been cancelled, business deals have evaporated, bills have gone unpaid. All that's missing is a plague of locusts or a Monty Python foot from the clouds.
For many people, life has been completely and in some cases irreparably disrupted.
Beyond the technical severity of the situation though, and even beyond the suffering of individual account-holders... there is a further, potentially worse outcome still looming: the possibility of a bank run.
Northern Rock was the last bank to have that dubious honour, with customers panicking and withdrawing so much money in such a short time that the bank was bled dry.
A bank run on RBS, which is 84% owned by UK Financial Investments Ltd, which in turn belongs to the UK Government, wouldn't end well. This may explain why our apparently free-market-driven media have all coincidentally decided to limit their coverage of the chaos, reporting only the vague details officially released by RBS themselves. All it will take at this point is a social panic to snowball this into a circus.
So... What happened?
Media reports are beginning to circulate that a large factor in the situation has been the relocation of much of the RBS IT department to off-shore centres.
The source of these reports is comments left on other articles by people claiming to be ex-staff of these departments, giving insight into what actually happened. This was posted on The Register late yesterday evening:
I'm [anonymous] for very obvious reasons having been one of the recent 1000+ to find their roles now being done from Chennai, however I have been speaking to a few ex-collegues who are still there and can confirm that they say the same as the above poster as in a CA7 upgrade was done, went horribly wrong, and was then backed out (which will have been done in typical RBS style - 12 hours of conference calls before letting the techie do what they suggested at the very start).
My understanding is that most if not all of the batch team were let go and replaced with people from India and I do remember them complaining that they were having to pass 10-20+ years worth of mainframe knowledge on to people who'd never heard of a mainframe outside of a museum.
The Indians were keen and willing to try and learn, but with out the years of previous experience will now be deep in the smelly stuff. The only good thing is being the batch and over night processing that failed, all the data will still be in the system awaiting processing so no one should find they money going missing as a result of this incident.
A "glitch" indeed...
CA7 refers to CA Workload Automation CA 7 Edition, an enterprise batch-processing solution.
The RBS system is a patchwork of different systems developed between 1960 and 2012. Although many of the services offered today are virtually instant, with processing done on demand, there are still legacy systems in place that require overnight batch processing to handle older non-instant transactions. Each evening, the system goes offline and processing of the day's batch workload begins throughout the night.
What appears to have happened here is that the software upgrade went horribly wrong and overnight batch processing has failed. This is why customer accounts have not been credited with payments and balances have been incorrectly displayed.
As a result, this original problem has now spawned a series of cascading issues, both technical and non-technical, that are arguably worse than the original issue. The data that should have been updated is there... it just hasn't been processed. It exists in a back log that should be dealt with swiftly once the batch processing facility is repaired.
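The ex-staffer's point that "no one should find their money going missing" follows from how a batch queue works: if the nightly run fails, the queued transactions simply wait. A minimal sketch, with an invented account and a toy `scheduler_ok` flag standing in for the health of the CA7 scheduler:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Transaction:
    account: str
    amount: int  # pence

# Transactions queue up during the day and are applied overnight.
backlog = deque()
balances = {"alice": 10_000}  # hypothetical customer, £100.00

def run_nightly_batch(scheduler_ok: bool) -> None:
    """Apply queued transactions, but only if the scheduler is healthy.
    If the batch fails, nothing is lost: the backlog simply waits."""
    if not scheduler_ok:
        return  # balances stay stale, but the data is still queued
    while backlog:
        tx = backlog.popleft()
        balances[tx.account] = balances.get(tx.account, 0) + tx.amount

backlog.append(Transaction("alice", 5_000))  # a £50 deposit made during the day
run_nightly_batch(scheduler_ok=False)        # the night of the failed upgrade
assert balances["alice"] == 10_000           # balance looks wrong to the customer
run_nightly_batch(scheduler_ok=True)         # once the scheduler is repaired
assert balances["alice"] == 15_000           # the backlog catches up; nothing lost
```

The customer-facing symptom, stale balances, is therefore a display problem while the backlog grows, not a loss of data, though a week-long backlog is its own processing headache.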
Since RBS laid off 1,600 long-term UK IT staff and replaced them with 800 well-intentioned yet horribly inexperienced staff off-shore... you can begin to see why this has taken days instead of hours to fix. Understanding the quirks, foibles and behaviours of a system that has grown over decades into such a mish-mash of extremely old and new components would bring any new IT technician to their knees in a hurry, no matter how keen.
However, even with these explanations, some aspects of the scenario still do not add up, and at this stage any answers would be speculative at best.
Regardless, the total financial cost to the bank of this debacle has been estimated at £10m, between lost business, extra staff and throwing money into the IT furnace to try and fix it.
The real cost
... will be far greater. Mainly paid in the public's trust (and money), or what was left of it.
The banks have been slowly and greedily twisting the public's wrist for a long time, behaving more like casinos than the responsible institutions they advertise themselves as. It has gone on for so long as to be almost an accepted part of the banks' culture.
Under our present system, if you make a single wrong move, you are penalised to the tune of £6.00 per day, with only the bank's own arbitrary rules cited as justification. There is no humanity, no empathy and no remorse. It is an amoral House that always wins. The money does not go to any good cause; the bank takes it because they say so.
When you stop and consider that the House is floating on public money while doing this... up is down and right is wrong.
It is legal, apparently... except that the Unfair Terms in Consumer Contracts Regulations 1999 expressly forbid a contract that is unbalanced to the detriment of the consumer. This has never been tested in court, mainly because nobody can afford to test it. The Office of Fair Trading focused on another point of law entirely during its highly-publicised defeat.
The specific text reads:
5.—(1) A contractual term which has not been individually negotiated shall be regarded as unfair if, contrary to the requirement of good faith, it causes a significant imbalance in the parties' rights and obligations arising under the contract, to the detriment of the consumer.
Since the bank does not spend £6.00 per day when a customer goes into the red, the bank profits from this arrangement each and every time it happens, no matter how minor the impact and with no justification cited. The customer, when the bank makes an error, is remunerated to the tune of what they lost and no more. Justification is required to receive anything extra, and the decision still lies with the bank.
For those who don't know the million-or-so quirks of how the system works... simply having and using a bank account can be a financial risk. The cost when you make a single error can be huge, unfair, and frequently devastating to a person's or family's wellbeing psychologically as well as financially in the long-term.
As with any good casino, there are no second chances.
RBS will want a second chance here, though... again. They will try to downplay the severity of what has happened, using words like "glitch" and "mistake" in place of words like "failure" and "catastrophe". When a member of the public downplays their own simple mistake, the one that cost them a week's wages, they are treated to a robotic reply indicating that the bank does not care; only the rules of the game matter.
If the bank followed their own rules in this instance, and ensured balance to the contract, they would pay £6.00 to every customer for every day they have been affected by the problems, plus compensation for any distress caused by having no access to their money for over a week.
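As a back-of-the-envelope exercise, applying the bank's own £6.00-per-day rule symmetrically is simple arithmetic. The customer count below is an assumption loosely based on "hundreds of thousands" affected; the real figure has not been published.

```python
# Illustrative only: applies the bank's own penalty rate to itself.
DAILY_PENALTY_GBP = 6.00       # the bank's per-day charge to customers
DAYS_AFFECTED = 7              # duration of the outage, per the article
CUSTOMERS_AFFECTED = 500_000   # assumption; "hundreds of thousands"

owed = DAILY_PENALTY_GBP * DAYS_AFFECTED * CUSTOMERS_AFFECTED
print(f"£{owed:,.0f}")  # £21,000,000 before any distress compensation
```

Even on these rough assumptions, the symmetric payout would dwarf the £10m cost estimate quoted above, which is perhaps the clearest reason not to expect it.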
Yeah... I'll hold my breath.
When it's all over, one thing is certain: Stephen Hester has proudly carried the torch of failure that Fred Goodwin passed onto him.
Like Sir Fred, if he was trusted to begin with (which is debatable), he certainly won't be again. Unlike Sir Fred, he did not produce any success before skipping straight to epic failure.
After this, the RBS brand, if it has any dignity left, will almost certainly be rendered as toxic as the debts that caused its downfall.