Big data in banking – Hadoop or not Hadoop?

Originally featured in CIO.

Five years ago, the big question for banks was simple: To Hadoop, or not? Fast forward to 2016 and the question has changed completely. All banks these days are to some extent invested in Hadoop, but let’s be clear: no bank has truly crossed the chasm with Hadoop and adopted it as a core technology. Banking has a long history of computing, with vast architectures and legacy systems to consider, and Tier 1, 2 and 3 banks alike still opt for proprietary solutions as a mainstay.

However, Hadoop is getting closer. Despite its complexities, several forward-thinking, leading banks are applying Hadoop to meet the many regulatory and volume-based challenges facing financial services today. For niche product processing, Hadoop use is on the rise, and when it comes to big data processing and analytics, nothing beats Hadoop in terms of price, capacity and speed.

While Hadoop’s current application may be siloed and restricted to specific ‘innovation’ or boutique departments, banks need not despair: Hadoop does not need to be a bank’s core technology in order to be useful, especially when it comes to big data.

A Complement to Compliance
Banks are using Hadoop as one of the only ways to pull together the vast amounts of data required to prove compliance. Whether it’s Dodd-Frank, Basel III or MiFID, compliance is expensive, and banks are finding Hadoop’s 80-90% cheaper starting price attractive in comparison to the proprietary solutions on offer. Beyond this, Hadoop can store data in ways that massive data warehouses just can’t, and its lightning-fast analytics capabilities are proving to be a match for the mammoth task compliance has become.

Uniting the Fragmented Regionals
The banks aspiring to use Hadoop more widely are those that have grown entirely by acquisition, buying up smaller regional banks. While these banks are still mainly invested in traditional data warehousing, they are making the right noises when it comes to investing in one consistent set of technologies to link up what may still be rather disparate regions. For this type of bank, the need to adopt a single core technology is likely greater because of the number of different, and possibly incompatible, core systems in place.

For these banks, there is less to lose in making the move to Hadoop. Hadoop will do about 80% of what they need at about 2% of the cost of off-the-shelf solutions, and the remaining 20% can be built to custom specifications.

Worldwide Real-Life Stories
In times of banking crisis, Hadoop is really proving itself. When the Panama Papers leak hit the news back in April, banks had a lot to prove from a compliance point of view. Requests for offshore companies made on behalf of clients quickly became common knowledge, and as regulators prepared to come down hard, banks had their work cut out for them in proving business legitimacy.

Big data has always been part of the war against money laundering and other fraudulent activity, but this time banks had to move fast in order to prove legitimacy or risk being heavily fined by regulators.

Using Hadoop, banks were able to streamline the process of ingesting the more than 11.5 million files included in the Panama Papers, and with Hadoop-driven data analysis tools they were quickly able to identify compliance gaps. In one instance, Hadoop helped cut the customer ID matching processing time in half, reducing the time it took to deliver data to the business from six months to hours, and at half the cost of a proprietary solution.
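To make the customer ID matching step concrete, here is a minimal single-machine sketch in Python. It is not taken from any bank's implementation and does not use Hadoop itself; the function names, the sample records and the exact-match-after-normalisation rule are all assumptions made for illustration. On a cluster, the same normalise-then-join idea would typically be expressed as a distributed join, and real matching would likely need fuzzier rules.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Lower-case a name, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", without_accents.lower())
    return " ".join(cleaned.split())


def match_customers(customers, leaked_names):
    """Return {leaked name: sorted customer IDs} for names that match.

    `customers` is an iterable of (customer_id, name) pairs and
    `leaked_names` is an iterable of names from the leaked files.
    Both shapes are hypothetical, chosen only for this sketch.
    """
    # Build an index from normalised name to the customer IDs that carry it.
    index = defaultdict(list)
    for customer_id, name in customers:
        index[normalize(name)].append(customer_id)

    # Look up each leaked name in the index; keep only the hits.
    matches = {}
    for leaked in leaked_names:
        ids = index.get(normalize(leaked))
        if ids:
            matches[leaked] = sorted(ids)
    return matches


# Hypothetical sample data, for illustration only.
customers = [
    ("C1", "José Álvarez"),
    ("C2", "Acme Holdings Ltd."),
    ("C3", "Jane Doe"),
]
leaked_names = ["JOSE ALVAREZ", "acme holdings ltd", "Unknown Person"]

result = match_customers(customers, leaked_names)
print(result)
```

Building the index once and probing it per leaked name keeps the work roughly linear in the number of records, which is the property that lets this kind of job be partitioned across many machines.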

A Last Thought
The examples above are encouraging because they show a true evolution in the financial services world as it seeks new ways to keep up with a unique set of challenges. As banks’ trading remits expand to conquer new regions, global transaction volumes are set to double in the next two years, and all this comes with a heavy dose of unpredictability in terms of the new types of fraudulent activity that will emerge. Simply put, banks need to consider faster and cheaper ways to get to raw data, and better and faster ways to analyse it. The power of Hadoop data lakes, for example, would enable users from an unlimited number of banking departments to easily access and analyse data and, quite simply, do banking in a better way.

If banks can begin to increase Hadoop use and harness its power, they will benefit from its cost efficiencies and be able not only to focus on new opportunities but also to deliver new products and services to customers faster than ever before. The early adopters will become winners, and the laggards face a bleak future. The question is no longer Hadoop or not. It is now this: will banks be willing to become more invested before their competition does, and enjoy the benefits of what is still low-hanging fruit?

