In the last few years, analytics has evolved significantly. We've moved from an era in which big data adoption was all about Hadoop and Spark to an increased focus on machine learning algorithms, deep learning and artificial intelligence (AI).
While the vision for data lakes has always centred on making data available more quickly, few companies have managed to meet that challenge while keeping the needs of business end users in focus...
Maurizio Colleluori looks at the five major reasons behind data lake failure, pinpointing what businesses need to do to get back on the path to success. The reality is that data lakes are failing to meet the time-to-market demands of new analytics-driven innovation, and it is safe to say that in many companies, data lakes are widely perceived to be expensive and ineffective. So why is this happening? In this article, we look at some of the common culprits turning data lakes into data swamps, and at the same time offer advice based on experience to help companies avoid data lake disasters.
Over the past several years, forward-thinking companies have been creating custom-engineered data lakes to store large volumes and varieties of enterprise data efficiently. To get the job done, many companies have tried to build complex, custom-engineered, Hadoop-based open source solutions in-house. However, while the software may be free, the engineering expertise this approach requires means that most companies are looking at multi-million dollar investments right from the start.
For many companies, working with data lakes has become a frustrating and unsuccessful experience: instead of building analytics and improving the quality of the data lake, engineering teams spend most of their time dealing with requests to ingest new data sources or wrangle data, leaving them little time to deliver value from big data analytics. Using Kylo, our open source data lake management platform, companies can generate valuable insight from their data lakes faster, bringing innovative products and services to market at unparalleled speed.
Think Big's enterprise-ready, open source data lake management platform, Kylo, is built on Apache Spark, NiFi and Hadoop to help organizations get the most value out of their data. Using Kylo, Think Big has helped customers integrate and simplify pipeline development and data management tasks, resulting in faster time to market and greater user adoption. From rescuing struggling data lake builds to delivering complex transformations in nine weeks rather than months or even years, Kylo enables organizations to significantly reduce costs across the board.
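To make the pipeline work described above more concrete, here is a minimal, hypothetical PySpark sketch of the kind of ingest-and-cleanse step that data lake teams hand-build repeatedly and that Kylo-style templates are meant to standardise: it reads a raw CSV drop from an assumed HDFS landing zone, applies basic cleansing and validation, and writes valid and rejected rows to separate curated and reject zones as Parquet. The paths, column names and rules are illustrative assumptions only and are not part of Kylo's actual API.

```python
# Hypothetical example: the sort of ingest/cleanse step that engineering
# teams rebuild for every new data source, and that template-driven
# platforms such as Kylo aim to standardise. All paths, columns and
# validation rules below are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("raw_to_curated_ingest").getOrCreate()

# 1. Ingest: read a raw CSV drop from the (assumed) HDFS landing zone.
raw = (spark.read
       .option("header", "true")
       .option("inferSchema", "true")
       .csv("hdfs:///data/landing/customers/2017-06-01/"))

# 2. Wrangle: trim and normalise fields, then flag rows with invalid emails.
cleansed = (raw
            .withColumn("email", F.lower(F.trim(F.col("email"))))
            .withColumn("signup_date", F.to_date(F.col("signup_date")))
            .withColumn("is_valid", F.col("email").rlike(r"^[^@]+@[^@]+$")))

valid = cleansed.filter(F.col("is_valid")).drop("is_valid")
rejected = cleansed.filter(~F.col("is_valid"))

# 3. Persist: valid rows go to the curated zone as Parquet; rejects are kept
#    separately so data stewards can inspect and reprocess them later.
valid.write.mode("overwrite").parquet("hdfs:///data/curated/customers/")
rejected.write.mode("overwrite").parquet("hdfs:///data/rejected/customers/")
```

Written by hand, steps like this multiply across every new feed; the point of a managed platform is to capture them once as reusable templates so data owners can onboard new sources without fresh engineering effort.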
Hadoop is difficult to get right, and most organizations will freely admit they don't have the in-house engineering skills to successfully implement big data solutions on the Hadoop stack. In fact, at the recent Gartner Data & Analytics Summit in Sydney, Gartner research director Nick Heudecker claimed that 70 per cent of Hadoop deployments in 2017 will fail to deliver either their estimated cost savings or their predicted revenue.
Originally featured on O’Reilly. Performing business analytics on the data lake using next-gen open source tools. “This world is increasingly being driven by quantitative analysis, not qualitative. More than ever, corporate roles involved in decision-making need to have access to data and be able to make sense of it.” – … Continued
Five reasons to join Think Big and see your career skyrocket. The growth in popularity of data science as a subject, coupled with the global boom in data creation and organisations looking to maximise information through big data strategies, means that more and more people are building a career in this space. However, given the … Continued
Last week, we caught up with Jon Gleich, Principal Data Engineer at Think Big, to drill him on everything from a typical day (apparently there’s no such thing), to a data engineer’s requisite skill set, and the value of the role in customer terms. Though based in Dublin, he spoke to us from Copenhagen, where he’s … Continued