DevOps For Data Science: Why Analytics Ops Is Key To Value

Originally featured on Forbes.

It may be a stretch to call data science commonplace, but the question “what’s next?” is often heard with regard to analytics. The conversation then tends to turn straight to artificial intelligence and deep learning. Instead, a tough-love review of the current reality may be in order.

The simple truth is that, as currently configured, data-centric companies will struggle to cross the divide between what is currently considered effective data science and a modality where analytics is an inherent part of the fundamental fabric of business operations that benefits from continuous improvement. Today, data science is all too often a process where new insights and models get developed as a one-time effort or deployed to production on an ad hoc basis, and require regular babysitting for monitoring and updating.

This is not to imply that companies are on the wrong path with their data science initiatives. It is merely an acknowledgement that the steps they have taken thus far have brought them to the edge of a chasm they will have to cross. To the credit of more progressive organizations, creating an industrial-caliber data lake to store large volumes of data in varying forms is an essential, foundational step. Building on that, developing systems of data democratization that give ready access to data for those seeking insight is critical. There’s no doubt that companies that have achieved those two steps already reap benefits.

Nevertheless, that’s as far as most have come and, more significant for the future, that’s all they have prepared themselves to accomplish. Today many companies have the data, and data scientists who are equipped to do analysis and build models that can be carefully engineered to plug into some usable business application. But every deployment of a model is a custom, fragile, one-off job, and ensuring the quality of models is an equally fragile, manual effort. Change the model and the whole thing needs to be rebuilt. Often useful analyses are performed once but can’t be reproduced or, even worse, get recreated periodically but inconsistently. And if a new version of a model doesn’t work, it can be a painful struggle to restore a previous version, let alone to test models systematically in order to continuously improve them.

It’s not enough to know how to wrestle with raw data. Companies need an infrastructure capable of continuously testing and improving models, starting with governed, understood analytic data sets as input. This is an environment in which normalized data lets one do any kind of data science at any time.
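As a rough illustration of what such an infrastructure might look like at its very smallest, here is a Python sketch. Everything in it is an assumption made for illustration: the `ModelRegistry` class, the `mean_abs_error` metric and the toy models are hypothetical names, not part of any particular product or library. The idea is simply that every candidate model is scored against the same governed evaluation set, is only deployed if it does not score worse than the live version, and can be rolled back to the previous version.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

# Assumption: a "model" is just a function from one feature value to a prediction.
Model = Callable[[float], float]
# Assumption: the governed evaluation set is a list of (input, expected) pairs.
Dataset = Sequence[Tuple[float, float]]


def mean_abs_error(model: Model, data: Dataset) -> float:
    """Average absolute error of the model on the evaluation set."""
    return sum(abs(model(x) - y) for x, y in data) / len(data)


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry that keeps every deployed version."""
    eval_data: Dataset
    versions: List[Model] = field(default_factory=list)

    def current(self) -> Optional[Model]:
        """Return the live model, or None if nothing has been deployed."""
        return self.versions[-1] if self.versions else None

    def deploy(self, candidate: Model) -> bool:
        """Deploy the candidate unless it scores worse than the live model."""
        live = self.current()
        if live is not None:
            candidate_error = mean_abs_error(candidate, self.eval_data)
            live_error = mean_abs_error(live, self.eval_data)
            if candidate_error > live_error:
                return False  # quality gate: keep the live version
        self.versions.append(candidate)
        return True

    def rollback(self) -> Optional[Model]:
        """Restore the previous version; a lone version is left in place."""
        if len(self.versions) > 1:
            self.versions.pop()
        return self.current()
```

In use, a team would call `deploy` for each new candidate and `rollback` when a live model misbehaves. A real system would persist versions, track metrics over time and monitor production data, but the gate-and-restore loop is the part that replaces the manual, one-off effort described above.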

It’s been done before. Something similar took place in applications development and IT with the notion of DevOps, where the disparate realms of software engineers and IT operations staff now collaborate on a single process of software creation and deployment.

That same type of dexterity will be essential for future data-driven opportunities like AI to become reality in business, and it is just as critical to realizing ROI in today’s data environment. A company’s data science team may excel at finding the right signal in the data and applying those findings to a process, but it is not equipped to maintain that data product once it is released into the wild. IT engineers, for their part, expect something that is more refined and ready to deploy. Between the two is a gap.

What’s missing is the internalization of a new business discipline, analytics ops, that turns analytics from a lab-coated, sequestered science experiment into a consistent methodology: one that integrates data science teams and engineering teams, and provides a framework for building analytics models into something that is readily and continually digestible at an operations level.

Analytics Ops is the difference between focusing on resource-intensive one-off victories and having a constant, adaptable source of nourishment. To get there, companies will require cross-functional teams with the right software and discipline to enable data scientists, engineers, product managers and domain experts to all work together to create a continuous cycle that drives value to the business.

This next step starts by balancing spending and organization development so that there is some level of investment in Analytics Ops to bridge the divide from data science to IT engineering. Without this forward-thinking approach, companies are going to end up with really interesting analytics projects that work for a time, but eventually wither, become less relevant and cannot evolve. Most frustrating of all, companies will not get the ultimate return in terms of implementation and deployment that they expect from their analytics investment.

The next step forward in analytics is not going to be driven by data scientists alone. It requires an investment in skills, practices and supporting technology to move analytics out of the lab and into the business. Analytics Ops will involve a conscious decision to continuously integrate, test, deploy, monitor and adapt analytics within an uninterrupted, ongoing cycle of improvement. Analytics, no matter how sophisticated, needs to be seen not as a project with an end, but as an integral part of the framework of the entire operation.
