What is Apache Spark?

Started in 2009 at the University of California, Berkeley’s AMPLab, Spark was open sourced in 2010 and entrusted to the Apache Software Foundation in 2013.

 

Over the past five years, Spark’s capabilities have grown to include streaming data processing; additional machine learning libraries; graph computation for connecting the dots between various events, people, or points in time; and SQL queries over large data sets.

 

Spark is often used—in conjunction with a data lake—as an engine for product recommendations, predictive analytics, sensor data analysis, graph analytics and more.

 

Think Big and Spark

Organizations are experimenting with Apache Spark; however, they are finding that Spark is not easy to use. That’s because the skill sets and experience needed to use Spark effectively are still relatively rare.

 

Think Big provides expertise, frameworks and time-saving templates to help customers build out Spark implementations. With a history of successful engagements, Think Big has gained valuable experience of what actually works with Spark and what does not.

 

Data Lake Services

Think Big was founded in the big data space. As one of the first big data services companies to adopt, use and implement data lakes with Spark, we have developed the skills and expertise to guide implementations to success. Learn more about Think Big’s data lake offerings.

 

Managed Services

Think Big provides managed services for Spark in the areas of platform and application support. We use well-defined processes, robust tools, and most importantly, experienced big data experts to cost-effectively manage, monitor, and maintain this open source platform. Learn more about our Managed Services.

 

Data Science on Spark

Data science is about gaining value from your big data investment. Think Big specializes in big and high-velocity data, and our experienced consultants provide predictive analytics for clickstream, Internet of Things, customer insights, content recommendations and more. We offer model lifecycle management, from data investigation through model promotion to monitoring. As companies explore the value of Spark, Think Big has developed a “Spark Readiness Assessment” service that helps align program goals with technology capabilities, provides a gap analysis of needed skills and processes, and recommends ways to meet your big data objectives with Spark. Learn more about our data science services.

 

Think Big Academy

Through our training branch, Think Big Academy, we offer Spark training for corporate clients. Led by experienced instructors, these classes train managers, developers, and administrators to use Spark and its various modules, including machine learning, graph, streaming and query:

 

• Introduction to Spark
• Using Spark SQL & DataFrames
• Using Spark for Data Science
• Using Spark for Machine Learning (MLlib)

 
