Apache Hadoop Essentials



This course provides a technical overview of Apache Hadoop. It includes high-level information about concepts, architecture, operation, and uses of the Hortonworks Data Platform (HDP) and the Hadoop ecosystem. The course provides an optional primer for those who plan to attend a hands-on, instructor-led course.


Duration: 1 day

Who is the course for

Data architects, data integration architects, managers, C-level executives, decision makers, technical infrastructure teams, and Hadoop administrators or developers who want to understand the fundamentals of big data and the Hadoop ecosystem.


No previous Hadoop or programming knowledge is required. Students will need browser access to the Internet.

Course Objectives

  • Describe the use case for Hadoop
  • Identify Hadoop Ecosystem architectural categories
    • Data Management
    • Data Access
    • Data Governance and Integration
    • Security
    • Operations
  • Detail the HDFS architecture
  • Describe data ingestion options and frameworks for batch and real-time streaming
  • Explain the fundamentals of parallel processing
  • See popular data transformation and processing engines in action
    • Apache Hive
    • Apache Pig
    • Apache Spark
  • Detail the architecture and features of YARN
  • Describe how to secure Hadoop

Course Outline

  • Operational overview with Ambari
  • Loading data into HDFS
  • Data manipulation with Hive
  • Risk analysis with Pig
  • Risk analysis with Spark and Zeppelin
  • Securing Hive with Ranger


Format: Lecture based

Additional Information

The course content can be customised to cover any specialised material you may require for your specific training needs.

This course can be offered as private on-site training hosted at your offices. For more information, please contact us at [email protected].


Related Training Courses

Apache Cassandra
This is a fast-paced, vendor-agnostic technical Apache Cassandra course that focuses on the key aspects of the technology for developers and system operations staff, covering core internal and distributed architecture fundamentals.

HDP Analyst: Apache HBase Essentials
This 2-day workshop introduces HBase basics, structure and operations in an intensely hands-on experience.

Big Data Concepts
This one-day class is an executive briefing on big data, designed for senior management and business leaders who want to learn about big data concepts and familiarise themselves with the business and technology trends and opportunities.

Machine Learning with Apache Hadoop
This course is designed to help attendees understand the high-level concepts and classifications of machine learning systems, with a strong focus on building recommender systems.
