This introductory course is designed to teach Hadoop administrators how to install, configure and maintain a MapR Hadoop cluster. You will learn how to test hardware for performance consistency, install the MapR Distribution including Apache Hadoop, create baseline benchmarks, and configure and maintain a Hadoop cluster. The course includes extensive hands-on labs with real-world system administrator scenarios, using Amazon Web Services (AWS) clusters.
Who is the course for
System administrators who are (or will be) in charge of installing, architecting and maintaining Hadoop – prior Hadoop knowledge is not required.
Attendees should have:
- Basic Hadoop knowledge
- A background in Linux system administration (able to navigate the Linux file system, use an editor at the command line, add users and groups, and execute common commands)
- A Linux system, PC or Mac with access to ssh and scp (using PuTTY, Cygwin, or similar tools)
What you will learn
Overview of big data
- The Big Data Challenge
- Hadoop Basics
- Introduction to MapReduce
Overview of the MapR Distribution including Apache Hadoop
Pre-install Cluster Considerations
- Testing CPU and RAM
- Testing Storage Throughput
- Testing Network Performance
- Service Layout and High Availability
- Cluster Planning
Installing the MapR Distribution
- Using Benchmarks
Preparing for Use
- Tuning the Cluster
- Allocating and Managing Resources
Configuring Cluster Storage Resources
- Planning Cluster Topologies
- Storage Components: Disks, Nodes, Storage Pools, Containers, and Volumes
- Working with Volumes
Data Ingestion, Access, and Availability
- Accessing the Cluster
- Creating Snapshots
- Creating Mirror Volumes
- Multiple Clusters and Disaster Recovery
- MapR Tables and Data Ingestion
- Setting up Email
- Configuring and Responding to Alarms
- Cluster Monitoring and Tools
Ongoing Cluster Maintenance
- Analysing Tools and Logs
- Managing Services
- Adding, Removing and Replacing Disks
- Adding, Removing and Replacing Nodes
This course prepares you for the MapR Certified Hadoop Developer (MCHD) certification exam.
Related Training Courses
MapR: Hive and Pig - This 2-day course covers how Hive emulates SQL in a Hadoop cluster, dataflow languages and how to create efficient data flows using Pig.
MapR: Developing Hadoop Applications - This 3-day course provides instruction on how to write Hadoop applications using MapReduce and YARN in Java.
MapR: HBase Applications and Design Build - This 3-day course introduces the concepts of NoSQL technologies, HBase architecture, schema design, performance tuning, bulk-loading of data and the storing of complex data structures.