This course covers how Hive emulates SQL in a Hadoop cluster. The course also covers dataflow languages and how to create efficient data flows through Hadoop using Pig.
Who is the course for
- Data analysts, architects, developers, and system administrators
Attendees should have:
- Experience accessing a Hadoop cluster and executing various commands and sample programs
- Experience connecting to a Hadoop cluster via SSH and a web browser
- Some basic familiarity with both Hadoop and traditional SQL database concepts is helpful but not required
What you will learn
- Learn about data types and storage formats in Hive
- Design Databases, Tables, Indexes, and Views
- Analyze and manipulate large data sets using Hive
- Includes a hands-on lab
- Learn the concept of a data flow language and how to create efficient data flows through Hadoop
- Explore what data is supported and the different forms it might take
- Learn how Pig can be used to transform data. Explore advanced topics such as debugging flows, flow optimization, and some enterprise-level features of Pig.
- 50% Lecture
- 50% Labs
Related Training Courses
MapR: Developing Hadoop Applications This 3-day course provides instruction on how to write Hadoop applications using MapReduce and YARN in Java.
MapR: Hadoop Operations: Cluster Administration This 3-day course is designed to teach Hadoop administrators how to install, configure and maintain a MapR Hadoop cluster.
MapR: HBase Applications and Design Build This 3-day course introduces the concepts of NoSQL technologies, HBase architecture, schema design, performance tuning, bulk-loading of data and the storing of complex data structures.