What is Hadoop: Hadoop is an open-source software framework for storing and processing big data in a distributed fashion on large clusters of commodity hardware. Essentially, it accomplishes two tasks: massive data storage and faster processing.

Hadoop Concepts

Distributed Reliable File System

  •  Apache Hadoop Distributed File System (HDFS)
  •  Inspired by Google File System
  •  Single Logical View of distributed Linux File Systems

Data is typically replicated 3 times

  •  Fault Tolerant
  •  Better I/O
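
To illustrate why keeping three copies of the data gives fault tolerance, here is a minimal Python sketch. It is a toy model, not HDFS code: the function names, the node names, and the round-robin placement are assumptions made for illustration (real HDFS chooses replica locations with a rack-aware policy).

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Round-robin is an illustrative assumption, not the HDFS placement policy.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(block_ids):
        # Consecutive nodes (wrapping around) hold the copies of this block.
        placement[block] = [nodes[(i + k) % len(nodes)] for k in range(replication)]
    return placement


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    return [
        block
        for block, holders in placement.items()
        if any(node not in failed for node in holders)
    ]


# Hypothetical four-node cluster holding three blocks.
nodes = ["node1", "node2", "node3", "node4"]
placement = place_replicas(["blk_0", "blk_1", "blk_2"], nodes)
print(placement)

# Even with two nodes down, every block still has a live replica.
print(readable_blocks(placement, failed={"node1", "node2"}))
```

Because each block lives on three different nodes, a block becomes unreadable only if all three of its holders fail at once. The extra copies also let a reader pick among several nodes, which is one way to read the "Better I/O" point above.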

Distributed Compute Framework & Resource Manager

  •  Apache MapReduce and YARN
  •  Inspired by Google MapReduce
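
The MapReduce programming model can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps using a word count; it does not use the Hadoop API, and the function names are hypothetical. In a real cluster, the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Run every line through the mapper, then group and reduce.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big clusters", "data on clusters"])
print(counts)
```

The mapper and reducer never share state; they communicate only through key/value pairs. That independence is what allows a framework to distribute them across a cluster.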

HDFS Blocks

Files are broken into chunks
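
As a rough illustration of how a file is broken into chunks, the Python sketch below computes block boundaries for a given file size. The 128 MiB block size is an assumption (it matches the default in recent Hadoop releases, but the value is configurable), and `split_into_blocks` is a hypothetical helper, not part of any Hadoop API.

```python
MIB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MIB):
    """Return (offset, length) pairs covering a file of `file_size` bytes.

    block_size defaults to 128 MiB, an assumed value for illustration.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


# A hypothetical 300 MiB file: two full blocks plus a shorter final block.
blocks = split_into_blocks(300 * MIB)
for offset, length in blocks:
    print(f"offset={offset // MIB} MiB, length={length // MIB} MiB")
```

In this sketch the final block holds only the remaining bytes; nothing pads it out to the full block size.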