Hadoop is an open source distributed computing platform from the Apache Software Foundation.

The core subsystems of Hadoop are described below:

* HDFS: a highly fault-tolerant distributed file system, suitable for deployment on a large number of cheap machines, that provides high-throughput data access.
* YARN (Yet Another Resource Negotiator): a resource manager that provides unified resource management and scheduling for upper-layer applications and is compatible with multiple computing frameworks.
* MapReduce: a distributed programming model that distributes the processing of large data sets (Map) across multiple nodes on the network, then collects the processing results and reduces them (Reduce).
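
The Map and Reduce phases can be imitated on a single machine with an ordinary shell pipeline. The sketch below is only an analogy, not Hadoop code: the function names `map_phase`, `reduce_phase` and `word_count` are hypothetical, `sort` stands in for the shuffle step that groups identical keys, and everything runs in one process rather than across nodes.

```shell
#!/usr/bin/env bash
# Map: emit one "word<TAB>1" pair per word read from standard input.
map_phase() {
  tr -s '[:space:]' '\n' | awk 'NF { print $0 "\t" 1 }'
}

# Reduce: sum the counts of each key; input must arrive grouped by key.
reduce_phase() {
  awk -F '\t' '
    $1 != key { if (NR > 1) print key "\t" sum; key = $1; sum = 0 }
    { sum += $2 }
    END { if (NR > 0) print key "\t" sum }
  '
}

# sort plays the role of the shuffle step between the two phases.
word_count() {
  map_phase | LC_ALL=C sort | reduce_phase
}

# Made-up sample input, used only for illustration.
printf 'hdfs yarn hdfs\nmapreduce yarn hdfs\n' | word_count
```

In a real cluster the map and reduce functions would run on many nodes at once and the framework would move the intermediate pairs between them; here the pipe does that job.
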
Using the official image

Pull the image
docker pull sequenceiq/hadoop-docker:2.7.0
After pulling the image, use the docker run command to start it and open a bash command line at the same time:
docker run -it sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash
Example run
bash-4.1# cd $HADOOP_PREFIX
bash-4.1# pwd
/usr/local/hadoop
bash-4.1# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+'
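
To get a feel for what the grep example computes, the same regular expression can be tried locally. The sketch below is an approximation under stated assumptions: `count_regex_matches` is a hypothetical helper, the sample lines are made up, and the count-then-match output layout (most frequent first) only mimics what the job is expected to write. It does not touch Hadoop or HDFS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count every substring of stdin that matches the
# extended regex in $1, most frequent first. Runs locally, without Hadoop.
count_regex_matches() {
  grep -oE "$1" \
    | LC_ALL=C sort \
    | uniq -c \
    | LC_ALL=C sort -k1,1nr -k2 \
    | awk '{ print $1 "\t" $2 }'
}

# Made-up sample lines standing in for the files in the input directory.
sample='dfs.replication=1
dfs.replication=3
nodfs here
dfs.namenode.name.dir=/tmp/nn'

printf '%s\n' "$sample" | count_regex_matches 'dfs[a-z.]+'
```

The pattern requires at least one lowercase letter or dot after `dfs`, so the bare `dfs` in the third sample line is not counted.
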
