Hadoop is an open source distributed computing platform from the Apache Software Foundation.
<> The core subsystems of Hadoop are described below:
* HDFS: A highly fault-tolerant distributed file system, suitable for deployment on a large number of cheap machines, that provides high-throughput data access.
* YARN (Yet Another Resource Negotiator): A resource manager that provides unified resource management and scheduling for upper-layer applications and is compatible with multiple computing frameworks.
* MapReduce: A distributed programming model that distributes the processing of large data sets (Map) to multiple nodes on the network, then collects and aggregates the processing results (Reduce).
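The Map and Reduce steps can be illustrated with a minimal single-process sketch in Python. This is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration, and a real cluster would run the map and reduce calls on different nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values collected for one key into a total."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk stands in for a split of the data set handled by one node.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical input: three chunks of a larger data set.
chunks = [
    "hadoop stores data in hdfs",
    "yarn schedules hadoop jobs",
    "hdfs replicates data",
]
counts = word_count(chunks)
print(counts)
```

Because each `map_phase` call depends only on its own chunk, the calls can run in parallel; only the grouping step needs to see output from every chunk.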
<> Use the official image
docker pull sequenceiq/hadoop-docker:2.7.0
After pulling the image, use the docker run command to run it and open a bash command line at the same time:
docker run -it sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash
bash-4.1# cd $HADOOP_PREFIX
bash-4.1# pwd
/usr/local/hadoop
bash-4.1# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+'
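To get a feel for what the grep example computes, the following Python sketch applies the same regular expression to a few lines locally and counts each distinct match. It only approximates the job's result as I understand it (match counts, most frequent first); the `grep_count` helper and the sample lines are hypothetical and are not taken from the container's `input` directory.

```python
import re
from collections import Counter

# Same pattern as the one passed to the example job.
PATTERN = re.compile(r"dfs[a-z.]+")


def grep_count(lines):
    """Count every distinct match of PATTERN, most frequent first."""
    counts = Counter()
    for line in lines:
        counts.update(PATTERN.findall(line))
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample resembling lines from Hadoop configuration files.
sample = [
    "<name>dfs.replication</name>",
    "<name>dfs.namenode.name.dir</name>",
    "# dfs.replication controls the number of block copies",
]
result = grep_count(sample)
for match, count in result:
    print(count, match)
```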