Top

NSD ARCHITECTURE DAY03

  1. 案例1:部署Hadoop
  2. 案例2:准备集群环境
  3. 案例3:配置Hadoop集群
  4. 案例4:初始化并验证集群
  5. 案例5:mapreduce模板案例
  6. 案例6:部署Yarn

1 案例1:部署Hadoop

1.1 问题

本案例要求安装单机模式Hadoop:

1.2 步骤

实现此案例需要按照如下步骤进行。

步骤一:环境准备

1)配置主机名为hadoop1,ip为192.168.1.50,配置yum源(系统源)

备注:由于在之前的案例中这些都已经做过,这里不再重复,不会的学员可以参考之前的案例

2)安装java环境

  1. [root@hadoop1 ~]# yum -y install java-1.8.0-openjdk-devel
  2. [root@hadoop1 ~]# java -version
  3. openjdk version "1.8.0_131"
  4. OpenJDK Runtime Environment (build 1.8.0_131-b12)
  5. OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
  6. [root@hadoop1 ~]# jps
  7. 1235 Jps

3)安装hadoop

  1. [root@hadoop1 ~]# cd hadoop/
  2. [root@hadoop1 hadoop]# ls
  3. hadoop-2.7.7.tar.gz kafka_2.12-2.1.0.tgz zookeeper-3.4.13.tar.gz
  4. [root@hadoop1 hadoop]# tar -xf hadoop-2.7.7.tar.gz
  5. [root@hadoop1 hadoop]# mv hadoop-2.7.7 /usr/local/hadoop
  6. [root@hadoop1 hadoop]# cd /usr/local/hadoop
  7. [root@hadoop1 hadoop]# ls
  8. bin include libexec NOTICE.txt sbin
  9. etc lib LICENSE.txt README.txt share
  10. [root@hadoop1 hadoop]# ./bin/hadoop //报错,JAVA_HOME没有找到
  11. Error: JAVA_HOME is not set and could not be found.
  12. [root@hadoop1 hadoop]#

4)解决报错问题

  1. [root@hadoop1 hadoop]# rpm -ql java-1.8.0-openjdk
  2.  
  3. [root@hadoop1 hadoop]# cd ./etc/hadoop/
  4. [root@hadoop1 hadoop]# vim hadoop-env.sh
  5. 25 export JAVA_HOME="/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-2.b14.el7.x86_64 /jre"
  6.  
  7. 33 export HADOOP_CONF_DIR="/usr/local/hadoop/etc/hadoop"
  8. [root@hadoop1 ~]# cd /usr/local/hadoop/
  9. [root@hadoop1 hadoop]# ./bin/hadoop
  10. Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  11. CLASSNAME run the class named CLASSNAME
  12. or
  13. where COMMAND is one of:
  14. fs run a generic filesystem user client
  15. version print the version
  16. jar <jar> run a jar file
  17. note: please use "yarn jar" to launch
  18. YARN applications, not this command.
  19. checknative [-a|-h] check native hadoop and compression libraries availability
  20. distcp <srcurl> <desturl> copy file or directories recursively
  21. archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  22. classpath prints the class path needed to get the
  23. credential interact with credential providers
  24. Hadoop jar and the required libraries
  25. daemonlog get/set the log level for each daemon
  26. trace view and modify Hadoop tracing settings
  27.  
  28. Most commands print help when invoked w/o parameters.

5)词频统计

  1. [root@hadoop1 hadoop]# mkdir /usr/local/hadoop/input
  2. [root@hadoop1 hadoop]# ls
  3. bin etc include lib libexec LICENSE.txt NOTICE.txt input README.txt sbin share
  4. [root@hadoop1 hadoop]# cp *.txt /usr/local/hadoop/input
  5. [root@hadoop1 hadoop]# ./bin/hadoop jar \
  6. share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount input output     //wordcount为参数 统计input这个文件夹,存到output这个文件里面(这个文件不能存在,要是存在会报错,是为了防止数据覆盖)
  7. [root@hadoop1 hadoop]# cat output/part-r-00000 //查看

2 案例2:准备集群环境

2.1 问题

本案例要求:

2.2 方案

准备四台虚拟机,由于之前已经准备过一台,所以只需再准备三台新的虚拟机即可,安装hadoop,使所有节点可以ping通,配置SSH信任关系,如图-1所示:

图-1

2.3 步骤

实现此案例需要按照如下步骤进行。

步骤一:环境准备

1)三台机器配置主机名为node-0001、node-0002、node-0003,配置ip地址(ip如图-1所示),yum源(系统源)

2)编辑/etc/hosts(四台主机同样操作,以hadoop1为例)

  1. [root@hadoop1 ~]# vim /etc/hosts
  2. 192.168.1.50 hadoop1
  3. 192.168.1.51 node-0001
  4. 192.168.1.52 node-0002
  5. 192.168.1.53 node-0003

3)安装java环境,在node-0001,node-0002,node-0003上面操作(以node-0001为例)

  1. [root@node-0001 ~]# yum -y install java-1.8.0-openjdk-devel

4)布置SSH信任关系

  1. [root@hadoop1 ~]# vim /etc/ssh/ssh_config //第一次登陆不需要输入yes
  2. Host *
  3. GSSAPIAuthentication yes
  4. StrictHostKeyChecking no
  5. [root@hadoop1 .ssh]# ssh-keygen
  6. Generating public/private rsa key pair.
  7. Enter file in which to save the key (/root/.ssh/id_rsa):
  8. Enter passphrase (empty for no passphrase):
  9. Enter same passphrase again:
  10. Your identification has been saved in /root/.ssh/id_rsa.
  11. Your public key has been saved in /root/.ssh/id_rsa.pub.
  12. The key fingerprint is:
  13. SHA256:Ucl8OCezw92aArY5+zPtOrJ9ol1ojRE3EAZ1mgndYQM root@hadoop1
  14. The key's randomart image is:
  15. +---[RSA 2048]----+
  16. | o*E*=. |
  17. | +XB+. |
  18. | ..=Oo. |
  19. | o.+o... |
  20. | .S+.. o |
  21. | + .=o |
  22. | o+oo |
  23. | o+=.o |
  24. | o==O. |
  25. +----[SHA256]-----+
  26. [root@hadoop1 .ssh]# for i in 61 62 63 64 ; do ssh-copy-id 192.168.1.$i; done
  27. //部署公钥给hadoop1,node-0001,node-0002,node-0003

5)测试信任关系

  1. [root@hadoop1 .ssh]# ssh node-0001
  2. Last login: Fri Sep 7 16:52:00 2018 from 192.168.1.60
  3. [root@node-0001 ~]# exit
  4. logout
  5. Connection to node-0001 closed.
  6. [root@hadoop1 .ssh]# ssh node-0002
  7. Last login: Fri Sep 7 16:52:05 2018 from 192.168.1.60
  8. [root@node-0002 ~]# exit
  9. logout
  10. Connection to node-0002 closed.
  11. [root@hadoop1 .ssh]# ssh node-0003

步骤二:配置hadoop

1)修改slaves文件

  1. [root@hadoop1 ~]# cd /usr/local/hadoop/etc/hadoop
  2. [root@hadoop1 hadoop]# vim slaves
  3. node-0001
  4. node-0002
  5. node-0003

2)hadoop的核心配置文件core-site

  1. [root@hadoop1 hadoop]# vim core-site.xml
  2. <configuration>
  3. <property>
  4. <name>fs.defaultFS</name>
  5. <value>hdfs://hadoop1:9000</value>
  6. </property>
  7. <property>
  8. <name>hadoop.tmp.dir</name>
  9. <value>/var/hadoop</value>
  10. </property>
  11. </configuration>
  12.  
  13. [root@hadoop1 hadoop]# mkdir /var/hadoop        //hadoop的数据根目录

3)配置hdfs-site文件

  1. [root@hadoop1 hadoop]# vim hdfs-site.xml
  2. <configuration>
  3. <property>
  4. <name>dfs.namenode.http-address</name>
  5. <value>hadoop1:50070</value>
  6. </property>
  7. <property>
  8. <name>dfs.namenode.secondary.http-address</name>
  9. <value>hadoop1:50090</value>
  10. </property>
  11. <property>
  12. <name>dfs.replication</name>
  13. <value>2</value>
  14. </property>
  15. </configuration>

3 案例3:配置Hadoop集群

3.1 问题

本案例要求完成hadoop的同步配置:

3.2 步骤

实现此案例需要按照如下步骤进行。

步骤一:同步

1)同步配置到node-0001,node-0002,node-0003

  1. [root@hadoop1 hadoop]# for i in 52 53 54 ; do rsync -aSH --delete /usr/local/hadoop/
  2. \ 192.168.1.$i:/usr/local/hadoop/ -e 'ssh' & done
  3. [1] 23260
  4. [2] 23261
  5. [3] 23262

2)查看是否同步成功

  1. [root@hadoop1 hadoop]# ssh node-0001 ls /usr/local/hadoop/
  2. bin
  3. etc
  4. include
  5. lib
  6. libexec
  7. LICENSE.txt
  8. NOTICE.txt
  9. output
  10. README.txt
  11. sbin
  12. share
  13. input
  14. [root@hadoop1 hadoop]# ssh node-0002 ls /usr/local/hadoop/
  15. bin
  16. etc
  17. include
  18. lib
  19. libexec
  20. LICENSE.txt
  21. NOTICE.txt
  22. output
  23. README.txt
  24. sbin
  25. share
  26. input
  27. [root@hadoop1 hadoop]# ssh node-0003 ls /usr/local/hadoop/
  28. bin
  29. etc
  30. include
  31. lib
  32. libexec
  33. LICENSE.txt
  34. NOTICE.txt
  35. output
  36. README.txt
  37. sbin
  38. share
  39. input

4 案例4:初始化并验证集群

4.1 问题

本案例要求初始化并验证集群:

4.2 步骤

实现此案例需要按照如下步骤进行。

步骤一:格式化

  1. [root@hadoop1 hadoop]# cd /usr/local/hadoop/
  2. [root@hadoop1 hadoop]# ./bin/hdfs namenode -format         //格式化 namenode
  3. [root@hadoop1 hadoop]# ./sbin/start-dfs.sh        //启动
  4. [root@hadoop1 hadoop]# jps        //验证角色
  5. 23408 NameNode
  6. 23700 Jps
  7. 23591 SecondaryNameNode
  8. [root@hadoop1 hadoop]# ./bin/hdfs dfsadmin -report        //查看集群是否组建成功
  9. Live datanodes (3):        //有三个角色成功

步骤二:web 页面验证

  1. firefox http://hadoop1:50070 (namenode)
  2. firefox http://hadoop1:50090 (secondarynamenode)
  3. firefox http://node-0001:50075 (datanode)

5 案例5:mapreduce模板案例

5.1 问题

本案例要求在 hadoop1 上拷贝 mapreduce 模板案例:

5.2 步骤

实现此案例需要按照如下步骤进行。

步骤一:部署mapred-site

1)配置mapred-site(hadoop1上面操作)

  1. [root@hadoop1 ~]# cd /usr/local/hadoop/etc/hadoop/
  2. [root@hadoop1 ~]# mv mapred-site.xml.template mapred-site.xml
  3. [root@hadoop1 ~]# vim mapred-site.xml
  4. <configuration>
  5. <property>
  6. <name>mapreduce.framework.name</name>
  7. <value>yarn</value>
  8. </property>
  9. </configuration>

6 案例6:部署Yarn

6.1 问题

本案例要求:

6.2 方案

在之前创建的 4 台虚拟机上部署 Yarn,如图-1所示:

图-2

6.3 步骤

实现此案例需要按照如下步骤进行。

步骤一:安装与部署hadoop

1)配置yarn-site(hadoop1上面操作)

  1. [root@hadoop1 hadoop]# vim yarn-site.xml
  2. <configuration>
  3.  
  4. <!-- Site specific YARN configuration properties -->
  5. <property>
  6. <name>yarn.resourcemanager.hostname</name>
  7. <value>hadoop1</value>
  8. </property>
  9. <property>
  10. <name>yarn.nodemanager.aux-services</name>
  11. <value>mapreduce_shuffle</value>
  12. </property>
  13. </configuration>

2)同步配置(hadoop1上面操作)

  1. [root@hadoop1 hadoop]# for i in {52..54}; do rsync -aSH --delete /usr/local/hadoop/ 192.168.1.$i:/usr/local/hadoop/ -e 'ssh' & done
  2. [1] 712
  3. [2] 713
  4. [3] 714

3)验证配置(hadoop1上面操作)

  1. [root@hadoop1 hadoop]# cd /usr/local/hadoop
  2. [root@hadoop1 hadoop]# ./sbin/start-dfs.sh
  3. Starting namenodes on [hadoop1]
  4. hadoop1: namenode running as process 23408. Stop it first.
  5. node-0001: datanode running as process 22409. Stop it first.
  6. node-0002: datanode running as process 22367. Stop it first.
  7. node-0003: datanode running as process 22356. Stop it first.
  8. Starting secondary namenodes [hadoop1]
  9. hadoop1: secondarynamenode running as process 23591. Stop it first.
  10. [root@hadoop1 hadoop]# ./sbin/start-yarn.sh
  11. starting yarn daemons
  12. starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-root-resourcemanager-hadoop1.out
  13. node-0002: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-node-0002.out
  14. node-0003: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-node-0003.out
  15. node-0001: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-node-0001.out
  16. [root@hadoop1 hadoop]# jps //hadoop1查看有ResourceManager
  17. 23408 NameNode
  18. 1043 ResourceManager
  19. 1302 Jps
  20. 23591 SecondaryNameNode
  21. [root@hadoop1 hadoop]# ssh node-0001 jps        //node-0001查看有NodeManager
  22. 25777 Jps
  23. 22409 DataNode
  24. 25673 NodeManager
  25. [root@hadoop1 hadoop]# ssh node-0002 jps        //node-0001查看有NodeManager
  26. 25729 Jps
  27. 25625 NodeManager
  28. 22367 DataNode
  29. [root@hadoop1 hadoop]# ssh node-0003 jps        //node-0001查看有NodeManager
  30. 22356 DataNode
  31. 25620 NodeManager
  32. 25724 Jps

4)web访问hadoop

  1. firefox http://hadoop1:8088 (resourcemanager)
  2. firefox http://node-0001:8042 (nodemanager)