
Hadoop High Availability, Fully Distributed with Docker

Building every image by hand got tedious, so I used a Dockerfile and docker-compose.
The layout is as follows:
Namenode: master01, master02
Datanode: slave01, slave02, slave03
Journalnode: master01, master02, slave01
I wrote the Dockerfile by lightly tweaking the Namenode image build steps from my earlier setup.
FROM centos:centos7
MAINTAINER Malachai <prussian1933@naver.com>

# Base packages, JDK 1.8, and directory layout
RUN yum update -y && \
    yum install net-tools -y && \
    yum install vim-enhanced -y && \
    yum install wget -y && \
    yum install openssh-server openssh-clients openssh-askpass -y && \
    yum install java-1.8.0-openjdk-devel.x86_64 -y && \
    mkdir /opt/jdk && \
    mkdir /opt/hadoop && \
    ln -s /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.x86_64 /opt/jdk/current

WORKDIR /opt/hadoop

# Download and unpack Hadoop 3.3.1
RUN wget https://mirrors.sonic.net/apache/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz && \
    tar xvzf hadoop-3.3.1.tar.gz && \
    ln -s /opt/hadoop/hadoop-3.3.1 /opt/hadoop/current

ENV JAVA_HOME=/opt/jdk/current
ENV PATH=$PATH:$JAVA_HOME/bin
ENV HADOOP_HOME=/opt/hadoop/current
ENV PATH=$PATH:$HADOOP_HOME/bin
ENV PATH=$PATH:$HADOOP_HOME/sbin
ENV HADOOP_PID_DIR=/opt/hadoop/current/pids

# Host keys for sshd
RUN ssh-keygen -f /etc/ssh/ssh_host_rsa_key -t rsa -N "" && \
    ssh-keygen -f /etc/ssh/ssh_host_ecdsa_key -t ecdsa -N "" && \
    ssh-keygen -f /etc/ssh/ssh_host_ed25519_key -t ed25519 -N ""

WORKDIR $HADOOP_HOME/etc/hadoop

# Append daemon users to hadoop-env.sh, and per-login key setup + sshd to .bashrc
RUN echo \
$'export HADOOP_PID_DIR=/opt/hadoop/current/pids \n\
export JAVA_HOME=/opt/jdk/current \n\
export HDFS_NAMENODE_USER=\"root\" \n\
export HDFS_DATANODE_USER=\"root\" \n\
export HDFS_SECONDARYNAMENODE_USER=\"root\" \n\
export YARN_RESOURCEMANAGER_USER=\"root\" \n\
export YARN_NODEMANAGER_USER=\"root\" ' >> hadoop-env.sh && \
    echo \
$'ssh-keygen -q -t rsa -N "" -f ~/.ssh/id_rsa <<<y \n\
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys \n\
/usr/sbin/sshd' >> $HOME/.bashrc

COPY core-site.xml .
COPY hdfs-site.xml .
COPY yarn-site.xml .
COPY mapred-site.xml .
COPY workers .

ENTRYPOINT ["/bin/bash"]
*Docker caches an image layer every time RUN is invoked, so apart from the time-consuming wget and install steps it is best to chain commands together with &&.
*The .bashrc script runs every time a bash console opens, which makes it a good place for commands that should run when the container starts. They are appended to the original file with the >> operator.
*An RSA key is generated so the nodes can use SSH without a password. A local key can supposedly be pushed into the container with a volume mount, but repeated connection failed errors made me generate it inside the image for now. If the image ever leaks, the key leaks with it, so this should be avoided; a host-side alternative is sketched after these notes.
*The *-site.xml and workers files copied into the container are kept around so the fully distributed setup can be reused.
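A minimal host-side sketch, assuming the keys directory that the compose file below bind-mounts and the usual sshd permission rules:

# On the Docker host: generate one key pair and share it with every container
# through the /root/.ssh bind mount used in docker-compose.yml
mkdir -p /home/malachai/hadoop-ecosystem/keys
ssh-keygen -q -t rsa -N "" -f /home/malachai/hadoop-ecosystem/keys/id_rsa
cp /home/malachai/hadoop-ecosystem/keys/id_rsa.pub \
   /home/malachai/hadoop-ecosystem/keys/authorized_keys
# sshd rejects keys with loose permissions
chmod 700 /home/malachai/hadoop-ecosystem/keys
chmod 600 /home/malachai/hadoop-ecosystem/keys/id_rsa \
          /home/malachai/hadoop-ecosystem/keys/authorized_keys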
I built the centos7/hadoop:namenode image with the Dockerfile above.
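Roughly like this, assuming it is run from the directory holding the Dockerfile and the copied config files:

docker build -t centos7/hadoop:namenode .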
The Dockerfile for the derived HA image is as follows.
FROM centos7/hadoop:namenode
MAINTAINER Malachai <prussian1933@naver.com>

# Download and unpack ZooKeeper 3.6.3
RUN mkdir /opt/zookeeper && \
    cd /opt/zookeeper && \
    wget https://mirror.navercorp.com/apache/zookeeper/zookeeper-3.6.3/apache-zookeeper-3.6.3-bin.tar.gz && \
    tar xvfz apache-zookeeper-3.6.3-bin.tar.gz && \
    ln -s /opt/zookeeper/apache-zookeeper-3.6.3-bin /opt/zookeeper/current && \
    mkdir current/data

ENV ZOOKEEPER_HOME=/opt/zookeeper/current
ENV PATH=$PATH:$ZOOKEEPER_HOME/bin

COPY zoo.cfg /opt/zookeeper/current/conf
COPY core-site.xml $HADOOP_HOME/etc/hadoop
COPY hdfs-site.xml $HADOOP_HOME/etc/hadoop
COPY yarn-site.xml $HADOOP_HOME/etc/hadoop
COPY mapred-site.xml $HADOOP_HOME/etc/hadoop

# Extra HA daemon users, plus ZooKeeper aliases and autostart on login
RUN echo \
$'export HDFS_JOURNALNODE_USER=\"root\" \n\
export HDFS_ZKFC_USER=\"root\" \n\
export YARN_PROXYSERVER_USER=\"root\" ' >> $HADOOP_HOME/etc/hadoop/hadoop-env.sh && \
    echo \
$'alias zoo-start="/opt/zookeeper/current/bin/zkServer.sh start" \n\
alias zoo-status="/opt/zookeeper/current/bin/zkServer.sh status" \n\
alias zoo-stop="/opt/zookeeper/current/bin/zkServer.sh stop" \n\
zoo-start' >> $HOME/.bashrc

ENTRYPOINT ["/bin/bash"]
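This one was built as centos7/hadoop/zookeeper:node, the tag the myid Dockerfile below starts from; again assuming the build runs next to the Dockerfile and its config files:

docker build -t centos7/hadoop/zookeeper:node .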
zoo.cfg configures ZooKeeper and the *-site.xml files configure Hadoop; they are as follows.
<!--core-site.xml-->
<configuration>
    <property>
        <!--File system name; hdfs-site.xml uses it to qualify the NameNode IDs-->
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-cluster</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>master01:2181,master02:2181,slave01:2181</value>
    </property>
</configuration>
<!--hdfs-site.xml-->
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/hadoop/current/data/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/hadoop/current/data/datanode</value>
    </property>
    <property>
        <!--Directory where the JournalNode stores its edit log-->
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/hadoop/current/data/journalnode</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>hadoop-cluster</value>
    </property>
    <property>
        <!--NameNode IDs within the nameservice-->
        <name>dfs.ha.namenodes.hadoop-cluster</name>
        <value>nn01,nn02</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.hadoop-cluster.nn01</name>
        <value>master01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.hadoop-cluster.nn02</name>
        <value>master02:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.hadoop-cluster.nn01</name>
        <value>master01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.hadoop-cluster.nn02</name>
        <value>master02:50070</value>
    </property>
    <property>
        <!--Where the JournalNodes share the edit log-->
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://master01:8485;master02:8485;slave01:8485/hadoop-cluster</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.hadoop-cluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <!--Fencing method that blocks the failed node; an sshfence option also exists-->
        <name>dfs.ha.fencing.methods</name>
        <value>shell(/bin/true)</value>
    </property>
    <!-- Automatic failover configuration -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>
<!--yarn-site.xml-->
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <!--Needed for MapReduce-->
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <!--Needed for MapReduce-->
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/opt/hadoopdata/yarn/nm-local-dir</value>
    </property>
    <property>
        <name>yarn.resourcemanager.fs.state-store.uri</name>
        <value>/opt/hadoopdata/yarn/system/rmstore</value>
    </property>
    <!-- for Resource Manager HA configuration -->
    <!-- Everything from here on is for HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm01,rm02</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm01</name>
        <value>master01</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm02</name>
        <value>master02</value>
    </property>
    <property>
        <name>yarn.web-proxy.address.rm01</name>
        <value>master01:8090</value>
    </property>
    <property>
        <name>yarn.web-proxy.address.rm02</name>
        <value>master02:8090</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm01</name>
        <value>master01:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm02</name>
        <value>master02:8088</value>
    </property>
    <property>
        <name>hadoop.zk.address</name>
        <value>master01:2181,master02:2181,slave01:2181</value>
    </property>
</configuration>
<!--mapred-site.xml-->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
</configuration>
#workers
slave01
slave02
slave03
#zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper/current/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
maxClientCnxns=60
maxSessionTimeout=180000
server.1=master01:2888:3888
server.2=master02:2888:3888
server.3=slave01:2888:3888
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
ZooKeeper tells its nodes apart with a file called myid, and the dataDir property points to the directory holding that myid file.
Each image needs a different id value, so I built the image above as centos7/hadoop/zookeeper:node and then wrote a Dockerfile for an image that receives the id as a build argument.
FROM centos7/hadoop/zookeeper:node
MAINTAINER Malachai <prussian1933@naver.com>

ARG MYID

# Write the per-node ZooKeeper id into dataDir
RUN echo $MYID > $ZOOKEEPER_HOME/data/myid

ENTRYPOINT ["/bin/bash"]
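Compose passes the argument during build below, but the manual equivalent would look like this (the myid01 tag is only for illustration):

docker build --build-arg MYID=1 -t centos7/hadoop/zookeeper:myid01 /home/malachai/hadoop-ecosystem/centos7-hadoop-ha/myid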
The compose file that brings up the HA environment is as follows.
version: "3.7"
services:
  master01:
    build:
      context: /home/malachai/hadoop-ecosystem/centos7-hadoop-ha/myid
      args:
        - MYID=1
    privileged: true
    container_name: master01
    hostname: master01
    volumes:
      - type: bind
        source: /home/malachai/hadoop-ecosystem/keys
        target: /root/.ssh
    networks:
      cluster-net:
        ipv4_address: 172.16.238.2
    ports:
      - "9870:9870"
      - "8088:8088"
      - "8042:8042"
      - "50070:50070"
      - "8090:8090"
    extra_hosts:
      - "master02:172.16.238.3"
      - "slave01:172.16.238.4"
      - "slave02:172.16.238.5"
      - "slave03:172.16.238.6"
    stdin_open: true
    tty: true
  master02:
    build:
      context: /home/malachai/hadoop-ecosystem/centos7-hadoop-ha/myid
      args:
        - MYID=2
    privileged: true
    container_name: master02
    hostname: master02
    volumes:
      - type: bind
        source: /home/malachai/hadoop-ecosystem/keys
        target: /root/.ssh
    networks:
      cluster-net:
        ipv4_address: 172.16.238.3
    ports:
      - "9871:9870"
      - "8089:8088"
      - "8043:8042"
      - "50071:50070"
      - "8091:8090"
    extra_hosts:
      - "master01:172.16.238.2"
      - "slave01:172.16.238.4"
      - "slave02:172.16.238.5"
      - "slave03:172.16.238.6"
    stdin_open: true
    tty: true
  slave01:
    build:
      context: /home/malachai/hadoop-ecosystem/centos7-hadoop-ha/myid
      args:
        - MYID=3
    privileged: true
    container_name: slave01
    hostname: slave01
    volumes:
      - type: bind
        source: /home/malachai/hadoop-ecosystem/keys
        target: /root/.ssh
    networks:
      cluster-net:
        ipv4_address: 172.16.238.4
    extra_hosts:
      - "master01:172.16.238.2"
      - "master02:172.16.238.3"
      - "slave02:172.16.238.5"
      - "slave03:172.16.238.6"
    stdin_open: true
    tty: true
  slave02:
    image: centos7/hadoop/zookeeper:node
    privileged: true
    container_name: slave02
    hostname: slave02
    volumes:
      - type: bind
        source: /home/malachai/hadoop-ecosystem/keys
        target: /root/.ssh
    networks:
      cluster-net:
        ipv4_address: 172.16.238.5
    extra_hosts:
      - "master01:172.16.238.2"
      - "master02:172.16.238.3"
      - "slave01:172.16.238.4"
      - "slave03:172.16.238.6"
    stdin_open: true
    tty: true
  slave03:
    image: centos7/hadoop/zookeeper:node
    privileged: true
    container_name: slave03
    hostname: slave03
    volumes:
      - type: bind
        source: /home/malachai/hadoop-ecosystem/keys
        target: /root/.ssh
    networks:
      cluster-net:
        ipv4_address: 172.16.238.6
    extra_hosts:
      - "master01:172.16.238.2"
      - "master02:172.16.238.3"
      - "slave01:172.16.238.4"
      - "slave02:172.16.238.5"
    stdin_open: true
    tty: true
networks:
  cluster-net:
    ipam:
      driver: default
      config:
        - subnet: "172.16.238.0/24"
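With that in place the whole cluster comes up in one command, run next to the compose file:

docker-compose up -d --build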
master01, master02, and slave01 are built with their MYID passed in. One image is built and cached per id value, so it is worth cleaning them up from time to time.
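Dangling images left over from rebuilds can be cleared with the standard prune command:

docker image prune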
The service ports in play, as set in the configs above or left at their defaults, are listed below.
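8020: NameNode RPC (dfs.namenode.rpc-address)
50070: NameNode HTTP UI as configured here (9870 is the Hadoop 3 default, also published in the compose file)
8485: JournalNode RPC (dfs.namenode.shared.edits.dir)
2181: ZooKeeper client port (clientPort)
2888, 3888: ZooKeeper quorum and leader-election ports (server.N entries)
8088: ResourceManager web UI (yarn.resourcemanager.webapp.address)
8090: YARN web proxy (yarn.web-proxy.address)
8042: NodeManager web UI default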
The system is started in the order below; the commands have to be run while moving between containers.
[master01 ~]$ $HADOOP_HOME/bin/hdfs zkfc -formatZK
[master01 ~]$ $HADOOP_HOME/bin/hdfs --daemon start journalnode
[master01 ~]$ start-all.sh    # most efficient way to initialize the SSH connections
[master01 ~]$ hdfs namenode -format
[master01 ~]$ $HADOOP_HOME/bin/hdfs --daemon start namenode
[master02 ~]$ hdfs namenode -bootstrapStandby
[master02 ~]$ $HADOOP_HOME/bin/hdfs --daemon start namenode
[master01 ~]$ $HADOOP_HOME/bin/hdfs --daemon start zkfc
[master02 ~]$ $HADOOP_HOME/bin/hdfs --daemon start zkfc
[slave01 ~]$ $HADOOP_HOME/bin/hdfs --daemon start datanode
[slave02 ~]$ $HADOOP_HOME/bin/hdfs --daemon start datanode
[slave03 ~]$ $HADOOP_HOME/bin/hdfs --daemon start datanode
[master01 ~]$ start-yarn.sh
[master01 ~]$ $HADOOP_HOME/bin/mapred --daemon start historyserver
[master02 ~]$ $HADOOP_HOME/bin/mapred --daemon start historyserver
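Once everything is up, one node of each pair should report active and the other standby; nn01/nn02 and rm01/rm02 are the IDs defined in the configs above:

[master01 ~]$ zoo-status    # alias from .bashrc; reports this node's quorum role
[master01 ~]$ $HADOOP_HOME/bin/hdfs haadmin -getServiceState nn01
[master01 ~]$ $HADOOP_HOME/bin/hdfs haadmin -getServiceState nn02
[master01 ~]$ $HADOOP_HOME/bin/yarn rmadmin -getServiceState rm01
[master01 ~]$ $HADOOP_HOME/bin/yarn rmadmin -getServiceState rm02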