博客 > 术业专攻> 云计算> kubernetes> Etcd集群-kubernetes-1.8.6集群搭建 2019年08月29日 11:25:44
这将会是一个系列文章在本站,如果你也想学习k8s-1.8版本的安装,那么可以根据这个教程进行搭建以及学习,同时也欢迎提出在搭建过程中遇到的任何问题,与博主一起探讨,毕竟我个人在搭建的时候也遇到过许许多多的坑,不过好在心态还不错,一点一点倒是也走出来了。一句话,世上无难事,只怕有心人。
汇总地址如右所示:k8s系列汇总。
在三个节点都安装etcd,下面的操作需要在三个节点都执行一遍。
# wget https://github.com/coreos/etcd/releases/download/v3.2.12/etcd-v3.2.12-linux-amd64.tar.gz # tar -xvf etcd-v3.2.12-linux-amd64.tar.gz # sudo mv etcd-v3.2.12-linux-amd64/etcd* /usr/local/bin
sudo mkdir -p /var/lib/etcd
这个地方为了避免跳坑里边,我直接三台主机的配置都贴出来了。
cat > etcd.service << EOF [Unit] Description=Etcd Server After=network.target After=network-online.target Wants=network-online.target Documentation=https://github.com/coreos [Service] Type=notify WorkingDirectory=/var/lib/etcd/ ExecStart=/usr/local/bin/etcd \\ --name master \\ //修改此处 --cert-file=/etc/kubernetes/ssl/kubernetes.pem \\ --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\ --peer-cert-file=/etc/kubernetes/ssl/kubernetes.pem \\ --peer-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\ --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\ --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\ --initial-advertise-peer-urls https://192.168.106.3:2380 \\ //此处 --listen-peer-urls https://192.168.106.3:2380 \\ //此处 --listen-client-urls https://192.168.106.3:2379,http://127.0.0.1:2379 \\ //此处 --advertise-client-urls https://192.168.106.3:2379 \\ //此处 --initial-cluster-token etcd-cluster-0 \\ --initial-cluster master=https://192.168.106.3:2380,node01=https://192.168.106.4:2380,node02=https://192.168.106.5:2380 \\ //此处 不要忽略name --initial-cluster-state new \\ --data-dir=/var/lib/etcd Restart=on-failure RestartSec=5 LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF
cat > etcd.service << EOF [Unit] Description=Etcd Server After=network.target After=network-online.target Wants=network-online.target Documentation=https://github.com/coreos [Service] Type=notify WorkingDirectory=/var/lib/etcd/ ExecStart=/usr/local/bin/etcd \\ --name node01 \\ --cert-file=/etc/kubernetes/ssl/kubernetes.pem \\ --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\ --peer-cert-file=/etc/kubernetes/ssl/kubernetes.pem \\ --peer-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\ --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\ --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\ --initial-advertise-peer-urls https://192.168.106.4:2380 \\ --listen-peer-urls https://192.168.106.4:2380 \\ --listen-client-urls https://192.168.106.4:2379,http://127.0.0.1:2379 \\ --advertise-client-urls https://192.168.106.4:2379 \\ --initial-cluster-token etcd-cluster-0 \\ --initial-cluster master=https://192.168.106.3:2380,node01=https://192.168.106.4:2380,node02=https://192.168.106.5:2380 \\ --initial-cluster-state new \\ --data-dir=/var/lib/etcd Restart=on-failure RestartSec=5 LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF
cat > etcd.service << EOF [Unit] Description=Etcd Server After=network.target After=network-online.target Wants=network-online.target Documentation=https://github.com/coreos [Service] Type=notify WorkingDirectory=/var/lib/etcd/ ExecStart=/usr/local/bin/etcd \\ --name node02 \\ --cert-file=/etc/kubernetes/ssl/kubernetes.pem \\ --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\ --peer-cert-file=/etc/kubernetes/ssl/kubernetes.pem \\ --peer-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\ --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\ --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\ --initial-advertise-peer-urls https://192.168.106.5:2380 \\ --listen-peer-urls https://192.168.106.5:2380 \\ --listen-client-urls https://192.168.106.5:2379,http://127.0.0.1:2379 \\ --advertise-client-urls https://192.168.106.5:2379 \\ --initial-cluster-token etcd-cluster-0 \\ --initial-cluster master=https://192.168.106.3:2380,node01=https://192.168.106.4:2380,node02=https://192.168.106.5:2380 \\ --initial-cluster-state new \\ --data-dir=/var/lib/etcd Restart=on-failure RestartSec=5 LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF
下面给出常用配置的参数和它们的解释,方便理解:
--name
:方便理解的节点名称,默认为 default,在集群中应该保持唯一,可以使用 hostname--data-dir
:服务运行数据保存的路径,默认为 ${name}.etcd--snapshot-count
:指定有多少事务(transaction)被提交时,触发截取快照保存到磁盘--heartbeat-interval
:leader 多久发送一次心跳到 followers。默认值是 100ms--eletion-timeout
:重新投票的超时时间,如果 follow 在该时间间隔没有收到心跳包,会触发重新投票,默认为 1000 ms--listen-peer-urls
:和同伴通信的地址,比如 http://ip:2380,如果有多个,使用逗号分隔。需要所有节点都能够访问,所以不要使用 localhost!--listen-client-urls
:对外提供服务的地址:比如 http://ip:2379,http://127.0.0.1:2379,客户端会连接到这里和 etcd 交互--advertise-client-urls
:对外公告的该节点客户端监听地址,这个值会告诉集群中其他节点--initial-advertise-peer-urls
:该节点同伴监听地址,这个值会告诉集群中其他节点--initial-cluster
:集群中所有节点的信息,格式为 node1=http://ip1:2380,node2=http://ip2:2380,…。注意:这里的 node1 是节点的 –name 指定的名字;后面的 ip1:2380 是 –initial-advertise-peer-urls 指定的值--initial-cluster-state
:新建集群的时候,这个值为 new;假如已经存在的集群,这个值为 existing--initial-cluster-token
:创建集群的 token,这个值每个集群保持唯一。这样的话,如果你要重新创建集群,即使配置和之前一样,也会再次生成新的集群和节点 uuid;否则会导致多个集群之间的冲突,造成未知的错误NOTE:所有的参数也可以通过环境变量进行设置,–my-flag 对应环境变量的 ETCD_MY_FLAG;但是命令行指定的参数会覆盖环境变量对应的值。
为了保证通信安全,需要指定 etcd 的公私钥(cert-file和key-file)、Peers 通信的公私钥和 CA 证书(peer-cert-file、peer-key-file、peer-trusted-ca-file)、客户端的CA证书(trusted-ca-file);
创建 kubernetes.pem 证书时使用的 kubernetes-csr.json 文件的 hosts 字段包含所有 etcd 节点的IP,否则证书校验会出错;
–initial-cluster-state 值为 new 时,–name 的参数值必须位于 –initial-cluster 列表中;
cp etcd.service /etc/systemd/system/ systemctl daemon-reload systemctl enable etcd systemctl start etcd systemctl status etcd
如果在执行开机自启的时候报错Bad message
,那么一般就是上边的配置信息有问题,这个时候重新检查配置信息,然后重新加载重新启动即可。
如上操作请确认一定要在三台机器上面都要执行(单节点ETCD除外)。
在任何一个etcd节点执行如下命令(如果不添加密钥参数是会报错的):
[root@master ~]$etcdctl cluster-health failed to check the health of member 4838d3e6217ff2a1 on https://192.168.106.4:2379: Get https://192.168.106.4:2379/health: x509: certificate signed by unknown authority member 4838d3e6217ff2a1 is unreachable: [https://192.168.106.4:2379] are all unreachable failed to check the health of member d4635870b1fea87f on https://192.168.106.5:2379: Get https://192.168.106.5:2379/health: x509: certificate signed by unknown authority member d4635870b1fea87f is unreachable: [https://192.168.106.5:2379] are all unreachable failed to check the health of member fa16f2892f13a4d6 on https://192.168.106.3:2379: Get https://192.168.106.3:2379/health: x509: certificate signed by unknown authority member fa16f2892f13a4d6 is unreachable: [https://192.168.106.3:2379] are all unreachable cluster is unhealthy
使用密钥方式检查集群状态:
[root@master member]$etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem --cert-file=/etc/kubernetes/ssl/kubernetes.pem --key-file=/etc/kubernetes/ssl/kubernetes-key.pem cluster-health member 4838d3e6217ff2a1 is healthy: got healthy result from https://192.168.106.4:2379 member fa16f2892f13a4d6 is healthy: got healthy result from https://192.168.106.3:2379 cluster is healthy
[root@node01 k8s]$etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem --cert-file=/etc/kubernetes/ssl/kubernetes.pem --key-file=/etc/kubernetes/ssl/kubernetes-key.pem cluster-health failed to check the health of member 4838d3e6217ff2a1 on https://192.168.106.4:2379: Get https://192.168.106.4:2379/health: x509: certificate has expired or is not yet valid member 4838d3e6217ff2a1 is unreachable: [https://192.168.106.4:2379] are all unreachable failed to check the health of member d4635870b1fea87f on https://192.168.106.5:2379: Get https://192.168.106.5:2379/health: x509: certificate has expired or is not yet valid member d4635870b1fea87f is unreachable: [https://192.168.106.5:2379] are all unreachable failed to check the health of member fa16f2892f13a4d6 on https://192.168.106.3:2379: Get https://192.168.106.3:2379/health: x509: certificate has expired or is not yet valid member fa16f2892f13a4d6 is unreachable: [https://192.168.106.3:2379] are all unreachable cluster is unhealthy
问题原因
:可能是主机时间不同步的原因。解决办法
:令其同步,分别在三台主机上执行如下命令。yum -y install ntp && ntpdate -u cn.pool.ntp.org
下载安装Flannel:
# wget https://github.com/coreos/flannel/releases/download/v0.9.1/flannel-v0.9.1-linux-amd64.tar.gz # mkdir flannel # tar -xzvf flannel-v0.9.1-linux-amd64.tar.gz -C flannel # sudo cp flannel/{flanneld,mk-docker-opts.sh} /usr/local/bin
向 etcd 写入网段信息,这两个命令只需要任意一个节点上执行一次就可以。
mkdir -p /kubernetes/network/config etcdctl --endpoints=https://192.168.106.3:2379,https://192.168.106.4:2379,https://192.168.106.5:2379 \ --ca-file=/etc/kubernetes/ssl/ca.pem \ --cert-file=/etc/kubernetes/ssl/kubernetes.pem \ --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \ set /kubernetes/network/config '{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}'
创建systemd unit 文件 在三台机器上面都需要执行:
cat > flanneld.service << EOF [Unit] Description=Flanneld overlay address etcd agent After=network.target After=network-online.target Wants=network-online.target After=etcd.service Before=docker.service [Service] Type=notify ExecStart=/usr/local/bin/flanneld \\ -etcd-cafile=/etc/kubernetes/ssl/ca.pem \\ -etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem \\ -etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem \\ -etcd-endpoints=https://192.168.106.3:2379,https://192.168.106.4:2379,https://192.168.106.5:2379 \\ -etcd-prefix=/kubernetes/network ExecStartPost=/usr/local/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker Restart=on-failure [Install] WantedBy=multi-user.target RequiredBy=docker.service EOF
flanneld 使用系统缺省路由所在的接口和其它节点通信,对于有多个网络接口的机器(如,内网和公网),可以用 -iface=enpxx 选项值指定通信接口。
mv flanneld.service /etc/systemd/system/ systemctl daemon-reload systemctl enable flanneld systemctl start flanneld systemctl status flanneld
[root@master ~]# /usr/local/bin/etcdctl --endpoints=https://192.168.106.3:2379,https://192.168.106.4:2379,https://192.168.106.5:2379 --ca-file=/etc/kubernetes/ssl/ca.pem --cert-file=/etc/kubernetes/ssl/kubernetes.pem --key-file=/etc/kubernetes/ssl/kubernetes-key.pem ls /kubernetes/network/subnets /kubernetes/network/subnets/172.30.31.0-24 /kubernetes/network/subnets/172.30.63.0-24 /kubernetes/network/subnets/172.30.76.0-24
由此可以看出,如上三个节点pod的网段!
部署Flannel网络,kubernetes要求集群内各节点能通过Pod网段互联互通:
$ ping 172.30.31.0 $ ping 172.30.63.0 $ ping 172.30.76.0
© 2018 www.qingketang.net 鄂ICP备18027844号-1
武汉快勤科技有限公司 13554402156 武汉市东湖新技术开发区关山二路特一号国际企业中心6幢4层7号
扫码关注,全站教程免费播放
订单金额:
支付金额:
支付方式: