1. Introduction
1.1 Preface
Kubernetes defines a network model but hands its implementation over to network plugins. The most important job of a CNI network plugin is to let Pod resources communicate across hosts.
Common CNI network plugins include:
Flannel
Calico
Canal
Contiv
OpenContrail
NSX-T
Kube-router
1.2 Flannel's three network models
host-gw model: all node IPs must sit behind the same physical gateway device. It works by adding static routes to each host.
VXLAN model: used when the hosts are not on the same subnet (do not share a gateway). It adds a virtual network device on each host and carries Pod traffic between hosts through a virtual tunnel.
Directrouting model: when the nodes are not behind the same physical gateway, traffic goes over VXLAN; when they are behind the same gateway, it behaves like host-gw. A sketch of the corresponding etcd configurations follows this list.
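The model is selected through the Backend section of the network config stored in etcd. The host-gw variant is what step 9 below actually applies; the other two lines are shown only for comparison, using the backend names and the DirectRouting option documented for flannel's VXLAN backend:

# host-gw: plain static routes on every node; nodes must share a gateway
'{"Network": "172.7.0.0/16", "Backend": {"Type": "host-gw"}}'
# vxlan: overlay tunnel between nodes; works across different gateways
'{"Network": "172.7.0.0/16", "Backend": {"Type": "vxlan"}}'
# direct routing: vxlan backend with DirectRouting enabled; uses host-gw style routes when
# two nodes share a gateway and falls back to vxlan otherwise
'{"Network": "172.7.0.0/16", "Backend": {"Type": "vxlan", "DirectRouting": true}}'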
2. Cluster plan
The deployment below uses 10.4.7.21 as the example; deploying 10.4.7.22 is similar.
Hostname    Role      IP
hdss7-21    Flannel   10.4.7.21
hdss7-22    Flannel   10.4.7.22
3. Download the software, extract it, and create a symlink
Download: https://github.com/flannel-io/flannel/
[root@hdss7-21 ~]# cd /opt/src/
[root@hdss7-21 src]# wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz
[root@hdss7-21 src]# mkdir /opt/flannel-v0.11.0
[root@hdss7-21 src]# tar -zxvf flannel-v0.11.0-linux-amd64.tar.gz -C /opt/flannel-v0.11.0
[root@hdss7-21 src]# ln -s /opt/flannel-v0.11.0 /opt/flannel
4. Final directory layout
[root@hdss7-21 src]# cd /opt/flannel
[root@hdss7-21 flannel]# ls
flanneld  mk-docker-opts.sh  README.md
5. Copy the client certificates
Flannel acts as a client of etcd.
[root@hdss7-21 flannel]# mkdir cert
[root@hdss7-21 flannel]# cd cert/
[root@hdss7-21 cert]# scp hdss7-200:/opt/certs/ca.pem .
[root@hdss7-21 cert]# scp hdss7-200:/opt/certs/client.pem .
[root@hdss7-21 cert]# scp hdss7-200:/opt/certs/client-key.pem .
[root@hdss7-21 cert]# ll
total 12
-rw-r--r-- 1 root root 1346 Jun 13 21:55 ca.pem
-rw------- 1 root root 1679 Jun 13 21:56 client-key.pem
-rw-r--r-- 1 root root 1363 Jun 13 21:55 client.pem
6. Create the subnet configuration
Note: this configuration differs on each flannel host, so adjust it when deploying the other nodes. On 10.4.7.22, FLANNEL_SUBNET=172.7.22.1/24.
[root@hdss7-21 cert]# cd ..
[root@hdss7-21 flannel]# vim subnet.env
FLANNEL_NETWORK=172.7.0.0/16
FLANNEL_SUBNET=172.7.21.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false
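For comparison, the file on 10.4.7.22 differs only in FLANNEL_SUBNET, per the note above:

FLANNEL_NETWORK=172.7.0.0/16
FLANNEL_SUBNET=172.7.22.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false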
7. Create the startup script
Note: adjust the IP (--public-ip=) and the network interface (--iface=) for each host.
[root@hdss7-21 flannel]# vim flanneld.sh
#!/bin/sh
./flanneld \
  --public-ip=10.4.7.21 \
  --etcd-endpoints=https://10.4.7.12:2379,https://10.4.7.21:2379,https://10.4.7.22:2379 \
  --etcd-keyfile=./cert/client-key.pem \
  --etcd-certfile=./cert/client.pem \
  --etcd-cafile=./cert/ca.pem \
  --iface=ens33 \
  --subnet-file=./subnet.env \
  --healthz-port=2401
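A sketch of the corresponding script on 10.4.7.22, assuming its interface is also named ens33 (check with ip addr and adjust if not):

#!/bin/sh
./flanneld \
  --public-ip=10.4.7.22 \
  --etcd-endpoints=https://10.4.7.12:2379,https://10.4.7.21:2379,https://10.4.7.22:2379 \
  --etcd-keyfile=./cert/client-key.pem \
  --etcd-certfile=./cert/client.pem \
  --etcd-cafile=./cert/ca.pem \
  --iface=ens33 \
  --subnet-file=./subnet.env \
  --healthz-port=2401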
8. Check the configuration and permissions, and create the log directory
[root@hdss7-21 flannel]# chmod +x flanneld.sh
[root@hdss7-21 flannel]# mkdir -p /data/logs/flanneld
9. Configure etcd: add the host-gw backend
This can be done on any one of the etcd cluster members (10.4.7.12, 10.4.7.21 or 10.4.7.22).
[root@hdss7-21 flannel]# cd /opt/etcd
# Mind the network address
[root@hdss7-21 etcd]# ./etcdctl set /coreos.com/network/config '{"Network": "172.7.0.0/16", "Backend": {"Type": "host-gw"}}'
{"Network": "172.7.0.0/16", "Backend": {"Type": "host-gw"}}
# Verify
[root@hdss7-21 etcd]# ./etcdctl get /coreos.com/network/config
{"Network": "172.7.0.0/16", "Backend": {"Type": "host-gw"}}
# Check the etcd cluster members
[root@hdss7-21 etcd]# ./etcdctl member list
988139385f78284: name=etcd-server-7-22 peerURLs=https://10.4.7.22:2380 clientURLs=http://127.0.0.1:2379,https://10.4.7.12:2379 isLeader=false
5a0ef2a004fc4349: name=etcd-server-7-21 peerURLs=https://10.4.7.21:2380 clientURLs=http://127.0.0.1:2379,https://10.4.7.12:2379 isLeader=false
f4a0cb0a765574a8: name=etcd-server-7-12 peerURLs=https://10.4.7.12:2380 clientURLs=http://127.0.0.1:2379,https://10.4.7.12:2379 isLeader=true
10. Create the supervisor configuration
[root@hdss7-21 etcd]# vim /etc/supervisord.d/flannel.ini
[program:flanneld-7-21]
command=/opt/flannel/flanneld.sh                       ; the program (relative uses PATH, can take args)
numprocs=1                                             ; number of processes copies to start (def 1)
directory=/opt/flannel                                 ; directory to cwd to before exec (def no cwd)
autostart=true                                         ; start at supervisord start (default: true)
autorestart=true                                       ; restart at unexpected quit (default: true)
startsecs=30                                           ; number of secs prog must stay running (def. 1)
startretries=3                                         ; max # of serial start failures (default 3)
exitcodes=0,2                                          ; 'expected' exit codes for process (default 0,2)
stopsignal=QUIT                                        ; signal used to kill process (default TERM)
stopwaitsecs=10                                        ; max num secs to wait b4 SIGKILL (default 10)
user=root                                              ; setuid to this UNIX account to run the program
redirect_stderr=true                                   ; redirect proc stderr to stdout (default false)
stdout_logfile=/data/logs/flanneld/flanneld.stdout.log ; stdout log path, NONE for none; default AUTO
stdout_logfile_maxbytes=64MB                           ; max # logfile bytes b4 rotation (default 50MB)
stdout_logfile_backups=4                               ; # of stdout logfile backups (default 10)
stdout_capture_maxbytes=1MB                            ; number of bytes in 'capturemode' (default 0)
stdout_events_enabled=false                            ; emit events on stdout writes (default false)
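On 10.4.7.22 the same file is used with only the program name changed (following the naming convention above; the name is just a label for supervisor):

[program:flanneld-7-22]
command=/opt/flannel/flanneld.sh
; remaining settings identical to the 7-21 configuration above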
11. Start the service and check it
[root@hdss7-21 etcd]# supervisorctl update
flanneld-7-21: added process group
[root@hdss7-21 etcd]# tail -100f /data/logs/flanneld/flanneld.stdout.log
I0613 22:58:53.387404 33642 main.go:527] Using interface with name ens33 and address 10.4.7.21
I0613 22:58:53.387528 33642 main.go:540] Using 10.4.7.21 as external address
2021-06-13 22:58:53.388681 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
I0613 22:58:53.388761 33642 main.go:244] Created subnet manager: Etcd Local Manager with Previous Subnet: 172.7.21.0/24
I0613 22:58:53.388772 33642 main.go:247] Installing signal handlers
I0613 22:58:53.396039 33642 main.go:587] Start healthz server on 0.0.0.0:2401
I0613 22:58:53.418509 33642 main.go:386] Found network config - Backend type: host-gw
I0613 22:58:53.432160 33642 local_manager.go:201] Found previously leased subnet (172.7.21.0/24), reusing
I0613 22:58:53.440005 33642 local_manager.go:220] Allocated lease (172.7.21.0/24) to current node (10.4.7.21)
I0613 22:58:53.440627 33642 main.go:317] Wrote subnet file to ./subnet.env
I0613 22:58:53.440644 33642 main.go:321] Running backend.
I0613 22:58:53.441000 33642 route_network.go:53] Watching for new subnet leases
I0613 22:58:53.452691 33642 main.go:429] Waiting for 22h59m59.980128025s to renew lease
I0613 22:58:53.452920 33642 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0613 22:58:53.452975 33642 iptables.go:167] Deleting iptables rule: -s 172.7.0.0/16 -j ACCEPT
I0613 22:58:53.464074 33642 iptables.go:167] Deleting iptables rule: -d 172.7.0.0/16 -j ACCEPT
I0613 22:58:53.471984 33642 iptables.go:155] Adding iptables rule: -s 172.7.0.0/16 -j ACCEPT
I0613 22:58:53.497924 33642 iptables.go:155] Adding iptables rule: -d 172.7.0.0/16 -j ACCEPT
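Two optional quick checks once flanneld is up (a sketch: flannel serves a /healthz endpoint on the port set by --healthz-port, and it records each node's subnet lease under /coreos.com/network/subnets in etcd):

# Health endpoint configured by --healthz-port=2401 in the startup script
[root@hdss7-21 etcd]# curl http://127.0.0.1:2401/healthz
# Subnet leases registered by each flanneld instance (one entry per node, e.g. 172.7.21.0-24)
[root@hdss7-21 etcd]# ./etcdctl ls /coreos.com/network/subnets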
12. Deploy the other cluster nodes and check all cluster services
Repeat steps 3 through 11 above on 10.4.7.22 (the etcd operation in step 9 does not need to be repeated).
After the flannel service on 22 comes up, check the routing table on 21: a route to 22's container subnet via host 10.4.7.22 has appeared, and 22 has the matching route back to 21's subnet. This is all that flannel's host-gw backend does; the equivalent manual commands are sketched after the output below. Note again that the physical hosts must be behind the same gateway for the host-gw model to work.
# Check the routes on 21
[root@hdss7-21 etcd]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         gateway         0.0.0.0         UG    100    0        0 ens33
10.4.7.0        0.0.0.0         255.255.255.0   U     100    0        0 ens33
172.7.21.0      0.0.0.0         255.255.255.0   U     0      0        0 docker0
172.7.22.0      10.4.7.22       255.255.255.0   UG    0      0        0 ens33

# 22 has the corresponding gateway route back to 21's subnet:
[root@hdss7-22 flannel]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         gateway         0.0.0.0         UG    100    0        0 ens33
10.4.7.0        0.0.0.0         255.255.255.0   U     100    0        0 ens33
172.7.21.0      10.4.7.21       255.255.255.0   UG    0      0        0 ens33
172.7.22.0      0.0.0.0         255.255.255.0   U     0      0        0 docker0
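What host-gw installs is nothing more than ordinary static routes. A sketch of the equivalent manual commands, shown only for illustration (flannel manages these routes itself, so there is no need to run them by hand):

# On hdss7-21: reach 22's Pod subnet via node 10.4.7.22
ip route add 172.7.22.0/24 via 10.4.7.22 dev ens33
# On hdss7-22: reach 21's Pod subnet via node 10.4.7.21
ip route add 172.7.21.0/24 via 10.4.7.21 dev ens33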
13. Verify the cluster again: Pod networks are interconnected
From 21, access the container running on 22; they can now reach each other.
[root@hdss7-21 etcd]# curl 172.7.22.2
Welcome to nginx!
If you see this page, the nginx web server is successfully installed and working. Further configuration is required.
For online documentation and support please refer to nginx.org.
Commercial support is available at nginx.com. Thank you for using nginx.
From 22, access the container running on 21.
[root@hdss7-22 flannel]# curl 172.7.21.2
Welcome to nginx!
If you see this page, the nginx web server is successfully installed and working. Further configuration is required.
For online documentation and support please refer to nginx.org.
Commercial support is available at nginx.com. Thank you for using nginx.
14. Optimize the iptables rules on each compute node
Why optimize iptables: by default each host applies SNAT, so when one container accesses a container on another host, the source address seen inside the target container is the host address rather than the source container address. For example, when the container 172.7.21.2 on 10.4.7.21 accesses 172.7.22.2, the access log inside 172.7.22.2 shows 10.4.7.21 as the source, so you cannot tell which container actually made the request. The goal is to remove the SNAT translation from the container-to-container path.
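In concrete terms, the change made in 14.3 below replaces docker's default MASQUERADE rule with one that skips traffic destined for other Pod subnets; the full commands follow in 14.3:

# Default rule added by docker on 10.4.7.21: SNAT everything leaving the Pod subnet
-A POSTROUTING -s 172.7.21.0/24 ! -o docker0 -j MASQUERADE
# Replacement: do not SNAT traffic whose destination is inside the Pod network 172.7.0.0/16
-A POSTROUTING -s 172.7.21.0/24 ! -d 172.7.0.0/16 ! -o docker0 -j MASQUERADE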
The following operations are performed on 10.4.7.21.
14.1 Add curl to the containers
Change the image to the curl-enabled one (this image was built in chapter 4 when deploying Harbor).
[root@hdss7-21 ~]# vim /opt/kubernetes/conf/nginx-ds.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: nginx-ds
spec:
  template:
    metadata:
      labels:
        app: nginx-ds
    spec:
      containers:
      - name: my-nginx
        image: harbor.od.com/public/nginx:curl
        ports:
        - containerPort: 80
[root@hdss7-21 ~]# kubectl apply -f /opt/kubernetes/conf/nginx-ds.yaml
daemonset.extensions/nginx-ds configured
Replace the existing containers so they pick up the new image.
~]# kubectl get pod -o wide
NAME             READY   STATUS    RESTARTS   AGE    IP           NODE                NOMINATED NODE   READINESS GATES
nginx-ds-spdgm   1/1     Running   2          3d     172.7.21.2   hdss7-21.host.com
nginx-ds-sx7hn   1/1     Running   0          111s   172.7.22.2   hdss7-22.host.com

# After the old Pods are deleted, the DaemonSet recreates them automatically to match the desired state
[root@hdss7-21 ~]# kubectl delete pod nginx-ds-spdgm
pod "nginx-ds-spdgm" deleted
[root@hdss7-21 ~]# kubectl delete pod nginx-ds-sx7hn
pod "nginx-ds-sx7hn" deleted
[root@hdss7-21 ~]# kubectl get pod -o wide
NAME             READY   STATUS    RESTARTS   AGE   IP           NODE                NOMINATED NODE   READINESS GATES
nginx-ds-97b8r   1/1     Running   0          12s   172.7.21.2   hdss7-21.host.com
nginx-ds-ncpk8   1/1     Running   0          4s    172.7.22.2   hdss7-22.host.com
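Instead of deleting the Pods one by one, the DaemonSet's label can also be used as a shortcut (optional; app=nginx-ds is the label defined in the manifest above):

[root@hdss7-21 ~]# kubectl delete pod -l app=nginx-ds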
14.2 Access before the optimization
Enter the 172.7.21.2 container and access 172.7.22.2. The new Pods may fail to start because the container DNS is 192.168.0.2, which will only work once DNS is deployed later; in that case switch back to the original image and, following section 16 of this chapter, make the container able to reach the internet and install curl inside it.
# Check the current Pods
[root@hdss7-21 ~]# kubectl get pod -owide
NAME             READY   STATUS    RESTARTS   AGE   IP           NODE                NOMINATED NODE   READINESS GATES
nginx-ds-nqb57   1/1     Running   0          22m   172.7.21.2   hdss7-21.host.com
nginx-ds-pmgm2   1/1     Running   0          22m   172.7.22.2   hdss7-22.host.com

# From the 172.7.21.2 container, access 172.7.22.2; the log on 172.7.22.2 shows the request
# coming from 172.7.21.2's host, i.e. 10.4.7.21:
[root@hdss7-21 ~]# kubectl exec -it nginx-ds-nqb57 bash
root@nginx-ds-nqb57:/# curl 172.7.22.2
[root@hdss7-21 ~]# kubectl logs -f nginx-ds-pmgm2
10.4.7.21 - - [17/Jun/2021:13:22:10 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.38.0" "-"
14.3 Optimize the iptables rules
21 is used as the example; on 22 the network address is different (172.7.22.0/24).
[root@hdss7-21 ~]# yum install iptables-services -y
[root@hdss7-21 ~]# systemctl start iptables && systemctl enable iptables
[root@hdss7-21 ~]# iptables-save | grep -i postrouting
:POSTROUTING ACCEPT [1:63]
:KUBE-POSTROUTING - [0:0]
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -s 172.7.21.0/24 ! -o docker0 -j MASQUERADE
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
[root@hdss7-21 ~]# iptables -t nat -D POSTROUTING -s 172.7.21.0/24 ! -o docker0 -j MASQUERADE
[root@hdss7-21 ~]# iptables -t nat -I POSTROUTING -s 172.7.21.0/24 ! -d 172.7.0.0/16 ! -o docker0 -j MASQUERADE
[root@hdss7-21 ~]# iptables-save | grep -i postrouting
:POSTROUTING ACCEPT [53:3186]
:KUBE-POSTROUTING - [0:0]
-A POSTROUTING -s 172.7.21.0/24 ! -d 172.7.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE

# Remove the default REJECT rules shipped with iptables-services
[root@hdss7-21 ~]# iptables-save | grep -i reject
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
[root@hdss7-21 ~]# iptables -t filter -D INPUT -j REJECT --reject-with icmp-host-prohibited
[root@hdss7-21 ~]# iptables -t filter -D FORWARD -j REJECT --reject-with icmp-host-prohibited
15. Save the iptables rules on each compute node
15.1 Save the rules
[root@hdss7-21 ~]# iptables-save > /etc/sysconfig/iptables
[root@hdss7-21 ~]# service iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables: [  OK  ]

# Try container-to-container access again; the new log entry now shows the container IP as the source
[root@hdss7-21 ~]# kubectl exec -it nginx-ds-nqb57 bash
root@nginx-ds-nqb57:/# curl 172.7.22.2
[root@hdss7-21 ~]# kubectl logs -f nginx-ds-pmgm2
10.4.7.21 - - [17/Jun/2021:13:16:55 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.38.0" "-"
10.4.7.21 - - [17/Jun/2021:13:22:10 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.38.0" "-"
172.7.21.2 - - [17/Jun/2021:13:33:07 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.38.0" "-"
15.2 Restart the docker service
Note: the change affects the iptables chain rules that docker itself maintains, so the docker service needs to be restarted.
After docker restarts, the nat rule deleted earlier comes back; the filter rules do not.
[root@hdss7-21 ~]# systemctl restart docker
[root@hdss7-21 ~]# iptables-save | grep -i postrouting | grep docker0
-A POSTROUTING -s 172.7.21.0/24 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 172.7.21.0/24 ! -d 172.7.0.0/16 ! -o docker0 -j MASQUERADE

# Either re-apply the saved rules with iptables-restore, or simply delete the rule again
[root@hdss7-21 ~]# iptables-restore /etc/sysconfig/iptables
[root@hdss7-21 ~]# iptables-save | grep -i postrouting | grep docker0
-A POSTROUTING -s 172.7.21.0/24 ! -d 172.7.0.0/16 ! -o docker0 -j MASQUERADE
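To avoid redoing this by hand after every docker restart, one possible approach (not part of the original procedure, shown only as a sketch) is a systemd drop-in that re-applies the saved rules whenever docker starts:

[root@hdss7-21 ~]# mkdir -p /etc/systemd/system/docker.service.d
[root@hdss7-21 ~]# vim /etc/systemd/system/docker.service.d/iptables-restore.conf
[Service]
# Re-apply the rules saved in step 15.1 after docker (re)starts
ExecStartPost=/usr/sbin/iptables-restore /etc/sysconfig/iptables
[root@hdss7-21 ~]# systemctl daemon-reload
[root@hdss7-21 ~]# systemctl restart docker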
16. Troubleshooting
If you find that a container started through kubectl cannot ping the internet, try creating a container from the same image directly with local docker. If that container reaches the internet normally, the host and docker itself are fine (if docker were the problem, a common suggestion online is simply to restart it), so the problem is probably in the Kubernetes-side configuration. Comparing the DNS of the two kinds of containers shows that they differ.
# Container started by docker directly:
~]# docker exec -it f2be1d19ff19 bash
root@f2be1d19ff19:/# cat /etc/resolv.conf
# Generated by NetworkManager
search host.com
nameserver 10.4.7.11
root@f2be1d19ff19:/# ping baidu.com
PING baidu.com (39.156.69.79): 48 data bytes
56 bytes from 39.156.69.79: icmp_seq=0 ttl=53 time=10.950 ms

# Container started through kubectl:
~]# kubectl exec -it nginx-ds-gmhm7 bash
root@nginx-ds-gmhm7:/# cat /etc/resolv.conf
nameserver 192.168.0.2
search default.svc.cluster.local svc.cluster.local cluster.local host.com
options ndots:5
root@nginx-ds-gmhm7:/# ping baidu.com
ping: unknown host
The Pod's IP address and its DNS server are not on the same network, and the 192.168.0.2 nameserver is not reachable yet; changing the DNS to the one used by the docker-started container is enough for now.
root@nginx-ds-gmhm7:/# echo "nameserver 10.4.7.11" > /etc/resolv.conf
root@nginx-ds-gmhm7:/# ping baidu.com
PING baidu.com (220.181.38.148): 48 data bytes
56 bytes from 220.181.38.148: icmp_seq=0 ttl=49 time=15.832 ms
root@nginx-ds-gmhm7:/# tee /etc/apt/sources.list << EOF
deb http://mirrors.163.com/debian/ jessie main non-free contrib
deb http://mirrors.163.com/debian/ jessie-updates main non-free contrib
EOF
root@nginx-ds-gmhm7:/# apt-get update && apt-get install curl -y
# Try curling the Baidu homepage
root@cc8ae2b47946:/# curl -k https://www.baidu.com
As for the container DNS being 192.168.0.2 and external access failing, this will be resolved later once the cluster DNS is deployed.
Source: https://www.cnblogs.com/wangyuanguang/p/15024812.html