Deploying a Kubernetes Cluster from Binaries
Component Versions
I. Kubernetes Basics

- Pod: A Pod is a group of related containers and a logical unit. The containers in a Pod run on the same host and share a network namespace, IP address, ports, and storage volumes, so they can discover and talk to each other via localhost. The Pod is the smallest unit that Kubernetes creates, schedules, and manages. A Pod usually holds one business container plus one network container for unified network management.
- Replication Controller: A Replication Controller manages Pod replicas (instances). It ensures that the cluster always runs the specified number of Pod replicas: if there are too few it starts new ones, and if there are too many it kills the excess. It is also the core mechanism behind elastic scaling and rolling upgrades.
- Service: A Service is an abstraction over real application services. It defines a logical set of Pods and a policy for accessing them, exposing the set behind a single access point so that external callers need not know how the backing Pods run. This greatly simplifies scaling and maintenance and provides a lightweight service proxy and discovery mechanism.
- Label: A Label is a key/value pair used to distinguish Pods, Services, and Replication Controllers; in fact, any Kubernetes API object can carry Labels. An object may have many Labels, but each key maps to exactly one value. Labels are the basis on which Services and Replication Controllers select Pods, giving a loosely coupled relationship instead of a hard binding.
- Node: Kubernetes uses a master/worker distributed architecture. A Kubernetes Node (called a Minion in early versions) runs and manages containers. Nodes are the operational units that Pods are bound to and ultimately run on; a Node can be thought of as a Pod's host machine.
Requirements
There are currently two main ways to deploy a production Kubernetes cluster:
kubeadm
- kubeadm is a Kubernetes deployment tool that provides `kubeadm init` and `kubeadm join` for quickly standing up a cluster.
Binary packages
1. Installation requirements
Servers for the Kubernetes cluster must meet the following conditions:
- One or more machines running CentOS 7.x, x86_64
- Hardware: 2 GB+ RAM, 2+ CPUs, 30 GB+ disk
- Network connectivity between all machines in the cluster
- Internet access, to pull images
- Swap disabled
Overall server plan:

| Role | IP | Components |
|---|---|---|
| k8s-master1 | 10.10.3.139 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd |
| k8s-master2 | 10.10.3.174 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd |
| k8s-master3 | 10.10.3.185 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd |
| k8s-node1 | 10.10.3.179 | kubelet, kube-proxy, docker |
| k8s-node2 | 10.10.3.197 | kubelet, kube-proxy, docker |
| Load Balancer (Master) | 10.10.3.100, 10.10.3.101 (VIP) | Nginx L4 / haproxy, keepalived, ipvs |
| Load Balancer (Backup) | 10.10.3.200 | Nginx L4 / haproxy, keepalived, ipvs |
Note: this high-availability cluster is built in two phases: first deploy a single-Master architecture, then expand it into the multi-Master architecture planned above.
API server high availability can be implemented with Nginx + Keepalived, HAProxy + Keepalived, or similar.
Service versions and cluster notes
- All hosts run CentOS 7.8.2003. The kernel can be upgraded to 5.x, but this is optional (I did not upgrade and stayed on 3.10).
- kube-proxy runs in iptables mode (an IPVS-mode configuration is left commented out in the kube-proxy config).
- Calico runs in IPIP mode.
- The default cluster domain svc.cluster.local is changed to svc.superred.com; 10.0.0.1 is the resolved IP of the cluster's kubernetes Service.
- Docker CE 19.03.9
- Kubernetes 1.18.8
- Etcd v3.4.9
- Calico v3.14.0
- CoreDNS 1.7.0
- Metrics-Server v0.3.7
Service and Pod IP range allocation

| Name | CIDR | Notes |
|---|---|---|
| service-cluster-ip-range (Service IPs) | 10.0.0.0/24 | 254 usable addresses; the Service network segment |
| cluster-cidr (Pod IPs; clusterCIDR in kube-proxy-config.yml) | 10.244.0.0/16 | 65534 usable addresses; flannel is used for Pod networking and expects 10.244.0.0/16, so this is the Pod IP range |
| Cluster DNS (in the same segment as Services) | 10.0.0.2 | Used for cluster Service domain resolution |
| service/kubernetes (default) | 10.0.0.1 | Resolved IP of the cluster's kubernetes Service |
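The usable-address counts in this table follow directly from the prefix lengths; a quick plain-shell sketch (no cluster required; the `usable` helper is illustrative, not part of the deployment) to double-check them:

```shell
# Usable host addresses in an IPv4 block: 2^(32 - prefix) - 2
# (the all-zeros network address and the broadcast address are reserved)
usable() { echo $(( (1 << (32 - $1)) - 2 )); }

svc_usable=$(usable 24)   # service-cluster-ip-range 10.0.0.0/24
pod_usable=$(usable 16)   # cluster-cidr 10.244.0.0/16

echo "service /24: ${svc_usable} usable addresses"
echo "pod     /16: ${pod_usable} usable addresses"
```

A /24 Service range (254 addresses) is plenty for a small cluster, while the /16 Pod range leaves room for the 110-Pods-per-node limit configured later across hundreds of nodes.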
Single-Master server plan:

| Role | IP | Components |
|---|---|---|
| k8s-master | 10.10.3.139 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd |
| k8s-node1 | 10.10.3.179 | kubelet, kube-proxy, docker, etcd |
| k8s-node2 | 10.10.3.197 | kubelet, kube-proxy, docker, etcd |

- Note: also make sure that every machine's MAC address and product_uuid are unique (check with the commands below).
root@k8s-master:~ # ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:ab:2d:cc brd ff:ff:ff:ff:ff:ff
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:0f:73:13:e7 brd ff:ff:ff:ff:ff:ff
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 86:7b:76:32:21:f5 brd ff:ff:ff:ff:ff:ff
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 1a:99:c0:9e:af:82 brd ff:ff:ff:ff:ff:ff
12: veth367dff5f@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
link/ether 16:0e:e6:ae:71:08 brd ff:ff:ff:ff:ff:ff link-netnsid 0
root@k8s-master:~ # cat /sys/class/dmi/id/product_uuid
BA2F3796-00F0-49CD-876F-708CC1DD674E
2. System initialization
Run the following commands on all three hosts.
# Disable the firewall
systemctl stop firewalld && systemctl disable firewalld
# Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0
# Disable swap (now, and persistently via /etc/fstab)
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab
# Add the Docker repository
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
Or use the Aliyun mirror:
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Set hostnames according to the plan (optional)
hostnamectl set-hostname <name>
# Add hosts entries on the master (optional)
cat >> /etc/hosts << EOF
10.10.3.139 k8s-master
10.10.3.179 k8s-node1
10.10.3.197 k8s-node2
EOF
# Load kernel modules
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
modprobe -- br_netfilter
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules
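Before executing the module script, it can be worth confirming that it really lists the six modules explained below; a small sanity-check sketch (it writes to a temporary stand-in file rather than /etc/sysconfig/modules/ipvs.modules, so it is safe to run anywhere):

```shell
# Recreate the module script in a temp file, then verify that all six
# expected modprobe lines are present before running it for real.
MODFILE=$(mktemp)   # stand-in for /etc/sysconfig/modules/ipvs.modules
cat > "$MODFILE" <<'EOF'
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
modprobe -- br_netfilter
EOF

missing=0
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4 br_netfilter; do
  grep -q "modprobe -- ${m}\$" "$MODFILE" || { echo "missing: $m"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all 6 modules listed"
```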
# Set kernel parameters so bridged IPv4 traffic is passed to the iptables chains
cat << EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_recycle=0
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF
sysctl --system   # apply
sysctl -p /etc/sysctl.d/k8s.conf
# Time synchronization
yum install ntpdate -y
ntpdate time.windows.com
Alternatively:
# Time synchronization with chrony
yum install chrony -y && systemctl enable chronyd && systemctl start chronyd
timedatectl set-timezone Asia/Shanghai && timedatectl set-ntp yes
# Kernel modules explained
- modprobe ip_vs: LVS layer-4 load balancing
- modprobe ip_vs_rr: round-robin scheduling
- modprobe ip_vs_wrr: weighted round-robin scheduling
- modprobe ip_vs_sh: source-hashing scheduling
- modprobe nf_conntrack_ipv4: connection tracking
- modprobe br_netfilter: lets iptables filter and port-forward packets traversing the bridge
# Kernel parameters explained
- overcommit_memory is the kernel's memory-allocation policy; it takes one of three values: 0, 1, or 2.
- overcommit_memory=0: the kernel checks whether enough free memory is available; if so, the allocation succeeds, otherwise it fails and the error is returned to the process.
- overcommit_memory=1: the kernel allows all allocations, regardless of the current memory state.
- overcommit_memory=2: the kernel refuses allocations that would exceed swap plus a configurable fraction of physical RAM (i.e. it never overcommits).
- net.bridge.bridge-nf-call-iptables: pass bridged traffic through iptables filtering
- net.ipv4.tcp_tw_recycle=0: disable TIME-WAIT recycling (it breaks clients behind NAT)
- vm.swappiness=0: avoid swapping
- vm.panic_on_oom=0: let the OOM killer handle out-of-memory instead of panicking
- fs.inotify.max_user_watches: maximum number of inotify watches per user
- fs.file-max: maximum number of open files system-wide
- fs.nr_open: maximum number of open files per process
- net.ipv6.conf.all.disable_ipv6: disable IPv6
- net.netfilter.nf_conntrack_max: maximum number of tracked connections
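Since a single malformed line in the drop-in file silently fails to apply, a quick format check before running `sysctl --system` is cheap. A sketch with a hypothetical `check_sysctl_file` helper (validated against a temp file here for illustration):

```shell
# Validate a sysctl drop-in: every non-blank, non-comment line must be a
# well-formed dotted.key=value pair with no stray whitespace around '='.
check_sysctl_file() {
  awk '
    /^[[:space:]]*$/ { next }                       # skip blank lines
    /^[[:space:]]*#/ { next }                       # skip comments
    !/^[a-z0-9._-]+=[^[:space:]]+$/ { bad++; print "bad line: " $0 }
    END { if (bad) exit 1 }
  ' "$1"
}

f=$(mktemp)   # stand-in for /etc/sysctl.d/k8s.conf
printf 'net.ipv4.ip_forward=1\nvm.swappiness=0\n' > "$f"
check_sysctl_file "$f" && echo "k8s.conf: ok"
```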
| Component | Certificates used |
|---|---|
| etcd | ca.pem, etcd.pem, etcd-key.pem |
| flannel | ca.pem, flannel.pem, flannel-key.pem |
| kube-apiserver | ca.pem, apiserver.pem, apiserver-key.pem |
| kubelet (auto-issued) | ca.pem, ca-key.pem |
| kube-proxy | ca.pem, kube-proxy.pem, kube-proxy-key.pem |
| kubectl | ca.pem, admin.pem, admin-key.pem |
II. Deploy the Etcd Cluster
Etcd is a distributed key-value store that Kubernetes uses for all of its data, so an Etcd database must be prepared first. To avoid a single point of failure, deploy it as a cluster: three members tolerate one machine failure; five members tolerate two.

| Node name | IP |
|---|---|
| etcd-1 (k8s-master) | 10.10.3.139 |
| etcd-2 (k8s-node1) | 10.10.3.179 |
| etcd-3 (k8s-node2) | 10.10.3.197 |

Note: to save machines, etcd here shares hosts with the K8s nodes. It can also run entirely outside the K8s cluster, as long as the apiserver can reach it.
2.1 Prepare the cfssl certificate tool
cfssl is an open-source certificate management tool that generates certificates from JSON files and is easier to use than openssl. Run this on the Master node.
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo
2.2 Generate Etcd certificates
2.2.1 Create a self-signed certificate authority (CA) on the k8s-master node
# Create the working directory
mkdir -p ~/TLS/{etcd,k8s}
cd ~/TLS/etcd
# Self-sign the CA
cat > ca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
cat > ca-csr.json << EOF
{
"CN": "etcd CA",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing"
}
]
}
EOF
# Generate the certificate
root@k8s-master:~/TLS # cfssl gencert -initca ca-csr.json | cfssljson -bare ca
2020/08/27 14:25:14 [INFO] generating a new CA key and certificate from CSR
2020/08/27 14:25:14 [INFO] generate received request
2020/08/27 14:25:14 [INFO] received CSR
2020/08/27 14:25:14 [INFO] generating key: rsa-2048
2020/08/27 14:25:14 [INFO] encoded CSR
2020/08/27 14:25:14 [INFO] signed certificate with serial number 725281651325617718681661325863333735086520188695
2.2.2 Issue the Etcd HTTPS certificate with the self-signed CA
cat > etcd-csr.json <<EOF
{
"CN": "etcd",
"hosts": [
"10.10.3.139",
"10.10.3.179",
"10.10.3.197"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing"
}
]
}
EOF
You can use openssl to verify the generated files:
# Inspect the private key, certificate, and CSR respectively
openssl rsa -in ca-key.pem -noout -text
openssl x509 -noout -text -in ca.pem
openssl req -noout -text -in ca.csr
Note: the IPs in the hosts field must include the internal communication IPs of ALL etcd nodes; not one can be missing. To ease future expansion, you can also list a few reserved IPs.
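Because every etcd peer IP must appear in hosts, generating the array from a single IP list makes it harder to miss one when reserved IPs are added later. A sketch with a hypothetical `hosts_json` helper (the IP list is the one from this plan):

```shell
# Build the "hosts" array for etcd-csr.json from a space-separated IP
# list, so adding a reserve IP is a one-variable change.
ETCD_IPS="10.10.3.139 10.10.3.179 10.10.3.197"

hosts_json() {
  local out="" ip
  for ip in $1; do
    out="${out:+$out,}\"$ip\""   # comma-join the quoted IPs
  done
  printf '[%s]' "$out"
}

echo "\"hosts\": $(hosts_json "$ETCD_IPS"),"
```

Paste the printed line into etcd-csr.json in place of the hand-written hosts array.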
# Generate the certificate
root@k8s-master:~/TLS # cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
2020/08/27 14:28:46 [INFO] generate received request
2020/08/27 14:28:46 [INFO] received CSR
2020/08/27 14:28:46 [INFO] generating key: rsa-2048
2020/08/27 14:28:47 [INFO] encoded CSR
2020/08/27 14:28:47 [INFO] signed certificate with serial number 316290753253162786864679339793041040918888860541
2020/08/27 14:28:47 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
root@k8s-master:~/TLS # ls etcd*pem
etcd-key.pem etcd.pem
2.3 Download the Etcd binaries from GitHub
# Download link
# Per the official site, v3.4 and later no longer support the old (v2) API
wget https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz
2.4 Deploy the Etcd cluster
Operate on the master node; to simplify things, all files generated on the master can later be copied to the two node machines.
2.4.1 Create directories and extract the binary package
mkdir /opt/etcd/{bin,cfg,ssl} -p
tar zxvf etcd-v3.4.9-linux-amd64.tar.gz
chown -R root:root etcd-v3.4.9-linux-amd64
mv etcd-v3.4.9-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/
2.4.2 Create the etcd configuration file
cat > /opt/etcd/cfg/etcd.conf << EOF
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.10.3.139:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.10.3.139:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.3.139:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.3.139:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://10.10.3.139:2380,etcd-2=https://10.10.3.179:2380,etcd-3=https://10.10.3.197:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
- ETCD_NAME: member name, unique within the cluster
- ETCD_DATA_DIR: data directory
- ETCD_LISTEN_PEER_URLS: listen address for cluster (peer) traffic
- ETCD_LISTEN_CLIENT_URLS: listen address for client traffic
- ETCD_INITIAL_ADVERTISE_PEER_URLS: peer address advertised to the cluster
- ETCD_ADVERTISE_CLIENT_URLS: client address advertised to clients
- ETCD_INITIAL_CLUSTER: addresses of all cluster members
- ETCD_INITIAL_CLUSTER_TOKEN: cluster token
- ETCD_INITIAL_CLUSTER_STATE: state when joining; "new" for a new cluster, "existing" to join an existing one
2.4.3 Configure systemd to manage etcd
cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \\
--cert-file=/opt/etcd/ssl/etcd.pem \\
--key-file=/opt/etcd/ssl/etcd-key.pem \\
--peer-cert-file=/opt/etcd/ssl/etcd.pem \\
--peer-key-file=/opt/etcd/ssl/etcd-key.pem \\
--trusted-ca-file=/opt/etcd/ssl/ca.pem \\
--peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \\
--logger=zap
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
2.4.4 Copy the generated certificates
Copy the certificates generated earlier to the paths referenced in the config file:
cp ~/TLS/etcd/ca*pem ~/TLS/etcd/etcd*pem /opt/etcd/ssl/
2.4.5 Start etcd and enable it at boot
systemctl daemon-reload
systemctl restart etcd
systemctl enable etcd
# If startup fails here, it will only succeed once the other etcd members have been set up too
2.4.6 Copy all files generated on the master to the two node machines
root@k8s-master:~/TLS # scp -rp /opt/etcd 10.10.3.179:/opt/
root@k8s-master:~/TLS # scp -rp /opt/etcd 10.10.3.197:/opt/
root@k8s-master:~/TLS # scp -rp /usr/lib/systemd/system/etcd.service 10.10.3.179:/usr/lib/systemd/system/
root@k8s-master:~/TLS # scp -rp /usr/lib/systemd/system/etcd.service 10.10.3.197:/usr/lib/systemd/system/
2.4.7 On each node, edit etcd.conf to set that node's member name and IP
[root@localhost etcd]# cat cfg/etcd.conf
#[Member]
ETCD_NAME="etcd-2" # change this: etcd-2 on node 2, etcd-3 on node 3
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.10.3.179:2380" # change to this server's IP
ETCD_LISTEN_CLIENT_URLS="https://10.10.3.179:2379" # change to this server's IP
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.3.179:2380" # change to this server's IP
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.3.179:2379" # change to this server's IP
ETCD_INITIAL_CLUSTER="etcd-1=https://10.10.3.139:2380,etcd-2=https://10.10.3.179:2380,etcd-3=https://10.10.3.197:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
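The hand edits above (member name plus the four listen/advertise IPs) can also be scripted so that each node's file is derived from the master's copy. A convenience sketch (`derive_etcd_conf` is a hypothetical helper; note that it deliberately skips the ETCD_INITIAL_CLUSTER line, which must keep all three IPs):

```shell
# Derive a node's etcd.conf from the master's copy by swapping the
# member name and the listen/advertise IPs, leaving the cluster list intact.
derive_etcd_conf() {   # $1=source file  $2=new member name  $3=new IP
  sed -e "s/^ETCD_NAME=.*/ETCD_NAME=\"$2\"/" \
      -e "/^ETCD_INITIAL_CLUSTER=/!s/10\.10\.3\.139/$3/g" "$1"
}

src=$(mktemp)   # stand-in for /opt/etcd/cfg/etcd.conf
cat > "$src" <<'EOF'
ETCD_NAME="etcd-1"
ETCD_LISTEN_PEER_URLS="https://10.10.3.139:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.10.3.139:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.3.139:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.3.139:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://10.10.3.139:2380,etcd-2=https://10.10.3.179:2380,etcd-3=https://10.10.3.197:2380"
EOF
derive_etcd_conf "$src" etcd-2 10.10.3.179
```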
# Finally, start etcd and enable it at boot
systemctl daemon-reload
systemctl restart etcd
systemctl enable etcd
[root@localhost etcd]# systemctl status -ll etcd
● etcd.service - Etcd Server
Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2020-08-27 14:49:59 CST; 13s ago
Main PID: 17518 (etcd)
Tasks: 20
Memory: 33.8M
CGroup: /system.slice/etcd.service
└─17518 /opt/etcd/bin/etcd --cert-file=/opt/etcd/ssl/etcd.pem --key-file=/opt/etcd/ssl/etcd-key.pem --peer-cert-file=/opt/etcd/ssl/etcd.pem --peer-key-file=/opt/etcd/ssl/etcd-key.pem --trusted-ca-file=/opt/etcd/ssl/ca.pem --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem --logger=zap
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.815+0800","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"627ba52be9b69b8c","rtt":"0s","error":"dial tcp 10.10.3.179:2380: connect: connection refused"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:250","msg":"set message encoder","from":"a03c2a74d8b05a51","to":"a03c2a74d8b05a51","stream-type":"stream Message"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/peer_status.go:51","msg":"peer became active","peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:266","msg":"closed TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:277","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:250","msg":"set message encoder","from":"a03c2a74d8b05a51","to":"a03c2a74d8b05a51","stream-type":"stream MsgApp v2"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:266","msg":"closed TCP streaming connection with remote peer","stream-writer-type":"stream MsgApp v2","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:277","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream MsgApp v2","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.838+0800","caller":"rafthttp/stream.go:425","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.839+0800","caller":"rafthttp/stream.go:425","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream Message","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
[root@localhost etcd]# netstat -lntup|grep etcd
tcp 0 0 10.10.3.197:2379 0.0.0.0:* LISTEN 17518/etcd
tcp 0 0 10.10.3.197:2380 0.0.0.0:* LISTEN 17518/etcd
# Put etcdctl on the PATH on all three nodes
cp /opt/etcd/bin/etcdctl /usr/bin/
2.4.8 Check the etcd cluster status
root@k8s-master:~/TLS # etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://10.10.3.139:2379,https://10.10.3.197:2379,https://10.10.3.179:2379" endpoint health
https://10.10.3.197:2379 is healthy: successfully committed proposal: took = 22.070915ms
https://10.10.3.139:2379 is healthy: successfully committed proposal: took = 25.085954ms
https://10.10.3.179:2379 is healthy: successfully committed proposal: took = 25.915676ms
If you see the output above, the Etcd cluster deployed successfully. If there are problems, check the logs first: /var/log/messages, or journalctl -u etcd
III. Deploy Docker on All Nodes

Either a yum install or a binary install works here.
yum install
# 1. Remove old versions
yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
# 2. Install the repository tooling
yum install -y yum-utils
# 3. Set the repository (use a domestic mirror)
yum-config-manager \
--add-repo \
http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# 4. Refresh the package index
yum makecache fast
# 5. Install the latest Docker packages (docker-ce: Community Edition; ee: Enterprise Edition)
yum install docker-ce docker-ce-cli containerd.io -y
# 6. Or install a specific version (list the available versions first)
yum list docker-ce --showduplicates | sort -r
yum install docker-ce-19.03.9 docker-ce-cli-19.03.9 containerd.io
containerd.io: the container runtime, packaged separately from the OS
docker-ce-cli: the Docker client (CLI)
docker-ce: the Docker engine (daemon)
# 7. On all nodes, set the forward rule and make it take effect
vim /lib/systemd/system/docker.service
[Service]
#ExecStartPost=/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT
ExecStartPost=/sbin/iptables -P FORWARD ACCEPT
# 8. Start docker
systemctl enable docker
systemctl restart docker
# 9. Check the version
[root@k8s-master ~]# docker --version
Docker version 19.03.9
# 10. Configure a Docker registry mirror
## Mirror options: Aliyun, DaoCloud, USTC
## Docker's official China mirror: https://registry.docker-cn.com
mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://registry.docker-cn.com"],
"insecure-registries": ["10.10.3.104", "harbor.superred.com"]
}
EOF
systemctl daemon-reload
systemctl restart docker
Binary install
Download: https://download.docker.com/linux/static/stable/x86_64/docker-19.03.9.tgz
Extract the binary package
tar zxvf docker-19.03.9.tgz
mv docker/* /usr/bin
Manage Docker with systemd
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
EOF
Create the configuration file
mkdir /etc/docker
cat > /etc/docker/daemon.json << EOF
{
"registry-mirrors": ["https://registry.docker-cn.com"],
"insecure-registries": ["10.10.3.104", "harbor.superred.com"]
}
EOF
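dockerd refuses to start if daemon.json is malformed, and JSON does not allow duplicate keys (a duplicated "insecure-registries" key silently loses one entry), so a syntax check before restarting Docker is cheap insurance. A sketch assuming python3 is available (validating a temporary copy here for illustration):

```shell
# Fail fast if the daemon.json content is not valid JSON.
f=$(mktemp)   # stand-in for /etc/docker/daemon.json
cat > "$f" <<'EOF'
{
  "registry-mirrors": ["https://registry.docker-cn.com"],
  "insecure-registries": ["10.10.3.104", "harbor.superred.com"]
}
EOF
if python3 -m json.tool "$f" > /dev/null; then
  echo "daemon.json: valid JSON"
else
  echo "daemon.json: INVALID - fix it before restarting docker" >&2
fi
```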
Start Docker and enable it at boot
systemctl daemon-reload
systemctl start docker
systemctl enable docker
IV. Deploy the Master Node
- kube-apiserver
- kube-controller-manager
- kube-scheduler
4.1 Generate the kube-apiserver certificate
4.1.1 Create a self-signed certificate authority (CA)
cd /root/TLS/k8s
cat > ca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
cat > ca-csr.json << EOF
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
root@k8s-master:~/TLS/k8s # cfssl gencert -initca ca-csr.json | cfssljson -bare ca
2020/08/27 15:05:30 [INFO] generating a new CA key and certificate from CSR
2020/08/27 15:05:30 [INFO] generate received request
2020/08/27 15:05:30 [INFO] received CSR
2020/08/27 15:05:30 [INFO] generating key: rsa-2048
2020/08/27 15:05:31 [INFO] encoded CSR
2020/08/27 15:05:31 [INFO] signed certificate with serial number 543074556734204684883324478113935421961362062670
root@k8s-master:~/TLS/k8s # ll *pem
-rw------- 1 root root 1.7K Aug 27 15:05 ca-key.pem
-rw-r--r-- 1 root root 1.4K Aug 27 15:05 ca.pem
4.1.2 Issue the kube-apiserver HTTPS certificate with the self-signed CA
cat > apiserver-csr.json << EOF
{
"CN": "kubernetes",
"hosts": [
"10.244.0.0",
"10.244.0.1",
"10.0.0.0",
"10.0.0.1",
"10.0.0.2",
"127.0.0.1",
"10.10.3.100",
"10.10.3.101",
"10.10.3.200",
"10.10.3.139",
"10.10.3.179",
"10.10.3.197",
"10.10.3.174",
"10.10.3.185",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.superred",
"kubernetes.default.svc.superred.com"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
# The hosts field above must contain the IPs of every Master, LB, and VIP; not one can be missing. To ease future expansion, you can also list a few reserved IPs.
# Generate the certificate
root@k8s-master:~/TLS/k8s # ls
apiserver-csr.json ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem
root@k8s-master:~/TLS/k8s # cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver
2020/08/27 15:12:39 [INFO] generate received request
2020/08/27 15:12:39 [INFO] received CSR
2020/08/27 15:12:39 [INFO] generating key: rsa-2048
2020/08/27 15:12:39 [INFO] encoded CSR
2020/08/27 15:12:39 [INFO] signed certificate with serial number 113880470106589986816895015180975886983725249312
2020/08/27 15:12:39 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
root@k8s-master:~/TLS/k8s # ll apiserver*pem
-rw------- 1 root root 1.7K Aug 27 15:12 apiserver-key.pem
-rw-r--r-- 1 root root 1.7K Aug 27 15:12 apiserver.pem
4.2 Download and extract the binary package
Download: https://github.com/kubernetes/kubernetes/releases/download/v1.18.8/kubernetes.tar.gz
mkdir -p /opt/kubernetes/{bin,cfg,ssl,logs}
cd kubernetes/cluster
./get-kube-binaries.sh   # downloads kubernetes-server-linux-amd64.tar.gz into kubernetes/server/
tar zxvf kubernetes-server-linux-amd64.tar.gz
root@k8s-master:~/work/k8s-1.18.8/kubernetes/server/kubernetes/server/bin # ls
apiextensions-apiserver kube-apiserver.docker_tag kube-controller-manager.docker_tag kubelet kube-proxy.tar kube-scheduler.tar
kubeadm kube-apiserver.tar kube-controller-manager.tar kube-proxy kube-scheduler mounter
kube-apiserver kube-controller-manager kubectl kube-proxy.docker_tag kube-scheduler.docker_tag
cp kube-apiserver kube-controller-manager kubectl kubelet kube-proxy kube-scheduler /opt/kubernetes/bin
4.3 Deploy kube-apiserver
4.3.1 Create the configuration file
cat > /opt/kubernetes/cfg/kube-apiserver.conf << EOF
KUBE_APISERVER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--etcd-servers=https://10.10.3.139:2379,https://10.10.3.197:2379,https://10.10.3.179:2379 \\
--bind-address=10.10.3.139 \\
--secure-port=6443 \\
--advertise-address=10.10.3.139 \\
--allow-privileged=true \\
--service-cluster-ip-range=10.0.0.0/24 \\
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \\
--authorization-mode=RBAC,Node \\
--enable-bootstrap-token-auth=true \\
--token-auth-file=/opt/kubernetes/cfg/token.csv \\
--service-node-port-range=1024-65535 \\
--kubelet-client-certificate=/opt/kubernetes/ssl/apiserver.pem \\
--kubelet-client-key=/opt/kubernetes/ssl/apiserver-key.pem \\
--tls-cert-file=/opt/kubernetes/ssl/apiserver.pem \\
--tls-private-key-file=/opt/kubernetes/ssl/apiserver-key.pem \\
--client-ca-file=/opt/kubernetes/ssl/ca.pem \\
--service-account-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--etcd-cafile=/opt/etcd/ssl/ca.pem \\
--etcd-certfile=/opt/etcd/ssl/etcd.pem \\
--etcd-keyfile=/opt/etcd/ssl/etcd-key.pem \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=3 \\
--audit-log-maxsize=100 \\
--audit-log-path=/opt/kubernetes/logs/k8s-audit.log"
EOF
Note: of each doubled backslash above, the first escapes the second, so that the heredoc writes a literal backslash line continuation into the file.
- --logtostderr: log to files instead of stderr
- --log-dir: log directory
- --v: log verbosity; higher values log more detail
- --etcd-servers: etcd cluster addresses
- --bind-address: listen address
- --secure-port: HTTPS port
- --advertise-address: address advertised to the rest of the cluster
- --allow-privileged: allow privileged containers
- --service-cluster-ip-range: Service virtual IP range
- --enable-admission-plugins: admission-control plugins gating advanced features
- --authorization-mode: enable RBAC authorization and Node self-management
- --enable-bootstrap-token-auth: enable the TLS bootstrap mechanism
- --token-auth-file: bootstrap token file
- --service-node-port-range: default port range for NodePort Services
- --kubelet-https: the apiserver uses HTTPS by default when calling kubelets
- --kubelet-client-xxx: client certificate the apiserver presents to kubelets
- --tls-xxx-file: apiserver HTTPS certificate and key
- --etcd-xxxfile: certificates for connecting to the Etcd cluster
- --audit-log-xxx: audit log settings
Enabling the TLS Bootstrapping mechanism
TLS bootstrapping: once the apiserver enables TLS authentication, every Node's kubelet and kube-proxy must present valid CA-signed certificates to communicate with it. Issuing client certificates by hand becomes a lot of work with many Nodes and complicates growing the cluster. To simplify this, Kubernetes introduced TLS bootstrapping to issue client certificates automatically: the kubelet requests a certificate from the apiserver as a low-privileged user, and the apiserver signs it dynamically. This approach is strongly recommended on Nodes; here it is used for the kubelet, while kube-proxy still uses a certificate we issue centrally.

4.3.2 Copy the certificates generated earlier
root@k8s-master:/etc/kubernetes/back/1.18config # cp ~/TLS/k8s/ca*pem ~/TLS/k8s/apiserver*pem /opt/kubernetes/ssl/
root@k8s-master:/etc/kubernetes/back/1.18config # ls /opt/kubernetes/ssl
apiserver-key.pem apiserver.pem ca-key.pem ca.pem
4.3.3 Create the token file referenced in the config
# Generate a random 16-byte token (32 hex characters)
head -c 16 /dev/urandom | od -An -t x | tr -d ' '
863b2ebebecffbb3a6493ff15dfc57c6
# Create the token file (format: token,user,UID,"group")
BOOTSTRAP_TOKEN=863b2ebebecffbb3a6493ff15dfc57c6
cat > /opt/kubernetes/cfg/token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:node-bootstrapper"
EOF
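The two token steps can be combined so that the generated value and the CSV line never drift apart; a sketch (written to a temporary file here rather than /opt/kubernetes/cfg/token.csv):

```shell
# Generate a 32-hex-character bootstrap token and write token.csv in the
# format the apiserver expects: token,user,uid,"group"
BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
CSV=$(mktemp)   # stand-in for /opt/kubernetes/cfg/token.csv
cat > "$CSV" <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:node-bootstrapper"
EOF
echo "token length: ${#BOOTSTRAP_TOKEN}"
cat "$CSV"
```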
4.3.4 Configure systemd to start kube-apiserver
cat > /usr/lib/systemd/system/kube-apiserver.service << EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-apiserver.conf
ExecStart=/opt/kubernetes/bin/kube-apiserver \$KUBE_APISERVER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
4.3.5 Start kube-apiserver and enable it at boot
systemctl daemon-reload
systemctl restart kube-apiserver
systemctl enable kube-apiserver
root@localhost:/opt/kubernetes/cfg # systemctl status -ll kube-apiserver
● kube-apiserver.service - Kubernetes API Server
Loaded: loaded (/usr/lib/systemd/system/kube-apiserver.service; disabled; vendor preset: disabled)
Active: active (running) since Thu 2020-08-27 16:06:39 CST; 25s ago
Docs: https://github.com/kubernetes/kubernetes
Main PID: 3785 (kube-apiserver)
Tasks: 22
Memory: 304.5M
CGroup: /system.slice/kube-apiserver.service
└─3785 /opt/kubernetes/bin/kube-apiserver --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --etcd-servers=https://10.10.3.139:2379,https://10.10.3.197:2379,https://10.10.3.179:2379 --bind-address=10.10.3.139 --secure-port=6443 --advertise-address=10.10.3.139 --allow-privileged=true --service-cluster-ip-range=10.10.0.0/24 --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction --authorization-mode=RBAC,Node --enable-bootstrap-token-auth=true --token-auth-file=/opt/kubernetes/cfg/token.csv --service-node-port-range=1024-65535 --kubelet-client-certificate=/opt/kubernetes/ssl/apiserver.pem --kubelet-client-key=/opt/kubernetes/ssl/apiserver-key.pem --tls-cert-file=/opt/kubernetes/ssl/apiserver.pem --tls-private-key-file=/opt/kubernetes/ssl/apiserver-key.pem --client-ca-file=/opt/kubernetes/ssl/ca.pem --service-account-key-file=/opt/kubernetes/ssl/ca-key.pem --etcd-cafile=/opt/etcd/ssl/ca.pem --etcd-certfile=/opt/etcd/ssl/etcd.pem --etcd-keyfile=/opt/etcd/ssl/etcd-key.pem --audit-log-maxage=30 --audit-log-maxbackup=3 --audit-log-maxsize=100 --audit-log-path=/opt/kubernetes/logs/k8s-audit.log
Aug 27 16:06:39 localhost.localdomain systemd[1]: Started Kubernetes API Server
4.3.6 Authorize the kubelet-bootstrap user to request certificates
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
4.4 Deploy kube-controller-manager
4.4.1 Create the configuration file
cat > /opt/kubernetes/cfg/kube-controller-manager.conf << EOF
KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--leader-elect=true \\
--master=127.0.0.1:8080 \\
--bind-address=127.0.0.1 \\
--allocate-node-cidrs=true \\
--cluster-cidr=10.244.0.0/16 \\
--service-cluster-ip-range=10.0.0.0/24 \\
--cluster-signing-cert-file=/opt/kubernetes/ssl/ca.pem \\
--cluster-signing-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--root-ca-file=/opt/kubernetes/ssl/ca.pem \\
--service-account-private-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--experimental-cluster-signing-duration=87600h0m0s"
EOF
- --master: connect to the apiserver via the local insecure port 8080
- --leader-elect: enable leader election when multiple instances run (HA)
- --cluster-signing-cert-file / --cluster-signing-key-file: the CA that auto-issues kubelet certificates; must match the apiserver's
4.4.2 Configure systemd to start kube-controller-manager
cat > /usr/lib/systemd/system/kube-controller-manager.service << EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-controller-manager.conf
ExecStart=/opt/kubernetes/bin/kube-controller-manager \$KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
4.4.3 Start kube-controller-manager and enable it at boot
systemctl daemon-reload
systemctl restart kube-controller-manager
systemctl enable kube-controller-manager
4.5 Deploy kube-scheduler
4.5.1 Create the configuration file
cat > /opt/kubernetes/cfg/kube-scheduler.conf << EOF
KUBE_SCHEDULER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--leader-elect \\
--master=127.0.0.1:8080 \\
--bind-address=127.0.0.1"
EOF
- --master: connect to the apiserver via the local insecure port 8080
- --leader-elect: enable leader election when multiple instances run (HA)
cat > /usr/lib/systemd/system/kube-scheduler.service << EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-scheduler.conf
ExecStart=/opt/kubernetes/bin/kube-scheduler \$KUBE_SCHEDULER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
4.5.2 Start kube-scheduler and enable it at boot
systemctl daemon-reload
systemctl restart kube-scheduler
systemctl enable kube-scheduler
4.6 Check cluster status
All components are now started; check their status with kubectl:
root@localhost:/opt/kubernetes/cfg # kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
Output like the above means the Master components are running normally.
V. Deploy the Worker Nodes
5.1 Create working directories and copy the binaries
If you also want the Master to act as a Node, install kubelet and kube-proxy on it too.
# 1. Create the working directory on all node machines
mkdir -p /opt/kubernetes/{bin,cfg,ssl,logs}
# 2. From the extracted kubernetes package on the master, copy the binaries to every node
for ip in 179 197;do scp -rp ./kubernetes/server/bin/{kubelet,kube-proxy} 10.10.3.$ip:/opt/kubernetes/bin/;done
5.2 Deploy kubelet
5.2.1 Create the configuration file
Change the Pod infrastructure (pause) image to one pulled from a domestic mirror or your own Harbor:
harbor.superred.com/kubernetes/pause-amd64:3.0
# node1
cat > /opt/kubernetes/cfg/kubelet.conf << EOF
KUBELET_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--hostname-override=10.10.3.179 \\
--network-plugin=cni \\
--kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \\
--config=/opt/kubernetes/cfg/kubelet-config.yml \\
--cert-dir=/opt/kubernetes/ssl \\
--pod-infra-container-image=harbor.superred.com/kubernetes/pause-amd64:3.0"
EOF
# node2
cat > /opt/kubernetes/cfg/kubelet.conf << EOF
KUBELET_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--hostname-override=10.10.3.197 \\
--network-plugin=cni \\
--kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \\
--config=/opt/kubernetes/cfg/kubelet-config.yml \\
--cert-dir=/opt/kubernetes/ssl \\
--pod-infra-container-image=harbor.superred.com/kubernetes/pause-amd64:3.0"
EOF
- –hostname-override:节点注册到集群时显示的主机名称,集群中需唯一
- –network-plugin:启用CNI
- –kubeconfig:空路径,会自动生成,后面用于连接apiserver
- –bootstrap-kubeconfig:首次启动向apiserver申请证书
- –config:配置参数文件
- –cert-dir:kubelet证书生成目录
- –pod-infra-container-image:管理Pod网络容器的镜像,用于实现Kubernetes集群里pod之间的网络通讯
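两个节点的kubelet.conf只有--hostname-override不同,可以用循环从同一模板生成,减少手工复制出错。下面是一个示意脚本(假设:为了能独立验证,先写入mktemp生成的临时目录;实际部署时把CFG_DIR换成/opt/kubernetes/cfg,并把文件名改回kubelet.conf):

```shell
# 按节点 IP 批量生成 kubelet 配置,模板中仅 --hostname-override 随节点变化
CFG_DIR=$(mktemp -d)
for NODE_IP in 10.10.3.179 10.10.3.197; do
  cat > "$CFG_DIR/kubelet-$NODE_IP.conf" << EOF
KUBELET_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--hostname-override=$NODE_IP \\
--network-plugin=cni \\
--kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \\
--config=/opt/kubernetes/cfg/kubelet-config.yml \\
--cert-dir=/opt/kubernetes/ssl \\
--pod-infra-container-image=harbor.superred.com/kubernetes/pause-amd64:3.0"
EOF
done
grep -- --hostname-override "$CFG_DIR"/kubelet-*.conf
```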
5.2.2 配置参数文件
# node1节点和node2节点配置相同
cat > /opt/kubernetes/cfg/kubelet-config.yml << EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
cgroupDriver: cgroupfs
clusterDNS:
- 10.0.0.2
clusterDomain: superred.com
failSwapOn: false
authentication:
anonymous:
enabled: false
webhook:
cacheTTL: 2m0s
enabled: true
x509:
clientCAFile: /opt/kubernetes/ssl/ca.pem
authorization:
mode: Webhook
webhook:
cacheAuthorizedTTL: 5m0s
cacheUnauthorizedTTL: 30s
evictionHard:
imagefs.available: 15%
memory.available: 100Mi
nodefs.available: 10%
nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
EOF
注意:cgroupDriver的取值(这里是cgroupfs)必须与docker info中显示的Cgroup Driver一致,否则kubelet无法正常启动
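这个一致性可以在部署前用脚本检查一遍。下面是一个示意(假设:用示例文件代替真实的/opt/kubernetes/cfg/kubelet-config.yml;真实环境中DOCKER_DRIVER一般可用`docker info -f '{{.CgroupDriver}}'`获取,此处为避免依赖docker直接写了示例值):

```shell
# 比对 kubelet-config.yml 的 cgroupDriver 与 docker 的 Cgroup Driver
YML=$(mktemp)
printf 'kind: KubeletConfiguration\ncgroupDriver: cgroupfs\n' > "$YML"
KUBELET_DRIVER=$(awk -F': *' '$1=="cgroupDriver"{print $2}' "$YML")
DOCKER_DRIVER=cgroupfs   # 示例值;实际从 docker info 获取
if [ "$KUBELET_DRIVER" = "$DOCKER_DRIVER" ]; then
  echo "cgroup driver 一致: $KUBELET_DRIVER"
else
  echo "cgroup driver 不一致: kubelet=$KUBELET_DRIVER docker=$DOCKER_DRIVER" >&2
fi
```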
5.2.3 生成bootstrap.kubeconfig文件
在k8s-master节点上,将node节点需要的CA证书文件拷贝过去:
root@localhost:/opt/kubernetes/cfg # scp /opt/kubernetes/ssl/ca.pem 10.10.3.179:/opt/kubernetes/ssl
ca.pem 100% 1359 656.7KB/s 00:00
root@localhost:/opt/kubernetes/cfg # scp /opt/kubernetes/ssl/ca.pem 10.10.3.197:/opt/kubernetes/ssl
在k8s-master上查看Token文件的随机值
[root@k8s-master ~]# cat /opt/kubernetes/cfg/token.csv
863b2ebebecffbb3a6493ff15dfc57c6,kubelet-bootstrap,10001,"system:kubelet-bootstrapper"
在k8s-master上生成bootstrap.kubeconfig文件
KUBE_APISERVER="https://10.10.3.139:6443" # apiserver的IP:PORT
TOKEN="863b2ebebecffbb3a6493ff15dfc57c6" # 必须与token.csv第一列的随机值保持一致
# 设置集群参数
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=bootstrap.kubeconfig
# 设置客户端认证参数
kubectl config set-credentials "kubelet-bootstrap" \
--token=${TOKEN} \
--kubeconfig=bootstrap.kubeconfig
# 设置上下文参数
kubectl config set-context default \
--cluster=kubernetes \
--user="kubelet-bootstrap" \
--kubeconfig=bootstrap.kubeconfig
# 设置默认上下文
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
# 保存到配置文件路径下
cp bootstrap.kubeconfig /opt/kubernetes/cfg/
# 拷贝到node节点的/opt/kubernetes/cfg/下
scp -rp /opt/kubernetes/cfg/bootstrap.kubeconfig 10.10.3.179:/opt/kubernetes/cfg
scp -rp /opt/kubernetes/cfg/bootstrap.kubeconfig 10.10.3.197:/opt/kubernetes/cfg
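TOKEN的取值容易与token.csv不一致,更稳妥的做法是直接从文件里读出第一列。下面用临时文件演示(实际使用时把路径换成/opt/kubernetes/cfg/token.csv):

```shell
# 从 token.csv 第一列读取 bootstrap token,避免手工粘贴出错
TOKEN_CSV=$(mktemp)
echo '863b2ebebecffbb3a6493ff15dfc57c6,kubelet-bootstrap,10001,"system:kubelet-bootstrapper"' > "$TOKEN_CSV"
TOKEN=$(awk -F, 'NR==1{print $1}' "$TOKEN_CSV")
echo "TOKEN=$TOKEN"
```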
5.2.4 配置systemd管理kubelet
在k8s-node1和k8s-node2节点上执行(留待后面新增节点时执行也可以)
# node1节点和node2节点配置相同
cat > /usr/lib/systemd/system/kubelet.service << EOF
[Unit]
Description=Kubernetes Kubelet
After=docker.service
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kubelet.conf
ExecStart=/opt/kubernetes/bin/kubelet \$KUBELET_OPTS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
5.2.5 启动并设置开机启动
systemctl daemon-reload
systemctl restart kubelet
systemctl enable kubelet
[root@localhost cfg]# systemctl status -ll kubelet
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2020-08-27 17:02:54 CST; 7s ago
Main PID: 22236 (kubelet)
CGroup: /system.slice/kubelet.service
└─22236 /opt/kubernetes/bin/kubelet --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --hostname-override=10.10.3.179 --network-plugin=cni --kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig --bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig --config=/opt/kubernetes/cfg/kubelet-config.yml --cert-dir=/opt/kubernetes/ssl --pod-infra-container-image=harbor.superred.com/kubernetes/pause-amd64:3.0
Aug 27 17:02:54 localhost.localdomain systemd[1]: Started Kubernetes Kubelet.
5.2.6 批准kubelet证书申请并加入集群
在k8s-master上执行(这里先手动批准,后期可配置为自动批准)
# 1. 查看kubelet证书请求
root@localhost:/opt/kubernetes/cfg # kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8 72s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending
# 2. 批准请求
root@localhost:/opt/kubernetes/cfg # kubectl certificate approve node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8
certificatesigningrequest.certificates.k8s.io/node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8 approved
root@localhost:/opt/kubernetes/cfg # kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8 2m33s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
# 3. 查看node节点状态
root@localhost:/opt/kubernetes/cfg # kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.10.3.179 NotReady <none> 35s v1.18.8
# 注:由于网络插件还没有部署,节点状态为 NotReady
5.4 部署kube-proxy
5.4.1 创建配置文件
# node1节点和node2节点配置相同
cat > /opt/kubernetes/cfg/kube-proxy.conf << EOF
KUBE_PROXY_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--config=/opt/kubernetes/cfg/kube-proxy-config.yml"
EOF
5.4.2 配置参数文件
# node1
cat > /opt/kubernetes/cfg/kube-proxy-config.yml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 10.10.3.179
clusterCIDR: 10.244.0.0/16
EOF
# node2
cat > /opt/kubernetes/cfg/kube-proxy-config.yml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 10.10.3.197
clusterCIDR: 10.244.0.0/16
EOF
kube-proxy配置中的clusterCIDR指定的是集群中Pod使用的网段;注意它与apiserver中--service-cluster-ip-range指定的Service Cluster IP网段不是同一个网段
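两个网段是否确实不重叠,可以用一个小函数粗略校验(假设:仅处理IPv4 CIDR、不校验输入合法性;10.0.0.0/24只是示例中的Service网段,请按实际--service-cluster-ip-range取值替换):

```shell
# 点分 IP 转 32 位整数
ip2int() { local IFS=. ; set -- $1; echo $(( ($1<<24)|($2<<16)|($3<<8)|$4 )); }

# 判断两个 CIDR 是否重叠:按较短前缀做掩码后比较网络号
overlap() {
  local n1=${1%/*} p1=${1#*/} n2=${2%/*} p2=${2#*/}
  local m=$(( p1<p2 ? p1 : p2 ))
  local mask=$(( 0xFFFFFFFF << (32-m) & 0xFFFFFFFF ))
  [ $(( $(ip2int "$n1") & mask )) -eq $(( $(ip2int "$n2") & mask )) ]
}

if overlap 10.244.0.0/16 10.0.0.0/24; then echo "网段重叠"; else echo "网段不重叠"; fi
```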
5.4.3 生成kube-proxy.kubeconfig文件
在k8s-master节点上生成kube-proxy证书:
# 切换到存放证书目录
cd ~/TLS/k8s/
# 创建证书请求文件
cat > kube-proxy-csr.json << EOF
{
"CN": "system:kube-proxy",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
# 生成证书
root@localhost:~/TLS/k8s # ls
apiserver.csr apiserver-csr.json apiserver-key.pem apiserver.pem ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem kube-proxy-csr.json
root@localhost:~/TLS/k8s # cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
2020/08/27 17:10:00 [INFO] generate received request
2020/08/27 17:10:00 [INFO] received CSR
2020/08/27 17:10:00 [INFO] generating key: rsa-2048
2020/08/27 17:10:00 [INFO] encoded CSR
2020/08/27 17:10:00 [INFO] signed certificate with serial number 414616310621739470117285557327250607193567259357
2020/08/27 17:10:00 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
root@localhost:~/TLS/k8s # ll kube-proxy*pem
-rw------- 1 root root 1.7K Aug 27 17:10 kube-proxy-key.pem
-rw-r--r-- 1 root root 1.4K Aug 27 17:10 kube-proxy.pem
在master节点生成kube-proxy.kubeconfig文件
KUBE_APISERVER="https://10.10.3.139:6443"
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy \
--client-certificate=./kube-proxy.pem \
--client-key=./kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
# 保存到配置文件路径下
cp kube-proxy.kubeconfig /opt/kubernetes/cfg/
# 拷贝到node节点的/opt/kubernetes/cfg/下
scp -rp /opt/kubernetes/cfg/kube-proxy.kubeconfig 10.10.3.179:/opt/kubernetes/cfg
scp -rp /opt/kubernetes/cfg/kube-proxy.kubeconfig 10.10.3.197:/opt/kubernetes/cfg
5.4.4 配置systemd管理kube-proxy
cat > /usr/lib/systemd/system/kube-proxy.service << EOF
[Unit]
Description=Kubernetes Proxy
After=network.target
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-proxy.conf
ExecStart=/opt/kubernetes/bin/kube-proxy \$KUBE_PROXY_OPTS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
5.4.5 启动并设置开机启动
systemctl daemon-reload
systemctl restart kube-proxy
systemctl enable kube-proxy
[root@localhost cfg]# systemctl status -ll kube-proxy
● kube-proxy.service - Kubernetes Proxy
Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2020-08-27 17:14:55 CST; 7s ago
Main PID: 25853 (kube-proxy)
CGroup: /system.slice/kube-proxy.service
└─25853 /opt/kubernetes/bin/kube-proxy --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --config=/opt/kubernetes/cfg/kube-proxy-config.yml
Aug 27 17:14:55 localhost.localdomain systemd[1]: Started Kubernetes Proxy.
六、部署CNI网络
6.1 Kubernetes网络模型 (CNI)介绍
CNI(Container Network Interface),即容器网络接口,是Kubernetes采用的容器网络插件标准。
kubernetes网络模型设计的基本要求:
- 一个pod一个ip
- 每个pod有独立的ip,pod内所有容器共享网络(同一个IP)
- 所有容器都可以与所有其他容器通信
- 所有节点都可以与所有容器通信
目前支持的技术
最常用的是flannel和calico
- Flannel:适合百台以下服务器,小规模集群,使用操作简单
- calico:适合数百台以上,大规模集群
下载最新版地址:https://github.com/containernetworking/plugins/releases/tag/v0.8.7
wget https://github.com/containernetworking/plugins/releases/download/v0.8.7/cni-plugins-linux-amd64-v0.8.7.tgz
# 解压二进制包并移动到默认工作目录
mkdir -p /opt/cni/bin
tar zxvf cni-plugins-linux-amd64-v0.8.7.tgz -C /opt/cni/bin
# 在node节点创建cni目录
mkdir -p /opt/cni/bin
# 在master节点上拷贝到node节点的cni目录
scp -rp /opt/cni/bin/* k8s-node1:/opt/cni/bin/
scp -rp /opt/cni/bin/* k8s-node2:/opt/cni/bin/
6.3 部署网络插件
6.3.1 部署CNI Flannel网络
所有节点都需要运行flannel,但它以DaemonSet方式部署,只需在k8s-master节点上kubectl apply一次即可;其余节点只要kubelet正常启动,就会自动在该节点上拉起一个flannel Pod
# 部署CNI网络(网络问题可以多尝试几次)
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# 默认镜像地址无法访问外网,可以修改为docker hub镜像仓库
# harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64
sed -ri "s#quay.io/coreos/flannel:.*-amd64#harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64#g" kube-flannel.yml
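批量替换前,可以先在一份最小样例文件上验证sed表达式是否符合预期:

```shell
# 在样例文件上预演镜像地址替换,确认正则匹配无误后再作用于 kube-flannel.yml
F_IMG=$(mktemp)
echo '        image: quay.io/coreos/flannel:v0.12.0-amd64' > "$F_IMG"
sed -ri "s#quay.io/coreos/flannel:.*-amd64#harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64#g" "$F_IMG"
cat "$F_IMG"
```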
# 添加 --iface=eth0 参数指定集群通信使用的网卡(多网卡环境建议显式指定)
...
containers:
- name: kube-flannel
image: harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth0
root@localhost:~/work/flannel # cat kube-flannel.yml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: psp.flannel.unprivileged
annotations:
seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
privileged: false
volumes:
- configMap
- secret
- emptyDir
- hostPath
allowedHostPaths:
- pathPrefix: "/etc/cni/net.d"
- pathPrefix: "/etc/kube-flannel"
- pathPrefix: "/run/flannel"
readOnlyRootFilesystem: false
# Users and groups
runAsUser:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
fsGroup:
rule: RunAsAny
# Privilege Escalation
allowPrivilegeEscalation: false
defaultAllowPrivilegeEscalation: false
# Capabilities
allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
defaultAddCapabilities: []
requiredDropCapabilities: []
# Host namespaces
hostPID: false
hostIPC: false
hostNetwork: true
hostPorts:
- min: 0
max: 65535
# SELinux
seLinux:
# SELinux is unused in CaaSP
rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
rules:
- apiGroups: ['extensions']
resources: ['podsecuritypolicies']
verbs: ['use']
resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-amd64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- amd64
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth0
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-arm64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- arm64
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-arm64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-arm64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-arm
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- arm
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-arm
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-arm
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-ppc64le
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- ppc64le
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-ppc64le
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-ppc64le
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-s390x
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- s390x
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-s390x
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-s390x
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
root@localhost:~/work/flannel # kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
root@localhost:~/work/flannel # kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-amd64-pnr7b 1/1 Running 0 29s
root@localhost:~/work/flannel # kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.10.3.179 Ready <none> 69s v1.18.8
6.3.2 部署CoreDNS
6.4 授权apiserver访问kubelet
cat > apiserver-to-kubelet-rbac.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:kube-apiserver-to-kubelet
rules:
- apiGroups:
- ""
resources:
- nodes/proxy
- nodes/stats
- nodes/log
- nodes/spec
- nodes/metrics
- pods/log
verbs:
- "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: system:kube-apiserver
namespace: ""
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:kube-apiserver-to-kubelet
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: User
name: kubernetes
EOF
kubectl apply -f apiserver-to-kubelet-rbac.yaml
root@localhost:~/work/flannel # kubectl get ClusterRoleBinding | grep "system:kube-apiserver"
system:kube-apiserver ClusterRole/system:kube-apiserver-to-kubelet 97s
root@localhost:~/work/flannel # kubectl get ClusterRole | grep "system:kube-apiserver-to-kubelet"
system:kube-apiserver-to-kubelet 2020-08-27T09:33:30Z
七、新增Worker Node
7.1. 拷贝已部署好的Node相关文件到新节点
在10.10.3.179节点上,将Worker Node涉及的文件拷贝到新节点10.10.3.197(其他新节点同理)
[root@localhost ~]# ls /opt/kubernetes/bin/
kubelet kube-proxy
[root@localhost ~]# ls /opt/kubernetes/cfg/
bootstrap.kubeconfig kubelet.conf kubelet-config.yml kubelet.kubeconfig kube-proxy.conf kube-proxy-config.yml kube-proxy.kubeconfig
[root@localhost ~]# ls /opt/kubernetes/ssl/
ca.pem kubelet-client-2020-08-27-17-05-24.pem kubelet-client-current.pem kubelet.crt kubelet.key
[root@localhost ~]# ls /opt/cni/bin/
bandwidth bridge dhcp firewall flannel host-device host-local ipvlan loopback macvlan portmap ptp sbr static tuning vlan
scp -r /opt/kubernetes 10.10.3.197:/opt
scp -r /usr/lib/systemd/system/{kubelet,kube-proxy}.service root@10.10.3.197:/usr/lib/systemd/system
scp -r /opt/cni root@10.10.3.197:/opt/
scp /opt/kubernetes/ssl/ca.pem root@10.10.3.197:/opt/kubernetes/ssl
7.2. 删除kubelet证书和kubeconfig文件
rm /opt/kubernetes/cfg/kubelet.kubeconfig
rm -f /opt/kubernetes/ssl/kubelet*
注:这几个文件是证书申请审批后自动生成的,每个Node不同,必须删除重新生成。
7.3. 修改主机名
vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=10.10.3.197
vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: 10.10.3.197
完整
[root@localhost cfg]# cat kubelet.conf
KUBELET_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/opt/kubernetes/logs \
--hostname-override=10.10.3.197 \
--network-plugin=cni \
--kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \
--config=/opt/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/opt/kubernetes/ssl \
--pod-infra-container-image=harbor.superred.com/kubernetes/pause-amd64:3.0"
[root@localhost cfg]# cat kube-proxy-config.yml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 10.10.3.197
clusterCIDR: 10.244.0.0/16
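上面两处IP也可以用sed批量替换,避免手工编辑遗漏。下面用临时目录演示(假设:演示文件只含关键行;实际在新节点上对/opt/kubernetes/cfg下的kubelet.conf和kube-proxy-config.yml执行):

```shell
# 把旧节点 IP 整体替换为新节点 IP
D_CFG=$(mktemp -d)
echo '--hostname-override=10.10.3.179 \' > "$D_CFG/kubelet.conf"
echo 'hostnameOverride: 10.10.3.179'     > "$D_CFG/kube-proxy-config.yml"
NEW_IP=10.10.3.197
sed -i "s/10\.10\.3\.179/$NEW_IP/g" "$D_CFG/kubelet.conf" "$D_CFG/kube-proxy-config.yml"
grep -h "$NEW_IP" "$D_CFG"/*
```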
7.4. 启动并设置开机启动
systemctl daemon-reload
systemctl start kubelet
systemctl enable kubelet
systemctl start kube-proxy
systemctl enable kube-proxy
7.5. 查看状态
[root@localhost cfg]# systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2020-08-28 10:07:02 CST; 9min ago
Main PID: 11212 (kubelet)
Tasks: 16
Memory: 19.1M
CGroup: /system.slice/kubelet.service
└─11212 /opt/kubernetes/bin/kubelet --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --hostname-override=10.10.3.197 --network-plugin=cni --kubeconfig...
Aug 28 10:07:02 localhost.localdomain systemd[1]: Started Kubernetes Kubelet.
[root@localhost cfg]# systemctl status kube-proxy
● kube-proxy.service - Kubernetes Proxy
Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2020-08-28 10:03:58 CST; 12min ago
Main PID: 11030 (kube-proxy)
Tasks: 15
Memory: 14.7M
CGroup: /system.slice/kube-proxy.service
└─11030 /opt/kubernetes/bin/kube-proxy --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --config=/opt/kubernetes/cfg/kube-proxy-config.yml
Aug 28 10:03:58 localhost.localdomain systemd[1]: Started Kubernetes Proxy.
Aug 28 10:03:58 localhost.localdomain kube-proxy[11030]: E0828 10:03:58.403733 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:03:59 localhost.localdomain kube-proxy[11030]: E0828 10:03:59.469988 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:04:01 localhost.localdomain kube-proxy[11030]: E0828 10:04:01.496776 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:04:06 localhost.localdomain kube-proxy[11030]: E0828 10:04:06.095214 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:04:14 localhost.localdomain kube-proxy[11030]: E0828 10:04:14.473297 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:04:31 localhost.localdomain kube-proxy[11030]: E0828 10:04:31.189822 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
注意
这里可能会有个报错导致启动失败:error: failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "kubelet-bootstrap" cannot create certificatesigningrequests.certificates.k8s.io at the cluster scope
原因是:kubelet-bootstrap并没有权限创建证书。所以要创建这个用户的权限并绑定到这个角色上。
解决方法是在master上执行kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
7.6. 在Master上批准新Node kubelet证书申请
root@localhost:~ # kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-rAPlSXBF-YLeLIvxDoLB6VKPaTHJ96s7JS5svj8spQw 10m kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending
kubectl certificate approve node-csr-rAPlSXBF-YLeLIvxDoLB6VKPaTHJ96s7JS5svj8spQw
root@localhost:~ # kubectl get node
NAME STATUS ROLES AGE VERSION
10.10.3.179 Ready <none> 16h v1.18.8
10.10.3.197 Ready <none> 12s v1.18.8
root@localhost:~ # kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-rAPlSXBF-YLeLIvxDoLB6VKPaTHJ96s7JS5svj8spQw 11m kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
八、用户授权
用户授权分为服务账号(ServiceAccount)和普通意义上的用户(User)。ServiceAccount由K8S管理;而User通常在外部管理,K8S不存储用户列表——也就是说,添加/编辑/删除用户都在外部进行,无需与K8S API交互。虽然K8S不管理用户,但在接收API请求时可以识别出发出请求的用户。实际上,所有对K8S的API请求都需要绑定身份信息(User或ServiceAccount),因此可以为User配置其在K8S集群中的请求权限
8.1 有什么区别?
最主要的区别上面已经说过了,即ServiceAccount是K8S内部资源,而User是独立于K8S之外的。从它们的本质可以看出:
- User通常是人来使用,而ServiceAccount是某个服务/资源/程序使用的
- User独立在K8S之外,也就是说User是可以作用于全局的,在任何命名空间都可被认知,并且需要在全局唯一
而ServiceAccount作为K8S内部的某种资源,存在于某个命名空间之中,不同命名空间中的同名ServiceAccount被认为是不同的资源
- K8S不会管理User,所以User的创建/编辑/注销等,需要依赖外部的管理机制,K8S所能认知的只有一个用户名;而ServiceAccount是由K8S管理的,创建等操作都通过K8S完成
这里说的添加用户指的是普通意义上的用户,即存在于集群外的用户,为k8s的使用者。 实际上叫做添加用户也不准确,用户早已存在,这里所做的只是使K8S能够认知此用户,并且控制此用户在集群内的权限
8.2 用户验证
尽管K8S认知用户靠的只是用户的名字,但是只需要一个名字就能请求K8S的API显然是不合理的,所以依然需要验证此用户的身份
在K8S中,有以下几种验证方式:
- X509客户端证书
客户端证书验证通过为API Server指定--client-ca-file=xxx选项启用,API Server通过此ca文件来验证API请求携带的客户端证书的有效性,一旦验证成功,API Server就会将客户端证书Subject里的CN属性作为此次请求的用户名
- 静态token文件
通过指定--token-auth-file=SOMEFILE选项来启用bearer token验证方式,引用的文件是一个包含 token,用户名,用户ID 的csv文件。请求时带上 Authorization: Bearer 31ada4fd-adec-460c-809a-9e56ceb75269 头信息即可通过bearer token验证
- 静态密码文件
通过指定--basic-auth-file=SOMEFILE选项启用密码验证,类似地,引用的文件是一个包含 密码,用户名,用户ID 的csv文件。请求时需要将Authorization头设置为 Basic BASE64ENCODED(USER:PASSWORD)
这里只介绍客户端验证
8.3 为用户生成证书
假设我们要创建两个用户:普通用户wubo(使用Role、RoleBinding授权)和管理员用户admin(使用ClusterRole、ClusterRoleBinding授权)
openssl方式
- 首先需要为此用户创建一个私钥:
openssl genrsa -out admin.key 2048
openssl genrsa -out wubo.key 2048
- 接着用此私钥创建一个csr(证书签名请求)文件,其中我们需要在subject里带上用户信息(CN为用户名,O为用户组);/O参数可以出现多次,即可以有多个用户组:
openssl req -new -key admin.key -out admin.csr -subj "/CN=admin/O=system"
openssl req -new -key wubo.key -out wubo.csr -subj "/CN=wubo/O=dev"
- 找到K8S集群(API Server)的CA证书文件,其位置取决于安装集群的方式,通常会在/etc/kubernetes/pki/路径下,有两个文件:CA证书(ca.crt)和CA私钥(ca.key)
- 通过集群的CA证书和之前创建的csr文件,为用户颁发证书;-CA和-CAkey参数指定集群CA证书所在位置,-days参数指定证书的过期时间(这里为365天):
openssl x509 -req -in admin.csr -CA path/to/ca.crt -CAkey path/to/ca.key -CAcreateserial -out admin.crt -days 365
openssl x509 -req -in wubo.csr -CA path/to/ca.crt -CAkey path/to/ca.key -CAcreateserial -out wubo.crt -days 365
- 最后将证书(admin.crt、wubo.crt)和私钥(admin.key、wubo.key)保存起来,这两组文件将被用来验证API请求
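签发完成后,建议用openssl检查证书能否被CA验证,以及Subject中的CN/O是否符合预期(K8S以CN作为用户名、O作为用户组)。下面的示例为了能独立运行,完整走了一遍自建CA的签发流程;实际使用时把ca.crt/ca.key换成集群CA即可:

```shell
# 自建一个演示 CA,为用户 wubo 签发证书,并验证签名与 Subject
W_CA=$(mktemp -d); cd "$W_CA"
openssl genrsa -out ca.key 2048 2>/dev/null
openssl req -x509 -new -key ca.key -subj "/CN=kubernetes-ca" -days 1 -out ca.crt
openssl genrsa -out wubo.key 2048 2>/dev/null
openssl req -new -key wubo.key -subj "/CN=wubo/O=dev" -out wubo.csr
openssl x509 -req -in wubo.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out wubo.crt -days 1 2>/dev/null
openssl verify -CAfile ca.crt wubo.crt     # 确认证书由该 CA 签发
openssl x509 -in wubo.crt -noout -subject  # 查看 CN/O
```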
或者cfssl 方式
- 首先需要为此用户创建一个请求文件,接着用此私钥创建一个csr(证书签名请求)文件,其中我们需要在subject里带上用户信息(CN为用户名,O为用户组)其中/O参数可以出现多次,即可以有多个用户组
root@localhost:~/TLS/k8s # cat admin-csr.json
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "admin",
"OU": "System"
}
]
}
root@localhost:~/TLS/k8s # cat wubo-csr.json
{
"CN": "wubo",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "wubo",
"OU": "dev"
}
]
}
找到K8S集群(API Server)的CA证书文件,其位置取决于安装集群的方式,这里在/root/TLS/k8s路径下,有三个文件:CA证书(ca.pem)、CA私钥(ca-key.pem)和配置文件ca-config.json
- 通过集群的CA证书和之前创建的csr文件,为用户颁发证书;-ca、-ca-key和-config参数指定集群CA证书与配置文件所在位置:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes wubo-csr.json | cfssljson -bare wubo
- 最后将证书(admin.pem、wubo.pem)和私钥(admin-key.pem、wubo-key.pem)保存起来,这两组文件将被用来验证API请求
root@localhost:~/TLS/k8s # ls admin*
admin.csr admin-csr.json admin-key.pem admin.pem
wubo.csr wubo-csr.json wubo-key.pem wubo.pem
8.4 创建用户,分user和serviceaccount两种,根据自己需求创建
参考https://blog.csdn.net/Michaelwubo/article/details/108319602
1)User:
创建namespaces
kubectl create namespace wubo
设置集群
kubectl config set-cluster kubernetes --server=https://10.10.3.139:6443 --certificate-authority=ca.pem --embed-certs=true --kubeconfig=wubo.kubeconfig
创建用户
kubectl config set-credentials wubo --client-certificate=wubo.pem --client-key=wubo-key.pem --embed-certs=true --kubeconfig=wubo.kubeconfig
设置上线文且指定namespace
kubectl config set-context wubo@kubernetes --cluster=kubernetes --user=wubo -n wubo --kubeconfig=wubo.kubeconfig
使用
kubectl config use-context wubo@kubernetes --kubeconfig=wubo.kubeconfig
或
2) ServiceAccount
Create the serviceaccount wubo under the chosen namespace
root@localhost:~/TLS/k8s # kubectl create serviceaccount wubo -n wubo
serviceaccount/wubo created
root@localhost:~/TLS/k8s # kubectl get serviceaccount -n wubo
NAME SECRETS AGE
default 1 34m
wubo 1 23s
root@localhost:~/TLS/k8s # kubectl describe serviceaccount wubo -n wubo
Name: wubo
Namespace: wubo
Labels: <none>
Annotations: <none>
Image pull secrets: <none>
Mountable secrets: wubo-token-4xhm6
Tokens: wubo-token-4xhm6
Events: <none>
root@localhost:~/TLS/k8s # kubectl describe secret wubo -n wubo
Name: wubo-token-4xhm6
Namespace: wubo
Labels: <none>
Annotations: kubernetes.io/service-account.name: wubo
kubernetes.io/service-account.uid: eae41b5e-2362-478f-8b6b-041c54cd5858
Type: kubernetes.io/service-account-token
Data
====
namespace: 4 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6Imtla2RhLS1SZ0NBUkVYTkY3eTZSWkE4UkNJVWJ2d0doS3NrYXpOOXlreWcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJ3dWJvIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6Ind1Ym8tdG9rZW4tNHhobTYiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoid3VibyIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImVhZTQxYjVlLTIzNjItNDc4Zi04YjZiLTA0MWM1NGNkNTg1OCIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDp3dWJvOnd1Ym8ifQ.Gz985FdIBlZ-qeOMhdCnr34XtGNoDF_CA8p2Pv4n6re5qaOp9CmudDfvm7PohRRpuWZiTQZU3fgYhkaX9D5lpJ_8WktMGvApC4hXQR2v0_-AGVp_UcVQ2-ZR2-4UQbTRghaO8iwY2czr-2ULqGk4mtuGAnStc6TSRVeADwh1oRKAmL5f27UGEIQ1pFYqdJEiUCPGwcLTbhtZH-jBnVf-g8YCyhpgy1f8KrAmaFUdg8r6sCADn5JYuklBkiuQs2HcaitAHETnGR1d0uVOn48CO5LEjD_aFI7eBwaSGGetxmCq8rlAKQuPVnZTsqvaQNq0iVK4ogX8lOsTgfp94_8GuQ
ca.crt: 1359 bytes
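The token shown above is a JWT: its middle segment is a base64url-encoded JSON payload carrying the service-account identity. A minimal Python sketch of how the claims are laid out; it builds a local sample token for illustration, since a real one is signed by the cluster:

```python
import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    # JWTs use URL-safe base64 without padding; restore the padding first.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Sample claims mirroring the serviceaccount created above.
claims = {
    "iss": "kubernetes/serviceaccount",
    "kubernetes.io/serviceaccount/namespace": "wubo",
    "sub": "system:serviceaccount:wubo:wubo",
}
seg = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
fake_token = f"header.{seg}.signature"  # a real token has a signed header/signature

print(decode_jwt_payload(fake_token)["sub"])  # system:serviceaccount:wubo:wubo
```

The `sub` claim is exactly the `system:serviceaccount:<namespace>:<name>` identity that RBAC bindings for ServiceAccounts refer to.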
8.5 Grant the user (User or ServiceAccount) role-based access control (RBAC)
Roles
RBAC has two kinds of role: the ordinary Role and the ClusterRole. A ClusterRole is a special Role; compared with a Role:
- A Role belongs to a single namespace, while a ClusterRole applies to the whole cluster, including all namespaces
- A ClusterRole can grant cluster-scoped permissions, such as managing node resources, non-resource endpoints (e.g. "/healthz"), or resources across all namespaces (via --all-namespaces)
Bind a role to the user (create the user first)
(1) Role --> User --> RoleBinding
First create a role; an ordinary Role is tied to a namespace:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
namespace: wubo
name: wubo-role
rules:
- apiGroups: [""]
resources: ["*"]
verbs: ["*"]
or, equivalently:
kubectl create role wubo-role --verb="*" --resource="*" -n wubo
root@localhost:~/TLS/k8s # kubectl get role -n wubo
NAME CREATED AT
wubo-role 2020-08-31T03:37:23Z
This creates an administrator-style role named wubo-role in the wubo namespace. wubo-role is only an example: if all you want is to grant a user admin rights over one namespace, there is no need to create a new role, because K8S ships with a built-in ClusterRole named admin. The admin role is scoped to a single namespace, while cluster-admin covers all namespaces with a wider scope and more power. A User can be bound not only to an ordinary Role but also to a ClusterRole.
root@localhost:~/TLS/k8s # kubectl get clusterrole | grep admin
admin 2020-08-27T09:29:19Z
cluster-admin 2020-08-27T09:29:19Z
Bind the role to the user (User)
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: wubo-binding
namespace: wubo
subjects:
- kind: User
name: wubo
apiGroup: ""
roleRef:
kind: Role
name: wubo-role
apiGroup: ""
or, equivalently:
kubectl create rolebinding wubo-binding --role=wubo-role --user=wubo -n wubo
You can also reference a cluster role scoped to a single namespace:
kubectl create rolebinding wubo-binding --clusterrole=wubo-role --user=wubo -n wubo
As the yaml shows, the RoleBinding resource establishes a Role-User relationship: the roleRef node names the role being bound, and the subjects node lists the grantees, which can be a User or the ServiceAccount discussed earlier; here it contains only the user wubo
root@localhost:~/TLS/k8s # kubectl get rolebinding -n wubo
NAME ROLE AGE
wubo-binding Role/wubo-role 6s
Another way to add a namespace administrator
As mentioned, K8S has a built-in ClusterRole named admin, so we do not actually need to create an admin Role; just add a RoleBinding against the default admin ClusterRole:
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: wubo-binding
namespace: wubo
subjects:
- kind: User
name: wubo
apiGroup: ""
roleRef:
kind: ClusterRole
name: admin
apiGroup: ""
kubectl create rolebinding wubo-binding --clusterrole=admin --user=wubo -n wubo
root@localhost:~/TLS/k8s # kubectl get rolebinding -n wubo
NAME ROLE AGE
wubo-binding ClusterRole/admin 3s
Binding a ClusterRole with a RoleBinding:
Suppose there are 10 namespaces and each needs an administrator with identical permissions. Defining those administrators with plain Roles means creating 10 Roles, which is tedious. When a RoleBinding binds a ClusterRole instead, the User gains that ClusterRole's permissions only within the current namespace, not across all namespaces, so a single ClusterRole can replace those 10 Roles.
Note: whether bound with --role or --clusterrole, the result is still a role scoped to one namespace; using the ClusterRole is a convenience that avoids creating an identical Role in every namespace.
wubo/admin now has access;
The Role and RoleBinding operations above only take effect in the current namespace;
Although the admin ClusterRole is referenced here, its permissions are confined to the namespace holding the RoleBinding, i.e. wubo
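The "one ClusterRole, many namespaces" pattern can be sketched in Python for illustration: generate one RoleBinding manifest per namespace, all pointing at the same ClusterRole (the team-N namespaces are hypothetical):

```python
import json

def rolebinding(ns, user, clusterrole):
    """Build a RoleBinding manifest referencing a ClusterRole.

    Bound through a RoleBinding, the ClusterRole's rules apply only inside `ns`.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{user}-binding", "namespace": ns},
        "subjects": [{"kind": "User", "name": user,
                      "apiGroup": "rbac.authorization.k8s.io"}],
        "roleRef": {"kind": "ClusterRole", "name": clusterrole,
                    "apiGroup": "rbac.authorization.k8s.io"},
    }

# Ten hypothetical namespaces, one shared ClusterRole instead of ten Roles.
namespaces = [f"team-{i}" for i in range(10)]
manifests = [rolebinding(ns, "wubo", "admin") for ns in namespaces]
print(json.dumps(manifests[0], indent=2))
```

Each generated manifest could be applied with `kubectl apply -f -`; only the metadata.namespace differs between them.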
To add an administrator for all namespaces, i.e. the whole cluster, use the cluster-admin role
So far we have:
- Given the wubo user X509-certificate-based authentication
- Created a role named wubo-role in the wubo namespace
- Bound the user wubo to the role wubo-role
Configuring kubectl for the user
wubo/admin is now an administrator. To operate the cluster through kubectl as admin/wubo, the wubo/admin credentials must be added to kubectl's configuration, i.e. ~/.kube/config
This assumes the k8s cluster itself is already configured in config
- Add the credentials:
kubectl config set-credentials wubo --client-certificate=path/to/wubo.pem --client-key=path/to/wubo-key.pem
This adds a user named wubo to the config (use admin.pem/admin-key.pem for the admin user).
kubectl config set-context wubo@kubernetes --cluster=kubernetes --namespace=wubo --user=wubo
This adds a context: use the kubernetes cluster, default to the wubo namespace, and authenticate as user wubo.
- Pass the context on the command line:
kubectl --context=wubo@kubernetes ...
tells kubectl to operate the cluster with the context added above. Alternatively, activate the context with:
kubectl config use-context wubo@kubernetes
Tips: embedding the credentials in kubectl's config
A user added with kubectl config set-credentials references the certificate files by path, which appears in ~/.kube/config as:
users:
- name: wubo|admin
user:
client-certificate: path/to/wubo|admin.crt
client-key: path/to/wubo|admin.key
If carrying the two certificate files around is inconvenient, the certificate contents can be embedded directly in the config file:
- BASE64-encode wubo.pem/wubo-key.pem (or admin.pem/admin-key.pem):
cat wubo.pem | base64 --wrap=0
cat wubo-key.pem | base64 --wrap=0
- Paste the encoded text into the config file:
users:
- name: wubo
user:
client-certificate-data: ...
client-key-data: ...
This removes the need for the certificate and key files, though it is still wise to keep both files somewhere safe
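The `base64 --wrap=0` step can be mimicked in Python; the point is that the `client-certificate-data`/`client-key-data` fields must hold a single unwrapped base64 line. The certificate below is a placeholder, not a real PEM:

```python
import base64

def embed(pem_bytes):
    """Equivalent of `base64 --wrap=0`: one unwrapped base64 line,
    as required by the kubeconfig *-data fields."""
    return base64.b64encode(pem_bytes).decode()

# Placeholder PEM content for illustration only.
cert = b"-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n"
data = embed(cert)
print(data[:24])
```

Decoding the field with `base64.b64decode(data)` yields the original file contents, which is how kubectl reads the embedded material back.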
https://zhuanlan.zhihu.com/p/43237959
9. Deploy Dashboard, CoreDNS and metrics-server
9.1 Deploy Dashboard
https://github.com/kubernetes/dashboard
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.3/aio/deploy/recommended.yaml
By default the Dashboard is only reachable from inside the cluster; change its Service to the NodePort type to expose it externally
# Edit the yaml file
[root@k8s-master yaml]# vim recommended.yaml (the Service block is around line 32)
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
ports:
- port: 443
targetPort: 8443
nodePort: 30001 # added, along with type: NodePort below
type: NodePort
selector:
k8s-app: kubernetes-dashboard
The full file:
root@localhost:~/work/dashboard # cat recommended.yaml
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: v1
kind: Namespace
metadata:
name: kubernetes-dashboard
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
type: NodePort
ports:
- port: 443
targetPort: 8443
nodePort: 30001
selector:
k8s-app: kubernetes-dashboard
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-certs
namespace: kubernetes-dashboard
type: Opaque
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-csrf
namespace: kubernetes-dashboard
type: Opaque
data:
csrf: ""
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-key-holder
namespace: kubernetes-dashboard
type: Opaque
---
kind: ConfigMap
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-settings
namespace: kubernetes-dashboard
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
rules:
# Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
resources: ["secrets"]
resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
verbs: ["get", "update", "delete"]
# Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kubernetes-dashboard-settings"]
verbs: ["get", "update"]
# Allow Dashboard to get metrics.
- apiGroups: [""]
resources: ["services"]
resourceNames: ["heapster", "dashboard-metrics-scraper"]
verbs: ["proxy"]
- apiGroups: [""]
resources: ["services/proxy"]
resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
verbs: ["get"]
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
rules:
# Allow Metrics Scraper to get metrics from the Metrics server
- apiGroups: ["metrics.k8s.io"]
resources: ["pods", "nodes"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kubernetes-dashboard
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kubernetes-dashboard
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
spec:
containers:
- name: kubernetes-dashboard
image: harbor.superred.com/kubernetes/kubernetesui/dashboard:v2.0.3
imagePullPolicy: Always
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
- --namespace=kubernetes-dashboard
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
volumeMounts:
- name: kubernetes-dashboard-certs
mountPath: /certs
# Create on-disk volume to store exec logs
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
scheme: HTTPS
path: /
port: 8443
initialDelaySeconds: 30
timeoutSeconds: 30
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsUser: 1001
runAsGroup: 2001
volumes:
- name: kubernetes-dashboard-certs
secret:
secretName: kubernetes-dashboard-certs
- name: tmp-volume
emptyDir: {}
serviceAccountName: kubernetes-dashboard
nodeSelector:
"kubernetes.io/os": linux
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
---
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: dashboard-metrics-scraper
name: dashboard-metrics-scraper
namespace: kubernetes-dashboard
spec:
type: NodePort
ports:
- port: 8000
targetPort: 8000
nodePort: 30002
selector:
k8s-app: dashboard-metrics-scraper
---
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: dashboard-metrics-scraper
name: dashboard-metrics-scraper
namespace: kubernetes-dashboard
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: dashboard-metrics-scraper
template:
metadata:
labels:
k8s-app: dashboard-metrics-scraper
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'runtime/default'
spec:
containers:
- name: dashboard-metrics-scraper
image: harbor.superred.com/kubernetes/kubernetesui/metrics-scraper:v1.0.4
ports:
- containerPort: 8000
protocol: TCP
livenessProbe:
httpGet:
scheme: HTTP
path: /
port: 8000
initialDelaySeconds: 30
timeoutSeconds: 30
volumeMounts:
- mountPath: /tmp
name: tmp-volume
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsUser: 1001
runAsGroup: 2001
serviceAccountName: kubernetes-dashboard
nodeSelector:
"kubernetes.io/os": linux
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
volumes:
- name: tmp-volume
emptyDir: {}
# Create the dashboard resources
kubectl apply -f recommended.yaml
root@localhost:~/work/dashboard # kubectl apply -f recommended.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
root@localhost:~/work/dashboard # kubectl get pods,svc -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-775b89678b-2pnr9 1/1 Running 0 10s
pod/kubernetes-dashboard-66d54d4cd7-cfzmx 1/1 Running 0 10s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper NodePort 10.0.0.135 <none> 8000:30002/TCP 10s
service/kubernetes-dashboard NodePort 10.0.0.89 <none> 443:30001/TCP 10s
Check the pod logs
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-amd64-pnr7b 1/1 Running 0 8m45s
root@localhost:~/work/dashboard # kubectl -n kube-system logs -f kube-flannel-ds-amd64-pnr7b
I0827 09:30:18.812198 1 main.go:518] Determining IP address of default interface
I0827 09:30:18.812898 1 main.go:531] Using interface with name eth0 and address 10.10.3.179
I0827 09:30:18.812930 1 main.go:548] Defaulting external address to interface address (10.10.3.179)
W0827 09:30:18.812969 1 client_config.go:517] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0827 09:30:19.008200 1 kube.go:119] Waiting 10m0s for node controller to sync
I0827 09:30:19.008490 1 kube.go:306] Starting kube subnet manager
I0827 09:30:20.009134 1 kube.go:126] Node controller sync successful
I0827 09:30:20.009271 1 main.go:246] Created subnet manager: Kubernetes Subnet Manager - 10.10.3.179
I0827 09:30:20.009279 1 main.go:249] Installing signal handlers
I0827 09:30:20.009465 1 main.go:390] Found network config - Backend type: vxlan
I0827 09:30:20.009613 1 vxlan.go:121] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
I0827 09:30:20.107538 1 main.go:355] Current network or subnet (10.244.0.0/16, 10.244.0.0/24) is not equal to previous one (0.0.0.0/0, 0.0.0.0/0), trying to recycle old iptables rules
I0827 09:30:20.115034 1 iptables.go:167] Deleting iptables rule: -s 0.0.0.0/0 -d 0.0.0.0/0 -j RETURN
I0827 09:30:20.209595 1 iptables.go:167] Deleting iptables rule: -s 0.0.0.0/0 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0827 09:30:20.211834 1 iptables.go:167] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -j RETURN
I0827 09:30:20.213233 1 iptables.go:167] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -j MASQUERADE --random-fully
I0827 09:30:20.214655 1 main.go:305] Setting up masking rules
I0827 09:30:20.215906 1 main.go:313] Changing default FORWARD chain policy to ACCEPT
I0827 09:30:20.216037 1 main.go:321] Wrote subnet file to /run/flannel/subnet.env
I0827 09:30:20.216045 1 main.go:325] Running backend.
I0827 09:30:20.216061 1 main.go:343] Waiting for all goroutines to exit
I0827 09:30:20.216206 1 vxlan_network.go:60] watching for new subnet leases
I0827 09:30:20.406745 1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0827 09:30:20.406798 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0827 09:30:20.409981 1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0827 09:30:20.410022 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0827 09:30:20.410667 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0827 09:30:20.507534 1 iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/24 -j RETURN
I0827 09:30:20.508204 1 iptables.go:167] Deleting iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0827 09:30:20.510628 1 iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
I0827 09:30:20.606393 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0827 09:30:20.607101 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0827 09:30:20.610845 1 iptables.go:155] Adding iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0827 09:30:20.806505 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0827 09:30:20.908327 1 iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/24 -j RETURN
I0827 09:30:20.914368 1 iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
Find which node kubernetes-dashboard is running on
# then browse to that node's IP
URL: https://NodeIP:30001
root@localhost:/opt/kubernetes/cfg # kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system kube-flannel-ds-amd64-pnr7b 1/1 Running 0 14m 10.10.3.179 10.10.3.179 <none> <none>
kubernetes-dashboard dashboard-metrics-scraper-775b89678b-2pnr9 1/1 Running 0 6m6s 10.244.0.3 10.10.3.179 <none> <none>
kubernetes-dashboard kubernetes-dashboard-66d54d4cd7-cfzmx 1/1 Running 0 6m6s 10.244.0.2 10.10.3.179 <none> <none>
Create a service account and bind it to the default cluster-admin ClusterRole:
root@localhost:/opt/kubernetes/cfg # kubectl create serviceaccount dashboard-admin -n kube-system
serviceaccount/dashboard-admin created
root@localhost:/opt/kubernetes/cfg # kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created
root@localhost:/opt/kubernetes/cfg # kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Name: dashboard-admin-token-jxr2g
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name: dashboard-admin
kubernetes.io/service-account.uid: 6c8e994b-c140-4b7c-8085-8955cb016f12
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1359 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6Imtla2RhLS1SZ0NBUkVYTkY3eTZSWkE4UkNJVWJ2d0doS3NrYXpOOXlreWcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tanhyMmciLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiNmM4ZTk5NGItYzE0MC00YjdjLTgwODUtODk1NWNiMDE2ZjEyIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.DL-hoDCeLhB--edIVhYe021ev27KvD4LMlCrwD3gHdcD7Pc8L3iktZHb6DRE8f6DZ_p2_HshmZKRvNZVljd9lUSJzolw4DHh6_o5sibeeS1eADQqeg3m6fbUga9nuY3LlnYvSd0Q9QAZEHpiUWXUkpY9-S72BW2GLvJIYpd9xJrBUwDsxR5SRRWdv3iDBisSO8kC7usDdlgJvqvy7qqiu9QyLXwK3TgfXk4c-CFBuq5ZsmolqVT5MctF9B66L_9BLuNYeCiJUVd328Y-vgievIE9lN3RfG4fvxbM6KBkmwVgHA63RIi0f8ftxDsZ09MsKjOm0FVivmBDb9qTpfMzEQ
Log in to the Dashboard with the token printed above.


Logging in to the Dashboard with a generated kubeconfig
1. Find dashboard-admin-token-jxr2g
root@localhost:~ # kubectl get secret -n kube-system
NAME TYPE DATA AGE
dashboard-admin-token-jxr2g kubernetes.io/service-account-token 3 28m
default-token-28k9v kubernetes.io/service-account-token 3 41m
flannel-token-ss9zn kubernetes.io/service-account-token 3 40m
2. Extract the token
DASH_TOKEN=$(kubectl get secret -n kube-system dashboard-admin-token-jxr2g -o jsonpath={.data.token} | base64 -d)
3. Set the cluster
kubectl config set-cluster kubernetes --server=https://10.10.3.139:6443 --certificate-authority=/opt/kubernetes/ssl/ca.pem --embed-certs=true --kubeconfig=/root/dashbord-admin.conf
4. Set the user with the token
kubectl config set-credentials dashboard-admin --token=$DASH_TOKEN --kubeconfig=/root/dashbord-admin.conf
5. Set and use the context
kubectl config set-context dashboard-admin@kubernetes --cluster=kubernetes --user=dashboard-admin --kubeconfig=/root/dashbord-admin.conf
kubectl config use-context dashboard-admin@kubernetes --kubeconfig=/root/dashbord-admin.conf
The generated dashbord-admin.conf can now be used to log in to the dashboard


9.2 Deploy CoreDNS
Download the files
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh
The config files at https://github.com/coredns/deployment/tree/master/kubernetes are mainly deploy.sh and coredns.yaml.sed. Since we are not migrating from kube-dns to coredns, comment out the kubectl-related steps and set REVERSE_CIDRS, CLUSTER_DOMAIN (DNS_DOMAIN), CLUSTER_DNS_IP etc. to their actual values; concretely: ./deploy.sh -s -r 10.254.0.0/16 -i 10.254.0.10 -d superred.com > coredns.yaml
CLUSTER_DNS_IP=$(kubectl get service --namespace kube-system kube-dns -o jsonpath="{.spec.clusterIP}")
# Download coredns
git clone https://github.com/coredns/deployment.git
cd deployment/kubernetes/
# Edit the deploy script
vim deploy.sh
if [[ -z $CLUSTER_DNS_IP ]]; then
  # Default IP to kube-dns IP
  # CLUSTER_DNS_IP=$(kubectl get service --namespace kube-system kube-dns -o jsonpath="{.spec.clusterIP}")
  CLUSTER_DNS_IP=10.10.0.2
fi
# Run the deployment
yum -y install epel-release jq
./deploy.sh | kubectl apply -f -
root@localhost:~/work/coredns # ./deploy.sh -s -r 10.0.0.0/16 -i 10.0.0.2 -d superred.com> coredns.yaml
root@localhost:~/work/coredns # diff coredns.yaml.sed coredns.yaml
55c55
< kubernetes CLUSTER_DOMAIN REVERSE_CIDRS {
---
> kubernetes superred.com 10.0.0.0/16 {
59c59
< forward . UPSTREAMNAMESERVER {
---
> forward . /etc/resolv.conf {
66c66
< }STUBDOMAINS
---
> }
181c181
< clusterIP: CLUSTER_DNS_IP
---
> clusterIP: 10.0.0.2
root@localhost:~/work/coredns # kubectl apply -f coredns.yaml
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
Run an nginx example
Environment before running
root@localhost:~/work/dashboard # kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.10.3.179 Ready <none> 49m v1.18.8
root@localhost:~/work/dashboard # kubectl get rc
No resources found in default namespace.
root@localhost:~/work/dashboard # kubectl get pods
Run on the master
kubectl run nginx-test --image=nginx --port=80
root@localhost:~/work/dashboard # kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-test 1/1 Running 0 21s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 52m
Create an external service for nginx
root@localhost:~/work/dashboard # kubectl expose pod nginx-test --port=80 --target-port=80 --type=NodePort
service/nginx-test exposed
root@localhost:~/work/dashboard # kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-test 1/1 Running 0 94s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 53m
service/nginx-test NodePort 10.0.0.66 <none> 80:27022/TCP 3

or
root@localhost:~/work/dashboard # kubectl expose pod nginx-test --port=80 --target-port=80 --type=LoadBalancer
service/nginx-test exposed
root@localhost:~/work/dashboard # kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 54m
nginx-test LoadBalancer 10.0.0.130 <pending> 80:62931/TCP 8s
root@localhost:~/work/dashboard # kubectl describe svc nginx-test
Name: nginx-test
Namespace: default
Labels: run=nginx-test
Annotations: <none>
Selector: run=nginx-test
Type: LoadBalancer
IP: 10.0.0.130
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: <unset> 62931/TCP
Endpoints: 10.244.0.8:80
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>

The only difference is --type=, which supports "ClusterIP", "ExternalName", "LoadBalancer" and "NodePort"
If a firewall is running
Open the corresponding NodePort on the node1 node
firewall-cmd --zone=public --add-port=62931/tcp --permanent
firewall-cmd --reload
Test with dnstools; note: resolve the Service name
root@localhost:~/work/coredns # kubectl run -it --rm --restart=Never --image=harbor.superred.com/kubernetes/infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# nslookup nginx-test
Server: 10.0.0.2
Address: 10.0.0.2#53
Name: nginx-test.default.svc.superred.com
Address: 10.0.0.130
dnstools# nslookup kubernetes
Server: 10.0.0.2
Address: 10.0.0.2#53
Name: kubernetes.default.svc.superred.com
Address: 10.0.0.1
dnstools#
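The names dnstools resolves above follow a fixed pattern: `<service>.<namespace>.svc.<cluster-domain>`. A tiny Python sketch of the composition, using this cluster's superred.com domain:

```python
def service_fqdn(service, namespace="default", cluster_domain="superred.com"):
    """Compose the in-cluster DNS name CoreDNS answers for a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"

print(service_fqdn("nginx-test"))   # nginx-test.default.svc.superred.com
print(service_fqdn("kubernetes"))   # kubernetes.default.svc.superred.com
```

Pods in the same namespace can use the bare service name; the search list in /etc/resolv.conf expands it to this FQDN.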
9.3 Deploy metrics-server
https://yasongxu.gitbook.io/container-monitor/yi-.-kai-yuan-fang-an/di-1-zhang-cai-ji/heapster
https://www.cnblogs.com/centos-python/articles/10921991.html
https://www.cnblogs.com/linuxk/p/10582534.html
https://www.cnblogs.com/yuezhimi/p/11017155.html
https://www.cnblogs.com/cuishuai/p/9857120.html
Monitoring stacks: cAdvisor+Heapster+InfluxDB+Grafana, metrics-server, prometheus, k8s-prometheus-adapter
Monitoring options
| Stack | Supported | Characteristics | Scope |
|---|---|---|---|
| cAdvisor+Heapster+InfluxDB+Grafana | Y | Simple | Container monitoring |
| cAdvisor/exporter+Prometheus+Grafana | Y | Highly extensible | Containers, applications and hosts |
Prometheus+Grafana is the rising star among monitoring and alerting stacks.
Exporters of various kinds collect metrics across different dimensions and expose them in Prometheus's data format; Prometheus pulls the data periodically, Grafana displays it, and AlertManager raises alerts on anomalies.

cadvisor collects container and Pod performance metrics and exposes them at /metrics for prometheus to scrape
prometheus-node-exporter collects host performance metrics and exposes them at /metrics for prometheus to scrape
Applications collect and expose their own process metrics (the application implements the endpoint and adds the platform's agreed annotation; the platform then configures Prometheus scraping based on the annotation)
kube-state-metrics collects state metrics for k8s resource objects and exposes them at /metrics for prometheus to scrape
etcd, kubelet, kube-apiserver, kube-controller-manager and kube-scheduler expose their own /metrics endpoints for node- and cluster-level feature metrics.
Implementation outline
| Metric | Implementation | Examples |
|---|---|---|
| Pod performance | cAdvisor | Container CPU and memory utilization |
| Node performance | node-exporter | Node CPU and memory utilization |
| K8S resource objects | kube-state-metrics | Pod/Deployment/Service |
metric-server gathers data for consumers inside the k8s cluster, such as kubectl, the HPA and the scheduler
prometheus-operator deploys prometheus, which stores the monitoring data
kube-state-metrics collects data on k8s resource objects
node_exporter collects data from each node in the cluster
prometheus scrapes data from the apiserver, scheduler, controller-manager and kubelet
alertmanager implements monitoring alerts
grafana implements data visualization

Heapster vs metrics-server
Heapster
Heapster is a Kubernetes project. Heapster is an aggregator, not a collector. The collection flow:
1. Heapster first gets the list of all Nodes in the cluster from the apiserver.
2. It pulls useful data from the kubelet on each Node; the kubelet's own data comes from cAdvisor.
3. All of the data is pushed to Heapster's configured backend storage, with optional visualization.
- 1. Heapster collects the cAdvisor data on each Node: CPU, memory, network and disk
- 2. It aggregates the cAdvisor data across all Nodes
- 3. It groups resources by kubernetes type, e.g. Pod and Namespace
- 4. The default metric aggregation interval is 1 minute. Data can also be exported to third-party tools: ElasticSearch, InfluxDB, Kafka, Graphite
- 5. Display: Grafana or Google Cloud Monitoring
Status
heapster has been officially deprecated (as of k8s 1.11, the HPA no longer reads from heapster). CPU/memory and HPA metrics: moved to metrics-server. Base monitoring: folded into prometheus; the kubelet exposes metric info in prometheus format, used with the Prometheus Operator. Event monitoring: integrated with https://github.com/heptiolabs/eventrouter
metrics-server:
Since v1.8, resource usage monitoring is available through the Metrics API, implemented by the Metrics Server component, which replaces heapster; heapster has been phased out since 1.11.
Metrics-Server aggregates the cluster's core monitoring data. Since Kubernetes 1.8 it has been deployed by default as a Deployment object in clusters created by the kube-up.sh script.
Metrics API
Before introducing Metrics-Server, the Metrics API concept deserves a mention
Compared with the previous collection approach (heapster), the Metrics API is a new line of thinking: the project wants core-metric monitoring to be stable, versioned, and directly accessible to users (e.g. via the kubectl top command) or to in-cluster controllers (such as the HPA), just like other Kubernetes APIs.
Deprecating heapster makes core resource monitoring a first-class citizen: reachable directly through the api-server or a client, like pods and services are, rather than aggregated by a separately installed and separately managed heapster.
Suppose we collect 10 metrics per pod and per node. Since k8s 1.6 a cluster supports 5000 nodes with 30 pods each; at a collection granularity of one sample per minute, that works out to roughly 26,000 samples per second.
Because the api-server persists all of its data to etcd, k8s itself clearly cannot handle collection at this rate; moreover, this monitoring data changes quickly and is all ephemeral, so a dedicated component is needed to handle it, with k8s keeping only part of it in memory. Thus the metric-server concept was born.
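The collection-rate estimate above (5000 nodes x 30 pods each, 10 metrics per object, sampled once a minute) works out as follows; a rough Python calculation:

```python
nodes = 5000
pods_per_node = 30
metrics_per_object = 10
interval_s = 60  # one sample per object per minute

objects = nodes + nodes * pods_per_node       # every node plus every pod
samples_per_min = objects * metrics_per_object
samples_per_sec = samples_per_min / interval_s

print(objects)                 # monitored objects
print(samples_per_min)         # samples per minute
print(round(samples_per_sec))  # samples per second
```

That is 155,000 objects and about 1.55 million samples a minute, far too chatty to push through etcd, which is exactly why the data is held in memory instead.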
heapster did in fact expose an api, but users and other Kubernetes components could only reach it through the master proxy, and unlike the api-server its interface lacked full authentication and client integration. The api was still in the alpha stage (as of August 2018), aiming for GA. It is written in api-server style: generic apiserver
With the Metrics Server component in place, collecting the needed data and exposing an api, the apis still had to be unified: how are requests to the api-server's /apis/metrics forwarded to the Metrics Server? The answer is kube-aggregator, completed in k8s 1.7; Metrics Server's release had been held up on exactly this step.
kube-aggregator (API aggregation) mainly provides:
-
Provide an API for registering API servers. -
Summarize discovery information from all the servers. -
Proxy client requests to individual servers.
Detailed design document: see the linked reference
Using the metric api:
- The Metrics API only serves current readings and keeps no history
- The Metrics API URI is /apis/metrics.k8s.io/, maintained under k8s.io/metrics
- metrics-server must be deployed for this API to work; metrics-server fetches its data by calling the Kubelet Summary API
For example:
http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes
http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes/<node-name>
http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>
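For illustration, a small Python helper that builds these paths (the function names are placeholders of our own, not a client library API):

```python
BASE = "/apis/metrics.k8s.io/v1beta1"

def node_metrics_path(node=None):
    """Path for NodeMetrics: all nodes, or one node by name."""
    return f"{BASE}/nodes/{node}" if node else f"{BASE}/nodes"

def pod_metrics_path(namespace, pod):
    """Path for the PodMetrics of a single pod."""
    return f"{BASE}/namespaces/{namespace}/pods/{pod}"

print(node_metrics_path("10.10.3.179"))
print(pod_metrics_path("wubo", "nginx-test"))
```

These are the same paths served through `kubectl proxy` on 127.0.0.1:8001 in the examples above.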
Metrics-Server
The Metrics server periodically scrapes metrics from the Kubelet Summary API (e.g. /api/v1/nodes/<node-name>/stats/summary); the aggregated data is held in memory and exposed in metric-api form.
Metrics server reuses api-server libraries to implement features such as authentication and versioning. To keep the data in memory, it drops the default etcd storage and introduces an in-memory store (an implementation of the Storage interface). Since the data lives in memory, the monitoring data is not persisted, though it can be extended with third-party storage, just as with heapster.

With Metrics server in place, the new Kubernetes monitoring architecture looks like the figure above
- Core pipeline (black): the core metrics Kubernetes needs to function, fetched from the Kubelet, cAdvisor and so on, and served by metrics-server to the Dashboard, the HPA controller and others.
- Monitoring pipeline (blue): monitoring built on top of the core metrics; for example, Prometheus can pull core metrics from metrics-server and non-core metrics from other sources (such as the Node Exporter) and build a monitoring and alerting system on them.
Official repo: https://github.com/kubernetes-incubator/metrics-server
Usage
As mentioned above, metric-server is an extension apiserver and depends on kube-aggregator, so the relevant flags must be enabled on the apiserver:
--requestheader-client-ca-file=/etc/kubernetes/certs/proxy-ca.crt
--proxy-client-cert-file=/etc/kubernetes/certs/proxy.crt
--proxy-client-key-file=/etc/kubernetes/certs/proxy.key
--requestheader-allowed-names=aggregator
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
After a successful installation, the api is available at:
root@localhost:~/work/metrics # curl --cacert /opt/kubernetes/ssl/ca.pem --key /opt/kubernetes/ssl/apiserver-key.pem --cert /opt/kubernetes/ssl/apiserver.pem https://10.10.3.139:6443/apis/metrics.k8s.io/v1beta1
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "metrics.k8s.io/v1beta1",
"resources": [
{
"name": "nodes",
"singularName": "",
"namespaced": false,
"kind": "NodeMetrics",
"verbs": [
"get",
"list"
]
},
{
"name": "pods",
"singularName": "",
"namespaced": true,
"kind": "PodMetrics",
"verbs": [
"get",
"list"
]
}
]
}#
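Consumers of this API (kubectl top, the HPA) read quantities like `250m` of CPU or `128Mi` of memory. A hedged Python sketch of the common-case parsing, covering only the usual suffixes rather than the full Kubernetes quantity grammar:

```python
def parse_cpu(q):
    """Parse a CPU quantity into cores: '250m' -> 0.25, '2' -> 2.0."""
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)

def parse_memory(q):
    """Parse a memory quantity into bytes for the common binary suffixes."""
    units = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}
    for suffix, factor in units.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)  # plain bytes

print(parse_cpu("250m"), parse_memory("128Mi"))
```

Real clients use the full quantity parser (which also handles decimal suffixes like M and G and exponent forms); this sketch only shows the shape of the data NodeMetrics/PodMetrics return.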
The Metrics Server's resource usage grows as the number of Pods in the cluster grows, so it needs addon-resizer to scale its container vertically. addon-resizer scales Metrics Server linearly with the number of nodes in the cluster, ensuring it can keep serving the complete metrics API. Details: see the linked reference
1) cAdvisor is Google's open-source container-monitoring service, already integrated into k8s (the data-collection agent)
Deploy the cAdvisor container on a host and you can access very detailed performance data for the node and its containers (CPU, memory, network, disk, filesystem and so on) through the web UI or the REST service.
By default cAdvisor caches its data in memory, with limited display capability; it also supports several persistence backends and can save aggregated monitoring data to Google BigQuery, InfluxDB or Redis.
In newer Kubernetes versions, the cAdvisor functionality has been integrated into the kubelet component
Note that the cAdvisor web UI only shows containers on the current machine; other machines require visiting each node's own URL. That is workable with a few nodes but painful with many, so the cAdvisor data needs to be aggregated and displayed, which is what the "cadvisor+influxdb+grafana" combination is for
To enable cAdvisor monitoring, simply enable cAdvisor in the kubelet command line and set the related options
cAdvisor is Kubernetes's well-known monitoring agent. It runs on every kubernetes Node and collects monitoring data for the host and its containers (cpu, memory, filesystem, network, uptime). In newer versions, K8S has folded cAdvisor into the kubelet component, and each Node can be browsed directly over the web, e.g. http://10.10.3.119:4194/
2) Heapster is a container-cluster monitoring and performance-analysis tool with native support for Kubernetes and CoreOS. But Heapster has retired! (data aggregation)
Heapster is an aggregator: it collects the cAdvisor data from each Node, sums it up, and can also group resources by kubernetes type, such as Pod and Namespace, yielding CPU, memory, network and disk metrics for each. The default metric aggregation interval is 1 minute. Data can also be exported to third-party tools (such as InfluxDB).
The native Kubernetes dashboard's monitoring charts come from heapster. Horizontal Pod Autoscaling also used Heapster: the HPA treated it as the Resource Metrics API and fetched metrics from it.
3) InfluxDB is an open-source time-series database. (data storage)
4) grafana is an open-source data-display tool. (data display)
https://github.com/kubernetes/kubernetes/tree/v1.18.8/cluster/addons/metrics-server
https://yasongxu.gitbook.io/container-monitor/qi-.-zui-jia-shi-jian/ye-nei-fang-an
https://www.bookstack.cn/read/kubernetes-handbook/appendix-docker-best-practice.md
The Metrics API only serves current readings; it does not keep historical data.
metrics-server periodically scrapes metrics from each kubelet's Summary API (similar to /api/v1/nodes/<nodename>/stats/summary), aggregates them in memory, and exposes them in metrics-API form.
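The shape of that Summary API payload can be illustrated with an inlined sample (the JSON below is invented and heavily trimmed; real responses also carry per-pod and per-container sections). Here `sed` pulls out the node's memory usage without needing `jq`:

```shell
# Invented, trimmed sample of a kubelet Summary API response
summary='{"node":{"nodeName":"k8s-node1","memory":{"usageBytes":1234567890}}}'
# Extract usageBytes (assumes the field appears once in the sample)
echo "$summary" | sed -n 's/.*"usageBytes":\([0-9]*\).*/\1/p'
```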
Kubernetes resource-metrics tool: metrics-server. Custom-metrics monitoring tools: Prometheus plus k8s-prometheus-adapter.
metrics-server is the aggregator for the cluster's core monitoring data. Early Kubernetes versions monitored resources with Heapster, but since Kubernetes 1.8, resource-usage metrics such as container CPU and memory are obtained through the Metrics API. Users can access these metrics directly, for example with kubectl top, or through controllers in the cluster. The apiserver persists all of its data to etcd, so Kubernetes itself clearly cannot absorb collection at this frequency; and because monitoring data changes quickly and is inherently transient, a dedicated component is needed to handle it.
Prometheus can collect resource metrics across many dimensions: CPU utilization, number of network connections, packet send and receive rates, process creation and reaping rates, and many more. Early Kubernetes did not support such metrics, so what Prometheus collects needs to be integrated into Kubernetes, letting Kubernetes decide from those metrics whether Pods should be scaled.
Prometheus thus serves both as a monitoring system and as a provider of certain special resource metrics. These metrics are not standard built-in Kubernetes metrics; they are called custom metrics. For Prometheus's collected data to be presented as such metrics, a plugin is required: k8s-prometheus-adapter. These metrics are the basis for deciding whether a Pod should be scaled, for example by CPU utilization or memory usage.
With the introduction of Prometheus and k8s-prometheus-adapter, the new-generation Kubernetes monitoring architecture takes shape.
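As a sketch of what this enables, an HPA can then scale a workload on an adapter-served custom metric. The Deployment name, metric name, and threshold below are all illustrative and assume k8s-prometheus-adapter exposes a Prometheus-derived `http_requests` metric (autoscaling/v2beta2 matches the 1.18-era cluster used in this document):

```yaml
# Hypothetical HPA driven by a custom metric from k8s-prometheus-adapter
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web            # illustrative target
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests   # assumes the adapter serves this metric
      target:
        type: AverageValue
        averageValue: "100"
```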
The new-generation Kubernetes monitoring architecture
- Core metrics pipeline: composed of the kubelet, metrics-server, and the API exposed by the apiserver; it covers cumulative CPU usage, real-time memory usage, Pod resource usage, and container disk usage.
- Monitoring pipeline: collects all kinds of metrics from the system and serves them to end users, storage systems, and the HPA. It includes the core metrics plus many non-core metrics. Kubernetes cannot parse non-core metrics by itself, so k8s-prometheus-adapter is needed to convert Prometheus's collected data into a format Kubernetes understands.
Core-metrics monitoring
Heapster was used previously, but it was deprecated after 1.12; its replacement is metrics-server. metrics-server is a user-developed API server that serves resource metrics, not Pods or Deployments. It is not part of Kubernetes itself but a Pod hosted on the cluster. For users to consume metrics-server's API seamlessly, the new architecture composes the servers: an aggregator fronts both the Kubernetes apiserver and metrics-server, and the metrics are fetched from the group /apis/metrics.k8s.io/v1beta1, as shown below.

之后如果用户还有其他的api server都可以整合进aggregator,由aggregator来提供服务,如图

Prometheus overview
Beyond the resource metrics above (CPU, memory), users and administrators need many more metrics: Kubernetes metrics, container metrics, node resource metrics, application metrics, and so on. The custom-metrics API allows arbitrary metrics to be requested, but any implementation of that API must target a specific backend monitoring system, and Prometheus was the first monitoring system to develop such an adapter. This Kubernetes Custom Metrics Adapter for Prometheus comes from the k8s-prometheus-adapter project on GitHub. Its architecture is shown below:
Prometheus is itself a monitoring system, split into a server side and an agent side. The server pulls data from the monitored hosts, while on the agent side a node_exporter is deployed, mainly to collect and expose node data; gathering Pod-level data or metrics from applications such as MySQL likewise requires deploying the corresponding exporters. The data can be queried with PromQL, but since Prometheus is a third-party solution, native Kubernetes cannot parse Prometheus's custom metrics; k8s-prometheus-adapter is needed to convert those metric query interfaces into standard Kubernetes custom metrics.
Prometheus is an open-source service-monitoring system and time-series database that provides a general data model plus convenient interfaces for data collection, storage, and query. Its core component, the Prometheus server, periodically pulls data from statically configured targets or from targets discovered automatically via service discovery; when newly pulled data exceeds the configured in-memory buffer, it is persisted to storage. The component architecture is shown below: (figure: Prometheus component architecture)
As the figure shows, each monitored host exposes its monitoring data through a dedicated exporter and waits for the Prometheus server's periodic scrapes. If alerting rules exist, the scraped data is evaluated against them; when an alert condition is met, an alert is generated and sent to Alertmanager for aggregation and routing. When a monitored target needs to push data proactively, the Pushgateway component can receive and temporarily store the data until the Prometheus server collects it.
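What an exporter actually serves on its /metrics endpoint is plain text in the Prometheus exposition format. A minimal hand-written sample (the metric name and value here are invented for illustration):

```shell
# Print a minimal /metrics payload in the Prometheus text exposition format.
# Exporters such as node_exporter serve exactly this kind of text over HTTP.
cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21
EOF
```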
Any monitored target must first be registered with the monitoring system before time-series collection, storage, alerting, and display can happen. Targets can be specified statically in configuration, or Prometheus can manage them dynamically through its service-discovery mechanism. The main components:
- Monitoring agents such as node_exporter: collect host metrics, e.g. load average, CPU, memory, disk, and network, across many dimensions.
- kubelet (cAdvisor): collects container metrics, which are also Kubernetes' core metrics; per-container data includes CPU usage and limits, filesystem read/write limits, memory usage and limits, and network packet send/receive/drop rates.
- API Server: collects apiserver performance metrics, including control-queue performance, request rates, and latencies.
- etcd: collects metrics from the etcd storage cluster.
- kube-state-metrics: derives many Kubernetes-related metrics, mainly resource-type counters and metadata, including object counts per type, resource quotas, container states, and Pod resource labels.
Prometheus can use the Kubernetes API Server directly as its service-discovery system to dynamically discover and monitor every monitorable object in the cluster. Note in particular that Pod resources must carry the following annotations for Prometheus to discover them automatically and scrape their built-in metrics:
- 1) prometheus.io/scrape: whether this target's metrics should be scraped; boolean, true or false.
- 2) prometheus.io/path: the URL path to scrape metrics from, usually /metrics.
- 3) prometheus.io/port: the port to scrape metrics on, e.g. 8080.
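A Pod carrying these annotations might look like the sketch below (the Pod name, image, and port are illustrative; the annotations only take effect when the Prometheus scrape configuration relabels on them, which is the usual Kubernetes convention):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                      # illustrative name
  annotations:
    prometheus.io/scrape: "true"      # opt in to scraping
    prometheus.io/path: "/metrics"    # where the metrics are served
    prometheus.io/port: "8080"        # which port to scrape
spec:
  containers:
  - name: demo-app
    image: demo-app:latest            # illustrative image
    ports:
    - containerPort: 8080
```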
Also, if Prometheus is only needed to produce custom metrics for a backend, deploying the Prometheus server alone suffices; it does not even need data persistence. For a fully featured monitoring system, however, administrators also need to deploy node_exporter on every host, other specialized exporters as required, and Alertmanager.
https://www.cnblogs.com/linuxk/p/10582534.html
https://yasongxu.gitbook.io/container-monitor/yi-.-kai-yuan-fang-an/di-1-zhang-cai-ji/custom-metrics
Check the cluster's default api-versions; the metrics.k8s.io group is not present:
(screenshot: kubectl api-versions output without metrics.k8s.io)
Once metrics-server is deployed, running api-versions again shows the metrics.k8s.io group.
To deploy metrics-server, go into the kubernetes project under cluster/addons, find the corresponding directory, and download it.
(screenshot: the metrics-server addon directory)
The metrics-server certificate
# Note: "CN" must be exactly "system:metrics-server", because this name is used later when granting permissions; otherwise requests are rejected as forbidden anonymous access
$ cat > metrics-server-csr.json <<EOF
{
"CN": "system:metrics-server",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "system"
}
]
}
EOF
Generate the metrics-server certificate and private key:
$ cfssl gencert -ca=/opt/kubernetes/ssl/ca.pem -ca-key=/opt/kubernetes/ssl/ca-key.pem -config=/opt/kubernetes/ssl/ca-config.json -profile=kubernetes metrics-server-csr.json | cfssljson -bare metrics-server
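You can double-check the CN baked into the issued certificate with openssl. The sketch below generates a throwaway self-signed certificate with the same CN so the command is runnable anywhere; on the real host you would point `openssl x509` at the metrics-server.pem produced above instead:

```shell
# Create a throwaway cert with the expected CN (stand-in for metrics-server.pem)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=system:metrics-server" \
  -keyout /tmp/ms-key.pem -out /tmp/ms.pem 2>/dev/null
# Print the subject; it must contain system:metrics-server
openssl x509 -in /tmp/ms.pem -noout -subject
```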
1. In metrics-server-deployment.yaml, add --kubelet-insecure-tls to the metrics-server command so the kubelet's serving certificate is not verified, and comment out port 10255; with it commented out, port 10250 is used and communication goes over HTTPS.
In the addon-resizer command, write concrete values for cpu, memory, and extra-memory, and comment out minClusterSize={{ metrics_server_min_cluster_size }}.
vim metrics-server-deployment.yaml


vim resource-reader.yaml

root@localhost:~/work/metrics # cat auth-delegator.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: metrics-server:system:auth-delegator
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
cat auth-reader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: metrics-server-auth-reader
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
root@localhost:~/work/metrics # cat metrics-apiservice.yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
name: v1beta1.metrics.k8s.io
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
service:
name: metrics-server
namespace: kube-system
group: metrics.k8s.io
version: v1beta1
insecureSkipTLSVerify: true
groupPriorityMinimum: 100
versionPriority: 100
root@localhost:~/work/metrics # cat metrics-server-deployment.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: metrics-server
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: ConfigMap
metadata:
name: metrics-server-config
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: EnsureExists
data:
NannyConfiguration: |-
apiVersion: nannyconfig/v1alpha1
kind: NannyConfiguration
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server-v0.3.6
namespace: kube-system
labels:
k8s-app: metrics-server
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
version: v0.3.6
spec:
selector:
matchLabels:
k8s-app: metrics-server
version: v0.3.6
template:
metadata:
name: metrics-server
labels:
k8s-app: metrics-server
version: v0.3.6
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
hostNetwork: true
priorityClassName: system-cluster-critical
serviceAccountName: metrics-server
nodeSelector:
kubernetes.io/os: linux
containers:
- name: metrics-server
image: harbor.superred.com/kubernetes/metrics-server-amd64:v0.3.6
command:
- /metrics-server
- --metric-resolution=30s
- --requestheader-allowed-names=aggregator
# These are needed for GKE, which doesn't support secure communication yet.
# Remove these lines for non-GKE clusters, and when GKE supports token-based auth.
#- --kubelet-port=10255
#- --deprecated-kubelet-completely-insecure=true
- --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
- --kubelet-insecure-tls
ports:
- containerPort: 443
name: https
protocol: TCP
- name: metrics-server-nanny
image: harbor.superred.com/kubernetes/addon-resizer:1.8.11
resources:
limits:
cpu: 100m
memory: 300Mi
requests:
cpu: 5m
memory: 50Mi
env:
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: MY_POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: metrics-server-config-volume
mountPath: /etc/config
command:
- /pod_nanny
- --config-dir=/etc/config
#- --cpu={{ base_metrics_server_cpu }}
- --cpu=80m
- --extra-cpu=0.5m
#- --memory={{ base_metrics_server_memory }}
- --memory=80Mi
#- --extra-memory={{ metrics_server_memory_per_node }}Mi
- --extra-memory=8Mi
- --threshold=5
- --deployment=metrics-server-v0.3.6
- --container=metrics-server
- --poll-period=300000
- --estimator=exponential
# Specifies the smallest cluster (defined in number of nodes)
# resources will be scaled to.
#- --minClusterSize={{ metrics_server_min_cluster_size }}
volumes:
- name: metrics-server-config-volume
configMap:
name: metrics-server-config
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
root@localhost:~/work/metrics # cat metrics-server-service.yaml
apiVersion: v1
kind: Service
metadata:
name: metrics-server
namespace: kube-system
labels:
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/cluster-service: "true"
kubernetes.io/name: "Metrics-server"
spec:
selector:
k8s-app: metrics-server
ports:
- port: 443
protocol: TCP
targetPort: https
root@localhost:~/work/metrics # cat resource-reader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: system:metrics-server
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
- ""
resources:
- pods
- nodes
- namespaces
- nodes/stats
verbs:
- get
- list
- watch
- apiGroups:
- "apps"
resources:
- deployments
verbs:
- get
- list
- update
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: system:metrics-server
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:metrics-server
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
kubectl create -f .

Enable the aggregation-layer API. 1) Edit the kube-apiserver startup configuration file on the master:
root@localhost:~/work/metrics # cat /opt/kubernetes/cfg/kube-apiserver.conf
KUBE_APISERVER_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/opt/kubernetes/logs \
--etcd-servers=https://10.10.3.139:2379,https://10.10.3.197:2379,https://10.10.3.179:2379 \
--bind-address=10.10.3.139 \
--secure-port=6443 \
--advertise-address=10.10.3.139 \
#--advertise-address=10.244.1.1 \
--allow-privileged=true \
--service-cluster-ip-range=10.0.0.0/24 \
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \
--authorization-mode=RBAC,Node \
--enable-bootstrap-token-auth=true \
--token-auth-file=/opt/kubernetes/cfg/token.csv \
--service-node-port-range=1024-65535 \
--kubelet-client-certificate=/opt/kubernetes/ssl/apiserver.pem \
--kubelet-client-key=/opt/kubernetes/ssl/apiserver-key.pem \
--tls-cert-file=/opt/kubernetes/ssl/apiserver.pem \
--tls-private-key-file=/opt/kubernetes/ssl/apiserver-key.pem \
--client-ca-file=/opt/kubernetes/ssl/ca.pem \
--service-account-key-file=/opt/kubernetes/ssl/ca-key.pem \
--etcd-cafile=/opt/etcd/ssl/ca.pem \
--etcd-certfile=/opt/etcd/ssl/etcd.pem \
--etcd-keyfile=/opt/etcd/ssl/etcd-key.pem \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/opt/kubernetes/logs/k8s-audit.log \
--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \
#--requestheader-allowed-names=aggregator \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--proxy-client-cert-file=/opt/kubernetes/ssl/metrics-proxy.pem \
--proxy-client-key-file=/opt/kubernetes/ssl/metrics-proxy-key.pem \
--enable-aggregator-routing=true"
Append the following configuration.
Note: the masters here do not run kube-proxy, so --enable-aggregator-routing=true must be added.
# the value of --requestheader-allowed-names matches the certificate CN
#--runtime-config=api/all=true \
--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \
#--requestheader-allowed-names=aggregator \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--proxy-client-cert-file=/opt/kubernetes/ssl/metrics-proxy.pem \
--proxy-client-key-file=/opt/kubernetes/ssl/metrics-proxy-key.pem \
--enable-aggregator-routing=true"
2) Edit the master's kube-controller-manager.service (I did not change this):
#vi /usr/lib/systemd/system/kube-controller-manager.service
--horizontal-pod-autoscaler-use-rest-clients=true
Check node and Pod metrics
If you run into a permissions problem, grant the broadest permission first to verify the setup, then scope the permissions down as appropriate.
root@localhost:~/work/metrics # kubectl top node
Error from server (Forbidden): nodes.metrics.k8s.io is forbidden: User "metrics-proxy" cannot list resource "nodes" in API group "metrics.k8s.io" at the cluster scope
root@localhost:~/work/metrics # kubectl create clusterrolebinding wubo:metrics-proxy --clusterrole=cluster-admin --user=metrics-proxy
clusterrolebinding.rbac.authorization.k8s.io/wubo:metrics-proxy created
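Binding to cluster-admin works but grants far more than needed. A narrower alternative (a sketch, assuming the metrics-proxy user only needs read access to metrics.k8s.io resources; role and binding names are illustrative) would be:

```yaml
# Minimal read-only access to the metrics API for the metrics-proxy user
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["nodes", "pods"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-proxy-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-reader
subjects:
- kind: User
  name: metrics-proxy
  apiGroup: rbac.authorization.k8s.io
```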

At this point, the metrics-server deployment is complete.
Prometheus
wget https://raw.githubusercontent.com/giantswarm/prometheus/master/manifests-all.yaml
root@localhost:~/work/jiankong/prometheus # cat manifests-all.yaml | grep image
image: harbor.superred.com/kubernetes/prometheus/alertmanager:v0.7.1 (alerting)
image: harbor.superred.com/kubernetes/grafana/grafana:4.2.0 (dashboards)
image: harbor.superred.com/kubernetes/giantswarm/tiny-tools
image: harbor.superred.com/kubernetes/prom/prometheus:v1.7.0 (scrapes the data; best used together with InfluxDB)
image: harbor.superred.com/kubernetes/google_containers/kube-state-metrics:v0.5.0 (state information about Kubernetes master objects)
image: harbor.superred.com/kubernetes/dockermuenster/caddy:0.9.3
image: harbor.superred.com/kubernetes/prom/node-exporter:v0.14.0 (physical-node metrics)
Every 2.0s: kubectl get pod,svc,ingress -o wide --all-namespaces Thu Sep 10 16:46:02 2020
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/coredns-85b4878f78-t4jvl 1/1 Running 0 13d 10.244.1.4 10.10.3.197 <none> <none>
kube-system pod/kube-flannel-ds-amd64-4b69t 1/1 Running 0 13d 10.10.3.197 10.10.3.197 <none> <none>
kube-system pod/kube-flannel-ds-amd64-pnr7b 1/1 Running 0 13d 10.10.3.179 10.10.3.179 <none> <none>
kube-system pod/metrics-server-v0.3.6-7444db4cbb-4zvn5 2/2 Running 0 26h 10.10.3.197 10.10.3.197 <none> <none>
kubernetes-dashboard pod/dashboard-metrics-scraper-775b89678b-x4qp6 1/1 Running 0 37h 10.244.1.27 10.10.3.197 <none> <none>
kubernetes-dashboard pod/kubernetes-dashboard-66d54d4cd7-6z64c 1/1 Running 0 37h 10.244.1.26 10.10.3.197 <none> <none>
monitoring pod/alertmanager-db5fbcfd5-ssq8c 1/1 Running 0 22h 10.244.0.33 10.10.3.179 <none> <none>
monitoring pod/grafana-core-d5fcd78b-fdw2h 1/1 Running 0 22h 10.244.0.38 10.10.3.179 <none> <none>
monitoring pod/grafana-import-dashboards-klqgp 0/1 Init:0/1 0 22h 10.244.0.34 10.10.3.179 <none> <none>
monitoring pod/kube-state-metrics-886f8f84d-l9sxb 1/1 Running 0 22h 10.244.1.37 10.10.3.197 <none> <none>
monitoring pod/node-directory-size-metrics-d8c7q 2/2 Running 0 22h 10.244.1.36 10.10.3.197 <none> <none>
monitoring pod/node-directory-size-metrics-zt5mq 2/2 Running 0 22h 10.244.0.36 10.10.3.179 <none> <none>
monitoring pod/prometheus-core-6b9687f87d-ng9tl 1/1 Running 1 22h 10.244.1.38 10.10.3.197 <none> <none>
monitoring pod/prometheus-node-exporter-4cs9h 1/1 Running 0 22h 10.10.3.197 10.10.3.197 <none> <none>
monitoring pod/prometheus-node-exporter-fnld5 1/1 Running 0 22h 10.10.3.179 10.10.3.179 <none> <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 13d <none>
default service/nginx-test LoadBalancer 10.0.0.130 <pending> 80:62931/TCP 13d run=nginx-test
kube-system service/kube-dns ClusterIP 10.0.0.2 <none> 53/UDP,53/TCP,9153/TCP 13d k8s-app=kube-dns
kube-system service/metrics-server ClusterIP 10.0.0.109 <none> 443/TCP 26h k8s-app=metrics-server
kubernetes-dashboard service/dashboard-metrics-scraper NodePort 10.0.0.135 <none> 8000:30002/TCP 13d k8s-app=dashboard-metrics-scraper
kubernetes-dashboard service/kubernetes-dashboard NodePort 10.0.0.89 <none> 443:30001/TCP 13d k8s-app=kubernetes-dashboard
monitoring service/alertmanager NodePort 10.0.0.57 <none> 9093:49093/TCP 22h app=alertmanager
monitoring service/grafana NodePort 10.0.0.113 <none> 3000:43000/TCP 22h app=grafana,component=core
monitoring service/kube-state-metrics NodePort 10.0.0.66 <none> 8080:48080/TCP 22h app=kube-state-metrics
monitoring service/prometheus NodePort 10.0.0.10 <none> 9090:49090/TCP 22h app=prometheus,component=core
monitoring service/prometheus-node-exporter ClusterIP None <none> 9100/TCP 22h app=prometheus,component=node-exporter
https://www.cnblogs.com/cuishuai/p/9857120.html
https://blog.csdn.net/qq_37242520/article/details/107389230
https://blog.csdn.net/weixin_39974140/article/details/100664153
https://blog.csdn.net/ht9999i/article/details/107773164
https://www.cnblogs.com/you-men/p/13192086.html
https://www.cnblogs.com/lizhenliang/p/13025158.html#%E5%85%AD%E3%80%81%E9%83%A8%E7%BD%B2dashboard%E5%92%8Ccoredns
https://www.jianshu.com/p/834eb6ff2a72
https://aronligithub.github.io/2018/08/11/kubernetes%E7%B3%BB%E5%88%97%E4%BB%A5%E5%8F%8A%E8%BF%90%E7%BB%B4%E5%BC%80%E5%8F%91%E6%96%87%E7%AB%A0%E4%BB%8B%E7%BB%8D/
https://blog.51cto.com/13812615/2509816
https://blog.csdn.net/wangmiaoyan/article/details/102868728
=========================================================================
Part 2
High-availability architecture
https://blog.51cto.com/13812615/2509816
As a container cluster system, Kubernetes achieves application-level high availability through health checks plus restart policies for Pod self-healing, through scheduling algorithms that spread Pods across nodes while maintaining the desired replica count, and by automatically re-launching Pods on other nodes when a node fails.
For the Kubernetes cluster itself, high availability involves two further layers: the etcd database and the Kubernetes master components. We have already made etcd highly available with a 3-node cluster; this section explains and implements high availability for the master nodes.
The master node acts as the control center, maintaining the health of the whole cluster by continuously communicating with the kubelet and kube-proxy on the worker nodes. If the master fails, no cluster management is possible via kubectl or the API.
A master runs three main services: kube-apiserver, kube-controller-manager, and kube-scheduler. kube-controller-manager and kube-scheduler already achieve high availability on their own through leader election, so master HA is mainly about the kube-apiserver component. Since it serves an HTTP API, making it highly available is similar to any web server: put a load balancer in front of it, and it can also be scaled horizontally.
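For the L4 load balancing described above, an Nginx `stream` block along the following lines would do. This is a sketch: the master IPs come from the plan in this document, while the listen port and health-check parameters are illustrative:

```nginx
# L4 (TCP) load balancing for kube-apiserver across the three masters
stream {
    upstream kube_apiserver {
        least_conn;
        server 10.10.3.139:6443 max_fails=3 fail_timeout=30s;
        server 10.10.3.174:6443 max_fails=3 fail_timeout=30s;
        server 10.10.3.185:6443 max_fails=3 fail_timeout=30s;
    }
    server {
        listen 16443;              # VIP-facing port (illustrative)
        proxy_pass kube_apiserver;
    }
}
```

Keepalived then floats the VIP between the master and backup load balancers, as in the plan.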
Multi-master architecture diagram:
(figure: multi-master architecture)
Scale-out procedure
New host: centos7-node7, in the k8s-master2/3 role from the plan above.

| Role | IP | Components |
|---|---|---|
| k8s-master2 | 10.10.3.174 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd |
| k8s-master3 | 10.10.3.185 | kube-apiserver, kube-controller-manager, kube-scheduler, etcd |