Kubernetes HA Cluster Setup and Functional Testing
- With an ESXi host available, I can spin up plenty of VMs for testing. I had not worked with Kubernetes much before; mostly I had just helped colleagues troubleshoot.
- This post records the problems I ran into while building a 6-node Kubernetes cluster with kubeadm, so I can come back to it later.
- The cluster is built in a highly available configuration, with ceph-rbd attached through ceph-csi and MetalLB providing load balancing. I hit quite a few issues along the way and learned a lot.
- Many images and YAML manifests are hosted on Google registries and need to be switched over to a domestic mirror.
- Newer Kubernetes releases use systemd-managed cgroup v2, so the configuration differs from older guides.
- Newer Kubernetes releases have dropped dockerd in favor of containerd, managed with crictl.
- Errors came up frequently while wiring up ceph-csi and deploying ingress; kubectl logs/describe is the tool for analyzing them. A plain kubectl delete -f xxx.yaml cleans everything up; if it hangs, remove the pod manually on the node. Redeploying is painless (see the sketch after this list).
- Calico and ingress rely on IPVS and nginx for forwarding; a solid grasp of Linux networking makes them much easier to understand.
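To make that debug-and-redeploy loop concrete, here is a rough sketch of the commands involved; the pod names, namespaces, and container IDs are placeholders, not values from this cluster.

```
# Inspect a misbehaving pod (names/namespaces are placeholders)
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous

# Tear down everything created by a manifest and redeploy
kubectl delete -f xxx.yaml
kubectl apply -f xxx.yaml

# If a pod is stuck terminating, force-delete it, or clean the
# container up directly on the node with crictl
kubectl delete pod <pod-name> -n <namespace> --force --grace-period=0
crictl ps -a | grep <pod-name>
crictl rm -f <container-id>
```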
Environment
- All Kubernetes nodes run Debian GNU/Linux 12 (bookworm), kernel 6.1.0-18-amd64.
- The storage side is simplified to a single node; since upstream reports that the latest Ceph release has only been tested on Ubuntu, it runs Ubuntu 22.04.4 LTS, kernel 5.15.0-101-generic.
- Container images took up more of the system disk than expected, and /home had been split into its own partition, which left the / partition short on space. I later grew the / partition online with LVM (see the sketch after the host table below).
- The Ceph deployment steps are not repeated here; see the earlier article.
- Image pulls failed regularly during deployment; replace the Google images in the official YAML files with registry.aliyuncs.com/google_containers.
| Hostname | IP | CPU/RAM | Disks |
|---|---|---|---|
| k8s-master1 | 192.168.3.151 | 4C8G | 2 * 50G |
| k8s-master2 | 192.168.3.152 | 4C8G | 2 * 50G |
| k8s-master3 | 192.168.3.153 | 4C8G | 2 * 50G |
| k8s-node1 | 192.168.3.154 | 4C8G | 2 * 50G |
| k8s-node2 | 192.168.3.155 | 4C8G | 2 * 50G |
| k8s-node3 | 192.168.3.156 | 4C8G | 2 * 50G |
| ceph1 | 192.168.3.160 | 4C8G | 1 * 50G + 1 * 35G + 3 * 500G |
| keepalived VIP | 192.168.3.180/24 | - | - |
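For reference, the LVM expansion mentioned above boils down to something like the following. This is a minimal sketch that assumes the root filesystem sits on an LVM logical volume; the PV device and VG/LV names here are hypothetical and need to be matched to the actual host.

```
# After enlarging the virtual disk, grow the partition/PV
# (device and VG/LV names are hypothetical)
pvresize /dev/sda3
# Extend the root LV by 20G and grow the filesystem online (-r calls fsadm)
lvextend -r -L +20G /dev/mapper/debian--vg-root
df -h /
```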
Base environment configuration (run on all 6 nodes)
- Reference links:
- Basic system setup:
- Configure static IPs on every node, set the hostnames, and add hosts entries so names resolve locally.
- Set up passwordless SSH from k8s-master1 to all 6 nodes.
- Disable swap. Kubernetes does not allow swap by default, and kubeadm init will fail otherwise.
```
# Disable swap temporarily
swapoff -a
# Disable permanently: comment out the swap mount in /etc/fstab
sed -ri 's/.*swap.*/#&/' /etc/fstab
# Note: on a cloned VM, also remove the UUID line
```
- Adjust kernel parameters.
```
# 1. Load the br_netfilter module
modprobe br_netfilter
# 2. Verify the module is loaded
lsmod | grep br_netfilter
# 3. Set kernel parameters
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
# 4. Apply the new parameters
sysctl -p /etc/sysctl.d/k8s.conf

# Create the ipvs.modules file
vim /etc/sysconfig/modules/ipvs.modules
#!/bin/bash
ipvs_modules="ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr ip_vs_lblc ip_vs_lblcr ip_vs_dh ip_vs_sh ip_vs_nq ip_vs_sed ip_vs_ftp nf_conntrack"
for kernel_module in ${ipvs_modules}; do
    /sbin/modinfo -F filename ${kernel_module} > /dev/null 2>&1
    if [ $? -eq 0 ]; then
        /sbin/modprobe ${kernel_module}
    fi
done

# Run the script
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep ip_vs
ip_vs_ftp              13079  0
nf_nat                 26787  1 ip_vs_ftp
ip_vs_sed              12519  0
ip_vs_nq               12516  0
ip_vs_sh               12688  0
ip_vs_dh               12688  0
ip_vs_lblcr            12922  0
ip_vs_lblc             12819  0
ip_vs_wrr              12697  0
ip_vs_rr               12600  0
ip_vs_wlc              12519  0
ip_vs_lc               12516  0
ip_vs                 141092  22 ip_vs_dh,ip_vs_lc,ip_vs_nq,ip_vs_rr,ip_vs_sh,ip_vs_ftp,ip_vs_sed,ip_vs_wlc,ip_vs_wrr,ip_vs_lblcr,ip_vs_lblc
nf_conntrack          133387  2 ip_vs,nf_nat
libcrc32c              12644  4 xfs,ip_vs,nf_nat,nf_conntrack
```
- Add the kernel settings to the boot sequence.
```
# Configure the rc.local boot script
cat > /etc/rc.local <<EOF
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
#
bash /root/bindip.sh
#exit 0
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
bash /etc/sysconfig/modules/ipvs.modules
EOF
chmod +x /etc/rc.local

# Put rc.local under systemd management
cat > /etc/systemd/system/rc-local.service <<EOF
[Unit]
Description=/etc/rc.local
ConditionPathExists=/etc/rc.local

[Service]
Type=forking
ExecStart=/etc/rc.local start
TimeoutSec=0
StandardOutput=tty
RemainAfterExit=yes
SysVStartPriority=99

[Install]
WantedBy=multi-user.target
EOF

# Enable it at boot
systemctl enable rc-local.service
```
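Before relying on a reboot, it may be worth checking that the rc-local unit actually runs and the modules and sysctls stick; a quick sanity check could look like this:

```
systemctl daemon-reload
systemctl start rc-local.service
systemctl status rc-local.service --no-pager
# Confirm the modules and sysctls are in place
lsmod | grep -e br_netfilter -e ip_vs
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables
```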
- Install base packages.
Remove any unofficially installed Docker first, then install the official release; otherwise there will be package conflicts.
```
apt install -y lvm2 wget net-tools nfs-common libnfs-utils lrzsz gcc make cmake libxml2-dev curl unzip sudo ntp libaio-dev wget vim ncurses-dev autoconf automake openssh-server socat ipvsadm conntrack ntpdate telnet rsync
```
```
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do apt-get remove $pkg; done
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
systemctl enable docker --now
```
- Install kubeadm, kubelet, and kubectl from the Aliyun mirror.
```
apt-get install -y apt-transport-https software-properties-common
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
add-apt-repository "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main"
apt-get update
apt-get install -y kubelet kubeadm kubectl
```
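Optionally, pinning the three packages keeps a routine apt upgrade from bumping the cluster version out from under you; a small sketch:

```
# Hold the versions so unattended upgrades don't move the cluster
apt-mark hold kubelet kubeadm kubectl
# Verify what got installed
kubeadm version
kubectl version --client
kubelet --version
```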
Deploying nginx and keepalived (run on master1 and master2 only)
- Make the API server endpoint highly available.
```
apt install nginx keepalived -y
apt install libnginx-mod-stream/stable
vim /etc/nginx/nginx.conf             # full config below
nginx -t
systemctl start nginx
systemctl enable nginx
vim /etc/keepalived/keepalived.conf   # full config below
vim /etc/keepalived/check_nginx.sh    # full script below
chmod +x /etc/keepalived/check_nginx.sh
systemctl start keepalived.service
systemctl enable keepalived.service
```
```
cat /etc/nginx/nginx.conf
user www-data;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;
load_module /usr/lib/nginx/modules/ngx_stream_module.so;

events {
    worker_connections 1024;
}

stream {
    log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log /var/log/nginx/k8s-access.log main;

    upstream k8s-apiserver {
        server 192.168.3.151:6443;   # k8s-master1 APISERVER IP:PORT
        server 192.168.3.152:6443;   # k8s-master2 APISERVER IP:PORT
    }

    server {
        listen 16443;
        proxy_pass k8s-apiserver;
    }
}

http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    server {
        listen 80 default_server;
        server_name _;

        location / {
        }
    }
}
```
```
# master node configuration
cat /etc/keepalived/keepalived.conf
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state MASTER
    interface ens192
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.3.180/24
    }
    track_script {
        check_nginx
    }
}

# backup node configuration
cat /etc/keepalived/keepalived.conf
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_BACKUP
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens192
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.3.180/24
    }
    track_script {
        check_nginx
    }
}
```
```
# on both the master and backup nodes
cat /etc/keepalived/check_nginx.sh
#!/bin/bash
count=$(ps -ef | grep nginx | grep sbin | egrep -cv "grep|$$")
if [ "$count" -eq 0 ];then
    systemctl stop keepalived
fi
```
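A quick way to confirm the VIP and the failover behave as expected (interface name ens192 as configured above):

```
# On the current MASTER, the VIP should be bound to ens192
ip addr show ens192 | grep 192.168.3.180

# Simulate an nginx failure on the MASTER; check_nginx.sh then stops keepalived
systemctl stop nginx
# The VIP should now show up on the BACKUP node
ip addr show ens192 | grep 192.168.3.180

# Restore the MASTER
systemctl start nginx && systemctl start keepalived
```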
Initializing the Kubernetes cluster with kubeadm
- Create the kubeadm-config.yaml file.
```
root@k8s-master1:~# cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.2
controlPlaneEndpoint: 192.168.3.180:16443
imageRepository: registry.aliyuncs.com/google_containers
apiServer:
  certSANs:
  - 192.168.3.151
  - 192.168.3.152
  - 192.168.3.153
  - 192.168.3.154
  - 192.168.3.155
  - 192.168.3.156
  - 192.168.3.180
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.10.0.0/16
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
```
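Before the actual init, the config can be sanity-checked and the control-plane images prefetched with kubeadm itself; this uses the imageRepository set above:

```
# List the images kubeadm will use with this config
kubeadm config images list --config kubeadm-config.yaml
# Pre-pull them through containerd
kubeadm config images pull --config kubeadm-config.yaml
```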
- Adjust the containerd configuration. Newer Kubernetes releases no longer use Docker to manage pods, and cgroups are now delegated to systemd.
```
containerd config default > /etc/containerd/config.toml
vim /etc/containerd/config.toml
systemctl restart containerd
systemctl enable containerd
```
```
# Two changes are needed in /etc/containerd/config.toml
# 1. Point the sandbox image at a reachable mirror
sandbox_image = "registry.cn-hangzhou.aliyuncs.com/converts/pause:3.6"
# 2. Switch the runc runtime to the systemd cgroup driver
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
```
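The same two edits can also be scripted instead of made in vim; a sketch assuming the default config.toml generated above (which ships with SystemdCgroup = false):

```
# Point the sandbox image at the mirror
sed -i 's#sandbox_image = .*#sandbox_image = "registry.cn-hangzhou.aliyuncs.com/converts/pause:3.6"#' /etc/containerd/config.toml
# Flip the runc cgroup driver to systemd
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd
# Verify both changes landed
grep -e sandbox_image -e SystemdCgroup /etc/containerd/config.toml
```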
- Pull the base images from the domestic mirror ahead of time and retag them, since the Google registries are unreachable.
```
docker pull registry.aliyuncs.com/google_containers/coredns:v1.10.1
docker pull registry.aliyuncs.com/google_containers/etcd:3.5.9-0
docker pull registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
docker pull registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2
docker pull registry.aliyuncs.com/google_containers/kube-proxy:v1.28.2
docker pull registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.2
docker pull registry.aliyuncs.com/google_containers/pause:3.9

docker tag registry.aliyuncs.com/google_containers/coredns:v1.10.1 k8s.gcr.io/coredns:v1.10.1
docker tag registry.aliyuncs.com/google_containers/etcd:3.5.9-0 k8s.gcr.io/etcd:3.5.9-0
docker tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 k8s.gcr.io/kube-apiserver:v1.28.2
docker tag registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2 k8s.gcr.io/kube-controller-manager:v1.28.2
docker tag registry.aliyuncs.com/google_containers/kube-proxy:v1.28.2 k8s.gcr.io/kube-proxy:v1.28.2
docker tag registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.2 k8s.gcr.io/kube-scheduler:v1.28.2
docker tag registry.aliyuncs.com/google_containers/pause:3.9 k8s.gcr.io/pause:3.9

ctr -n k8s.io i tag registry.aliyuncs.com/google_containers/coredns:v1.10.1 k8s.gcr.io/coredns:v1.10.1
ctr -n k8s.io i tag registry.aliyuncs.com/google_containers/etcd:3.5.9-0 k8s.gcr.io/etcd:3.5.9-0
ctr -n k8s.io i tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 k8s.gcr.io/kube-apiserver:v1.28.2
ctr -n k8s.io i tag registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.2 k8s.gcr.io/kube-controller-manager:v1.28.2
ctr -n k8s.io i tag registry.aliyuncs.com/google_containers/kube-proxy:v1.28.2 k8s.gcr.io/kube-proxy:v1.28.2
ctr -n k8s.io i tag registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.2 k8s.gcr.io/kube-scheduler:v1.28.2
ctr -n k8s.io i tag registry.aliyuncs.com/google_containers/pause:3.9 k8s.gcr.io/pause:3.9
```
- Initialize the master nodes.
```
# Run on master1
kubeadm init --config kubeadm-config.yaml

# Set up the kubectl config file
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes    # verify access works

# Add master2 and master3: copy the certificates over
cd /root && mkdir -p /etc/kubernetes/pki/etcd && mkdir -p ~/.kube/
scp /etc/kubernetes/pki/ca.crt k8s-master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key k8s-master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key k8s-master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub k8s-master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt k8s-master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key k8s-master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt k8s-master2:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key k8s-master2:/etc/kubernetes/pki/etcd/

# Generate a join token on master1
kubeadm token create --print-join-command
kubeadm join 192.168.3.180:16443 --token 8eah8v.2i66ruzqiw5d1sh5 --discovery-token-ca-cert-hash sha256:xxx --control-plane

# If deployment or scaling fails, clean up with:
kubeadm reset
iptables -F
iptables -F -t nat
ipvsadm --clear
rm .kube -r
```
- Initialize the worker nodes.
```
# Join the worker nodes
kubeadm token create --print-join-command
kubeadm join 192.168.3.180:16443 --token 8eah8v.2i66ruzqiw5d1sh5 --discovery-token-ca-cert-hash sha256:xxx

# Check status and label the nodes
kubectl get nodes    # NotReady is expected until the network plugin is installed
kubectl label node k8s-node1 node-role.kubernetes.io/worker=worker
kubectl get nodes
```
- Install the CNI plugin (Calico).
```
curl https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml -O
kubectl apply -f calico.yaml
kubectl get pods -A
```
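Once Calico is up, the nodes should flip to Ready and kube-proxy in ipvs mode should be programming IPVS virtual servers; a quick check:

```
kubectl get pods -n kube-system -l k8s-app=calico-node -o wide
kubectl get nodes            # all nodes should now be Ready
# kube-proxy in ipvs mode should have populated virtual servers
ipvsadm -Ln | head -n 20
```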
Attaching Ceph RBD to Kubernetes via ceph-csi
- Kubernetes can also mount RBD directly with a keyring or a secret. That approach works, but it is inflexible; the configuration is simple enough that it is not recorded in detail here. Reference: Ceph使用—对接K8s集群使用案例.
- I tried several different deployment guides; the latest ceph-csi differs quite a bit from earlier versions, so go straight to the official documentation on GitHub.
- Reference links:
- First, build the cephcsi plugin image (although you can simply docker pull it instead). The build kept failing because the Go toolchain is fetched from a Google-hosted source by default, so the Dockerfile needs changes.
```
# Clone the cephcsi source
git clone --depth 1 --branch v3.10.2 https://gitclone.com/github.com/ceph/ceph-csi
make image-cephcsi
```
The build uses deploy/cephcsi/image/Dockerfile and runs go/make inside a container. During the environment setup, the Dockerfile installs Go with the following shell command:
```
RUN source /build.env && \
    ( test -n "${GO_ARCH}" && exit 0; echo -e "\n\nMissing GO_ARCH argument for building image, install Golang or run: make image-cephcsi GOARCH=amd64\n\n"; exit 1 ) && \
    mkdir -p ${GOROOT} && \
    curl https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${GO_ARCH}.tar.gz | tar xzf - -C ${GOROOT} --strip-components=1
```
Since storage.googleapis.com is unreachable, the subsequent go version check fails. Replacing the URL with a domestic mirror still produced errors, so in the end I rewrote the step as a plain wget-and-untar, which builds cleanly:
```
# Install the wget tool
RUN dnf -y install wget
...
# Comment out the original step and replace it with a plain wget + tar
#RUN source /build.env && \
#    ( test -n "${GO_ARCH}" && exit 0; echo -e "\n\nMissing GO_ARCH argument for building image, install Golang or run: make image-cephcsi GOARCH=amd64\n\n"; exit 1 ) && \
#    mkdir -p ${GOROOT} && \
#    curl https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-${GO_ARCH}.tar.gz | tar xzf - -C ${GOROOT} --strip-components=1
RUN source /build.env
RUN mkdir -p ${GOROOT}
RUN wget "https://studygolang.com/dl/golang/go${GOLANG_VERSION}.linux-${GO_ARCH}.tar.gz"
RUN tar xzf go1.20.4.linux-amd64.tar.gz -C ${GOROOT} --strip-components=1
```
Or simply pull the prebuilt image:
```
docker pull quay.io/cephcsi/cephcsi:v3.10.2-amd64
docker tag quay.io/cephcsi/cephcsi:v3.10.2-amd64 quay.io/cephcsi/cephcsi:v3.10.2
```
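Note that docker pull only places the image in dockerd's own store; since the cluster runs pods on containerd, the image may also need to end up in containerd's k8s.io namespace (or be pulled there directly). A sketch of both options, under that assumption:

```
# Hand the image from dockerd over to containerd's k8s.io namespace
docker save quay.io/cephcsi/cephcsi:v3.10.2 | ctr -n k8s.io images import -
# Or skip docker entirely and pull with ctr
ctr -n k8s.io images pull quay.io/cephcsi/cephcsi:v3.10.2-amd64
# Confirm the kubelet-visible image list contains it
crictl images | grep cephcsi
```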
- Follow the official documentation for the deployment, replacing the image addresses in the YAML files as described below.
```
# Create CSIDriver object:
kubectl create -f csidriver.yaml

# Deploy RBACs for sidecar containers and node plugins:
kubectl create -f csi-provisioner-rbac.yaml
kubectl create -f csi-nodeplugin-rbac.yaml

# Deploy Ceph configuration ConfigMap for CSI pods:
# (adjust this file for your environment)
kubectl create -f csi-config-map.yaml

# Deploy Ceph configuration ConfigMap for CSI pods:
kubectl create -f ../../ceph-conf.yaml   # adjust this file for your environment
```
```
cat csi-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: "ceph-csi-config"
data:
  config.json: |-
    [
      {
        "clusterID": "3e07d43f-688e-4284-bfb7-3e6ed5d3b77b",
        "monitors": [
          "192.168.3.160:6789"
        ]
      }
    ]

cat ../../ceph-conf.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  ceph.conf: |
    [global]
    fsid = 3e07d43f-688e-4284-bfb7-3e6ed5d3b77b
    mon initial members = ceph1
    mon host = 192.168.3.160
    public network = 192.168.3.1/24
    cluster network = 192.168.3.1/24
    auth cluster required = cephx
    auth service required = cephx
    auth client required = cephx
  # keyring is a required key and its value should be empty
  keyring: |
metadata:
  name: ceph-config
```
Replace the images in the YAML files with the registry.aliyuncs.com/google_containers mirror. Also comment out the kms-related sections in csi-rbdplugin-provisioner.yaml and csi-rbdplugin.yaml, otherwise the pods fail because no KMS is configured.
```
root@k8s-master1:~/ceph-csi/deploy/rbd/kubernetes# grep -C 3 kms csi-rbdplugin-provisioner.yaml
              readOnly: true
            - name: ceph-csi-config
              mountPath: /etc/ceph-csi-config/
#            - name: ceph-csi-encryption-kms-config
#              mountPath: /etc/ceph-csi-encryption-kms-config/
            - name: keys-tmp-dir
              mountPath: /tmp/csi/keys
            - name: ceph-config
--
        - name: ceph-csi-config
          configMap:
            name: ceph-csi-config
#        - name: ceph-csi-encryption-kms-config
#          configMap:
#            name: ceph-csi-encryption-kms-config
        - name: keys-tmp-dir
          emptyDir: {
            medium: "Memory"
--
          - serviceAccountToken:
              path: oidc-token
              expirationSeconds: 3600
              audience: ceph-csi-kms

root@k8s-master1:~/ceph-csi/deploy/rbd/kubernetes# grep -C 3 kms csi-rbdplugin.yaml
              readOnly: true
            - name: ceph-csi-config
              mountPath: /etc/ceph-csi-config/
#            - name: ceph-csi-encryption-kms-config
#              mountPath: /etc/ceph-csi-encryption-kms-config/
            - name: plugin-dir
              mountPath: /var/lib/kubelet/plugins
              mountPropagation: "Bidirectional"
--
        - name: ceph-csi-config
          configMap:
            name: ceph-csi-config
#        - name: ceph-csi-encryption-kms-config
#          configMap:
#            name: ceph-csi-encryption-kms-config
        - name: keys-tmp-dir
          emptyDir: {
            medium: "Memory"
--
          - serviceAccountToken:
              path: oidc-token
              expirationSeconds: 3600
              audience: ceph-csi-kms
---
```
```
# Deploy CSI sidecar containers:
kubectl create -f csi-rbdplugin-provisioner.yaml
# Deploy RBD CSI driver:
kubectl create -f csi-rbdplugin.yaml

# check
root@k8s-master1:~# kubectl get all
NAME                                            READY   STATUS    RESTARTS        AGE
pod/csi-rbdplugin-gkr6x                         3/3     Running   37 (106m ago)   15d
pod/csi-rbdplugin-kv8mh                         3/3     Running   33 (106m ago)   15d
pod/csi-rbdplugin-m8wq5                         3/3     Running   30 (106m ago)   15d
pod/csi-rbdplugin-provisioner-7668ddb98-cvww6   7/7     Running   75 (106m ago)   15d
pod/csi-rbdplugin-provisioner-7668ddb98-rdrjn   7/7     Running   73 (106m ago)   15d
pod/csi-rbdplugin-provisioner-7668ddb98-s9jqm   7/7     Running   72 (106m ago)   15d

NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/csi-metrics-rbdplugin       ClusterIP   10.10.166.217   <none>        8080/TCP   15d
service/csi-rbdplugin-provisioner   ClusterIP   10.10.224.247   <none>        8080/TCP   15d
service/kubernetes                  ClusterIP   10.10.0.1       <none>        443/TCP    20d

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/csi-rbdplugin   3         3         3       3            3           <none>          15d

NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/csi-rbdplugin-provisioner   3/3     3            3           15d

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/csi-rbdplugin-provisioner-7668ddb98   3         3         3       15d
```
- Connect to the Ceph cluster: configure the Secret and the StorageClass. Creating the Ceph keyring and the RBD pool is not covered again here.
```
# Create the Kubernetes secret
root@k8s-master1:~# cat csi-rbd-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: default
stringData:
  userID: k8s-rbd
  userKey: AQCtwxxxxxYHLA==

$ kubectl apply -f csi-rbd-secret.yaml

# Create the StorageClass
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: 3e07d43f-xxxx-3e6ed5d3b77b
  pool: rbd
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - discard
```
- Create a test PVC and pod, using the official examples directly (a minimal PVC sketch follows below).
```
cd ceph-csi/examples/rbd
kubectl create -f pvc.yaml
kubectl create -f pod.yaml
```
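For reference, the upstream example boils down to a PVC like the following (the claim name here is illustrative); after applying it, the claim should go Bound and a matching csi-vol-* image should appear in the rbd pool:

```
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
EOF

kubectl get pvc rbd-pvc    # STATUS should become Bound
rbd ls rbd                 # run on the ceph node: a csi-vol-* image should exist
```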
Deploying MetalLB + ingress networking on Kubernetes
- Kubernetes has many ingress controller implementations; ingress-nginx is the most common choice and is used here.
- The ingress service needs a load balancer behind it, which Kubernetes itself does not provide. Public clouds usually supply their own LB; in this test environment MetalLB fills that role.
- Without an LB, a directly deployed ingress controller's EXTERNAL-IP stays in pending forever.
- In an earlier attempt, the L2Advertisement resource was missing from ip-pool.yaml, so MetalLB never answered external ARP requests and the services were unreachable from outside the cluster.
- References:
- Enable strict ARP mode, then install MetalLB.
```
kubectl edit configmap -n kube-system kube-proxy
```
```
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true
```
```
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.4/config/manifests/metallb-native.yaml
```
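If you would rather not edit the ConfigMap interactively, the MetalLB documentation suggests patching strictARP with sed; roughly:

```
kubectl get configmap kube-proxy -n kube-system -o yaml | \
  sed -e "s/strictARP: false/strictARP: true/" | \
  kubectl apply -f - -n kube-system
# Confirm the MetalLB pods come up
kubectl get pods -n metallb-system
```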
- Configure the IP address pool and advertise it to the network.
```
# IP address pool
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ip-pool
  namespace: metallb-system
spec:
  addresses:
  # Note: this range must not overlap with the cluster's own address space
  - 192.168.3.237-192.168.3.240
---
# Layer 2 advertisement
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: l2adver
  namespace: metallb-system
```
```
kubectl apply -f ip-pool.yaml
```
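A quick check that both resources were created in the right namespace:

```
kubectl get ipaddresspools.metallb.io,l2advertisements.metallb.io -n metallb-system
kubectl describe ipaddresspools.metallb.io ip-pool -n metallb-system
```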
- Install ingress-nginx, changing externalTrafficPolicy: Local to externalTrafficPolicy: Cluster in the manifest.
```
# ingress-nginx-deploy.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.7.1
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  externalTrafficPolicy: Cluster
```
```
kubectl apply -f ingress-nginx-deploy.yaml
```
- Check the result.
```
root@k8s-master1:~# kubectl get svc ingress-nginx-controller -n ingress-nginx
NAME                       TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
ingress-nginx-controller   LoadBalancer   10.10.19.123   192.168.3.237   80:32362/TCP,443:31935/TCP   6d23h
```
- Deploy a test case.
```
root@k8s-master1:~# cat ingress-web.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-host-bar
spec:
  ingressClassName: nginx
  rules:
  - host: "hello.flyfish.com"
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: hello-server
            port:
              number: 8000
  - host: "demo.flyfish.com"
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: nginx-demo
            port:
              number: 8000

root@k8s-master1:~# cat ingress-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-server
  template:
    metadata:
      labels:
        app: hello-server
    spec:
      containers:
      - name: hello-server
        image: registry.cn-hangzhou.aliyuncs.com/lfy_k8s_images/hello-server
        ports:
        - containerPort: 9000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-demo
  name: nginx-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-demo
  template:
    metadata:
      labels:
        app: nginx-demo
    spec:
      containers:
      - image: nginx
        name: nginx
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx-demo
  name: nginx-demo
spec:
  selector:
    app: nginx-demo
  ports:
  - port: 8000
    protocol: TCP
    targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: hello-server
  name: hello-server
spec:
  selector:
    app: hello-server
  ports:
  - port: 8000
    protocol: TCP
    targetPort: 9000
```
```
# Create the test pods and services
kubectl apply -f ingress-test.yaml
# Create the test ingress
kubectl apply -f ingress-web.yaml
# On a machine outside the cluster, add a hosts entry
192.168.3.237 hello.flyfish.com demo.flyfish.com
# Test with curl
curl http://hello.flyfish.com
curl http://demo.flyfish.com
```
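If editing hosts files is inconvenient, curl can fake the DNS resolution on its own; either of the following hits the MetalLB-assigned IP directly:

```
# Resolve the test hostnames to the LoadBalancer IP without touching /etc/hosts
curl --resolve hello.flyfish.com:80:192.168.3.237 http://hello.flyfish.com
curl -H "Host: demo.flyfish.com" http://192.168.3.237
```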