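Starting the virtual machines in VirtualBox
Minimum requirements for K8S: 4 GB+ of RAM and 2+ CPU cores
Network preparation
Enable two network adapters: a bridged adapter so that the host and the virtual machines can reach each other, and a **Network Address Translation (NAT)** adapter for Internet access. Once Ubuntu has booted, configure both adapters in /etc/netplan/00-installer-config.yaml: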
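# This is the network config written by 'subiquity'
network:
  ethernets:
    enp0s3: # `ip a` lists both adapters; adjust the name to match your system
      dhcp4: no
      addresses: [192.168.56.101/24] # static IP for this node; it must fall within the adapter's subnet
    enp0s8: # `ip a` lists both adapters; adjust the name to match your system
      dhcp4: no
      addresses: [10.0.2.101/24] # static IP for this node; it must fall within the adapter's subnet
      gateway4: 10.0.2.1 # gateway for this adapter; this is essential, otherwise packets cannot be routed out
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4] # DNS servers, so that external domain names resolve
  version: 2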
After saving the file, run sudo netplan apply for the change to take effect.
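The comments in the netplan file require each static address to sit in its adapter's subnet, and the gateway must be on-link as well. A minimal sketch of that check follows; `ip_to_int` and `same_subnet` are hypothetical helper names (not part of netplan), and the addresses are the ones used in the example above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of netplan: verify that two IPv4 addresses
# share a network for a given prefix length.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# same_subnet ADDR1 ADDR2 PREFIX: succeeds when both addresses share the network.
same_subnet() {
  mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
  a=$(ip_to_int "$1")
  b=$(ip_to_int "$2")
  [ $(( a & mask )) -eq $(( b & mask )) ]
}

# Values taken from the enp0s8 stanza above.
if same_subnet 10.0.2.101 10.0.2.1 24; then
  echo "gateway 10.0.2.1 is on the same /24 as 10.0.2.101"
else
  echo "gateway is not reachable on-link; check addresses and gateway4"
fi
```

The sketch only covers IPv4 and assumes octets without leading zeros.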
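Installing Docker
There are many ways to install Docker; see the official documentation for details. The steps below use the one-step convenience script: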
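# Download the install script
curl -fsSL https://get.docker.com -o get-docker.sh
# --version pins the Docker version to install
sudo sh get-docker.sh --mirror Aliyun --version 19.03
# Enable the docker service at boot and start it now
sudo systemctl enable docker
sudo systemctl start docker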
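Since the script is asked for a specific release, it can be worth confirming what was actually installed. The sketch below parses the one-line output of `docker --version`; `docker_version_matches` is a hypothetical helper name, and the sample string only illustrates the output format.

```shell
#!/bin/sh
# Hypothetical helper: read `docker --version` output on stdin and succeed
# when the installed version equals, or is a patch release of, the wanted one.
docker_version_matches() {
  wanted=$1
  # "Docker version 19.03.15, build 99e3ed8" -> "19.03.15"
  ver=$(sed -n 's/^Docker version \([^,]*\),.*/\1/p')
  case $ver in
    "$wanted" | "$wanted".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustrative sample line; on a real node pipe in `docker --version` instead.
if echo 'Docker version 19.03.15, build 99e3ed8' | docker_version_matches 19.03; then
  echo "Docker 19.03.x is installed"
fi
```

On a node this would be used as `docker --version | docker_version_matches 19.03`.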
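Installing K8S
Installing the components
kubelet and kubeadm must be installed on the master and on every worker node.
See the official documentation for the full installation procedure; the key steps are extracted below: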
- Update the apt package index and install the packages needed to use the Kubernetes apt repository:
  sudo apt-get update
  # apt-transport-https may be a dummy package; if so, you can skip installing it
  sudo apt-get install -y apt-transport-https ca-certificates curl gpg
- Add the GPG key of the Kubernetes APT repository:
  curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /usr/share/keyrings/kubernetes-archive-keyring.gpg
  # In mainland China, add the signing key from the Aliyun mirror
  curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
- Add the Kubernetes apt repository:
  echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list > /dev/null
  # In mainland China, add the Aliyun mirror as a source
  sudo add-apt-repository "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main"
- Update the apt package index, install kubelet, kubeadm and kubectl, and pin their versions:
  sudo apt-get update
  # A release older than 1.24 is chosen here so that cri-dockerd does not have to be installed
  #sudo apt-get install -y kubelet=1.22.0-00 kubeadm=1.22.0-00 kubectl=1.22.0-00
  sudo apt-get install -y kubelet=1.11.0-00 kubeadm=1.11.0-00 kubectl=1.11.0-00 kubernetes-cni=0.6.0-00
  # Without a pinned version, cri-dockerd has to be installed as well
  #sudo apt-get install -y kubelet kubeadm kubectl
  # Hold the packages at the installed version
  #sudo apt-mark hold kubelet kubeadm kubectl
  sudo apt-mark hold kubelet kubeadm kubectl kubernetes-cni
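The choice of a pre-1.24 release matters because Kubernetes removed its built-in Docker integration (dockershim) in 1.24, so Docker needs the separate cri-dockerd adapter from that release on. A minimal sketch of that decision, assuming package versions in the `1.22.0-00` form used above; `needs_cri_dockerd` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the given kubelet package version
# is 1.24 or newer, i.e. when Docker needs cri-dockerd as its CRI adapter.
needs_cri_dockerd() {
  version=${1%%-*}        # strip the Debian revision: 1.22.0-00 -> 1.22.0
  major=${version%%.*}    # 1
  rest=${version#*.}      # 22.0
  minor=${rest%%.*}       # 22
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 24 ]; }
}

for v in 1.11.0-00 1.22.0-00 1.24.0-00; do
  if needs_cri_dockerd "$v"; then
    echo "$v: install cri-dockerd alongside Docker"
  else
    echo "$v: Docker works through the built-in dockershim"
  fi
done
```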
Building the cluster
Initializing the master node
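# --apiserver-advertise-address must be an IP of the master that the workers can reach
# (192.168.56.101 in the netplan example above, matching the join output below)
sudo kubeadm init \
--apiserver-advertise-address=192.168.56.101 \
--pod-network-cidr=10.244.0.0/16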
# The command prints output like the following; save it so that new nodes can join the cluster later
kubeadm join 192.168.56.101:6443 --token 1iwx1m.r80xwy5mtil99pcr --discovery-token-ca-cert-hash sha256:1baf8447f18945b204c43cba9d36d17e2c7bb57cf5bf0b4c03f33fa20f1f4b78
# ubuntu003
kubeadm join 192.168.57.101:6443 --token 51n2um.mf2leydcsfc8q7ey --discovery-token-ca-cert-hash sha256:be1d81ab25802d18171e696d72b62a876cbd40faca7beaf27ef03f5ccf305fd8
# Copy the admin kubeconfig into $HOME
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
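Because the join line has to be kept for later, it can help to pull the individual values out of the saved text, for example to check which token a note refers to. The sketch below does that with plain word splitting; `join_flag_value` is a hypothetical helper name, and the sample line is the first join command shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the value that follows FLAG in a list of words.
# Usage: join_flag_value FLAG word...
join_flag_value() {
  flag=$1
  shift
  while [ $# -gt 0 ]; do
    if [ "$1" = "$flag" ]; then
      echo "$2"
      return 0
    fi
    shift
  done
  return 1   # flag not present
}

saved='kubeadm join 192.168.56.101:6443 --token 1iwx1m.r80xwy5mtil99pcr --discovery-token-ca-cert-hash sha256:1baf8447f18945b204c43cba9d36d17e2c7bb57cf5bf0b4c03f33fa20f1f4b78'

# $saved is left unquoted on purpose so that it splits into words.
echo "token: $(join_flag_value --token $saved)"
echo "hash:  $(join_flag_value --discovery-token-ca-cert-hash $saved)"
```

Bootstrap tokens expire after 24 hours by default; if the saved line is no longer accepted, running `kubeadm token create --print-join-command` on the master prints a fresh one.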
Troubleshoot
- While the cluster is being initialized, the kubelet process never manages to start. Inspecting its log with journalctl -xefu kubelet shows: "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
- Run sudo docker info | grep -i cgroup to see which cgroup driver Docker is actually using; here it should be cgroupfs.
- Edit /var/lib/kubelet/kubeadm-flags.env and add the --cgroup-driver=cgroupfs flag to the KUBELET_KUBEADM_ARGS variable.
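- After the change, check with systemctl status kubelet that the kubelet process is active (run sudo systemctl restart kubelet first if it has not picked up the new flag).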
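The manual steps above can be sketched as two small text filters. This is an illustration under assumptions, not the author's procedure: `docker_cgroup_driver` and `add_cgroup_flag` are hypothetical helper names, and the sample `KUBELET_KUBEADM_ARGS` line is made up to show the file's shape.

```shell
#!/bin/sh
# Hypothetical helper: read `docker info` output on stdin and print the
# value of its "Cgroup Driver:" line.
docker_cgroup_driver() {
  sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p'
}

# Hypothetical helper: read kubeadm-flags.env content on stdin and append
# --cgroup-driver=DRIVER inside the KUBELET_KUBEADM_ARGS="..." value.
# Lines that already carry the flag are passed through unchanged.
add_cgroup_flag() {
  driver=$1
  sed -e '/--cgroup-driver=/!s/^\(KUBELET_KUBEADM_ARGS=".*\)"$/\1 --cgroup-driver='"$driver"'"/'
}

# Illustrative inputs; on a real node use `sudo docker info` and the
# contents of /var/lib/kubelet/kubeadm-flags.env instead.
driver=$(printf ' Cgroup Driver: cgroupfs\n' | docker_cgroup_driver)
printf '%s\n' 'KUBELET_KUBEADM_ARGS="--network-plugin=cni"' | add_cgroup_flag "$driver"
```

Writing the filtered text to a temporary file and reviewing it before replacing kubeadm-flags.env keeps the edit easy to undo.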
Installing the Pod network add-on (flannel)
# k8s
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# k8s v1.11
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml
When this is done, kubectl get pods --all-namespaces should show that the kube-flannel-ds-xxx and coredns-xxx pods have started successfully.
$ kubectl get pods --all-namespaces
NAMESPACE      NAME                       READY   STATUS    RESTARTS       AGE
kube-flannel   kube-flannel-ds-kljnw      1/1     Running   1 (114m ago)   127m
kube-flannel   kube-flannel-ds-rrzcp      1/1     Running   1 (114m ago)   120m
kube-system    coredns-6d4b75cb6d-gx2pj   1/1     Running   1 (114m ago)   131m
kube-system    coredns-6d4b75cb6d-n84gd   1/1     Running   1 (114m ago)   131m
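Rather than reading the table by eye, the STATUS column can be filtered. The sketch below assumes the six-column layout shown above (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE); `pods_not_running` is a hypothetical helper name, and the sample table is made up to include one pending pod.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --all-namespaces` output on stdin
# and print namespace/name for every pod whose STATUS is not "Running".
# Note: pods of finished jobs (STATUS "Completed") are listed as well.
pods_not_running() {
  awk 'NR > 1 && $4 != "Running" { print $1 "/" $2 }'
}

# Illustrative sample; on a real cluster pipe in the kubectl output instead.
pods_not_running <<'EOF'
NAMESPACE      NAME                       READY   STATUS    RESTARTS       AGE
kube-flannel   kube-flannel-ds-kljnw      1/1     Running   1 (114m ago)   127m
kube-system    coredns-6d4b75cb6d-gx2pj   0/1     Pending   0              131m
EOF
```

On the master this would be used as `kubectl get pods --all-namespaces | pods_not_running`; an empty result means every pod is Running. CoreDNS pods typically stay Pending until a Pod network add-on such as flannel is running.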
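# Make the admin kubeconfig available to kubectl (the same steps as after kubeadm init)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Alternatively, when running as root, point kubectl at the admin kubeconfig directly
export KUBECONFIG=/etc/kubernetes/admin.conf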
Joining new nodes
# Use the token/hash information printed when the master/cluster was initialized; copy the command and run it as-is
$ kubeadm join 192.168.56.3:6443 --token y95vm1.jb705h1i0s0bcy1j --discovery-token-ca-cert-hash sha256:210b042c21aa09fcf65dc152581f78532d8d8e17dfd191cb3a440441dec28d80
Results & troubleshooting
If kubectl cluster-info prints the following, the cluster has been set up successfully:
Kubernetes control plane is running at https://192.168.56.3:6443
CoreDNS is running at https://192.168.56.3:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
kubectl get nodes lists the nodes in the cluster.
- Runtime CRI misconfiguration: containerd, the runtime that kubeadm init talks to here, reads /etc/containerd/config.toml by default. If you run into the following error:
$ sudo kubeadm init --config=config.yaml
W1125 12:58:32.733485 26426 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.4
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: time="2020-11-25T12:58:32Z" level=fatal msg="getting status of runtime failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService" , error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
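It can be fixed by removing the default config file (its single disable line is what causes the problem):
# The packaged config contains disabled_plugins = ["cri"], which is the likely culprit; delete the file to fall back to the built-in defaults
sudo rm /etc/containerd/config.toml
sudo systemctl restart containerd
sudo kubeadm init <args>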
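Before deleting the file it may be worth confirming that it really disables the CRI plugin. A minimal sketch, assuming the setting appears on a single uncommented line as in the packaged config; `cri_disabled` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the given containerd config file has an
# uncommented disabled_plugins line that lists "cri".
cri_disabled() {
  grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$1"
}

config=/etc/containerd/config.toml
if [ -f "$config" ] && cri_disabled "$config"; then
  echo "$config disables the CRI plugin; kubeadm cannot use containerd as-is"
else
  echo "no disabled CRI plugin found in $config"
fi
```

If the check does not match, the preflight error most likely has a different cause and removing the file is unlikely to help.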