K8S - 安装 - 个人技术文章分享

Master

Master上的etcd、kube-apiserver、kube-controller-manager、kube-scheduler服务

Node

Node 上的 kubelet、kube-proxy服务

准备

swapoff -a
vim /etc/fstab
# 注释以下这行
#/dev/mapper/centos-swap swap                    swap    defaults        0 0

# 查看是否修改成功
free -m 

# docker Cgroup Driver: cgroupfs

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system

systemctl daemon-reload
systemctl restart kubelet

环境要求

内存至少2G

关闭swap

# 临时关闭
swapoff -a
# 永久关闭
vim /etc/fstab
# 注释以下行
# /dev/mapper/cl-swap     swap                    swap    defaults        0 0

# 节点机器(临时有效)
$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
节点机器(重启有效)
$ vim /etc/rc.d/rc.local
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
chmod +x /etc/rc.d/rc.local

为各节点修改主机名 # 查看主机名 hostnamectl # 修改主机名(永久有效,需要重启) hostnamectl set-hostname k8s-master hostnamectl set-hostname k8s-node1 hostnamectl set-hostname k8s-node2

安装 `kubelet kubeadm kubectl`

cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-$basearch
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF

cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

systemctl enable kubelet.service

创建集群

1、如果能访问外网，使用如下：

kubeadm init \
--control-plane-endpoint "192.168.3.99:6443" \
--pod-network-cidr=192.168.0.0/16 \
--apiserver-advertise-address=192.168.3.99

2、如果不能访问外网，使用如下：

kubeadm init --image-repository registry.aliyuncs.com/google_containers \
--control-plane-endpoint "192.168.3.99:6443" \
--pod-network-cidr=192.168.0.0/16 \
--apiserver-advertise-address=192.168.3.99

安装Docker

# Set up repository
sudo yum install -y yum-utils device-mapper-persistent-data lvm2

# Use Aliyun Docker
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# 展示可安装版本
yum list docker-ce --showduplicates
# 安装指定版本
yum install docker-ce-20.10.12-3.el7

systemctl enable docker && systemctl start docker

安装 `kubeadm` `kubelet` `kubectl`

# 设置安装源
cat > /etc/yum.repos.d/kubernetes.repo<<EOF
[kubernetes]
name=Kubernetes Repo
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
enabled=1
EOF

# 查看安装源是否添加成功
yum repolist

# 安装
yum install -y kubelet-1.23.1 kubeadm-1.23.1 kubectl-1.23.1
# 设置自启动
systemctl enable kubelet

Master节点

kubeadm init \
--apiserver-advertise-address=10.0.0.10 \
--image-repository=registry.aliyuncs.com/google_containers \
--kubernetes-version=v1.23.1 \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16

可能遇到的问题:

swap交换分区未禁用

docker启动模式为cgroup而官方推荐使用systemd

安装失败重置kubeadm reset

安装完成后显示

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.0.10:6443 --token m5o6mu.ih83y4t0n18f6lnr \
        --discovery-token-ca-cert-hash sha256:e72fdc1952c7d5407b2dd9b02e37abc399e43792ad6213ceeaea414da2d86720

kubeadm token generate

kubeadm token create wdajx7.cgd203zp0bld21bw --print-join-command --ttl=0 #根据token输出添加命令

安装flannel网络

$ kubectl apply -f https://raw.githubusercontent.com/mrlxxx/kube-flannel.yml/master/kube-flannel.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged configured
serviceaccount/flannel unchanged
configmap/kube-flannel-cfg unchanged
daemonset.apps/kube-flannel-ds-amd64 unchanged
daemonset.apps/kube-flannel-ds-arm64 unchanged
daemonset.apps/kube-flannel-ds-arm unchanged
daemonset.apps/kube-flannel-ds-ppc64le unchanged
daemonset.apps/kube-flannel-ds-s390x unchanged
unable to recognize "https://raw.githubusercontent.com/mrlxxx/kube-flannel.yml/master/kube-flannel.yml": no matches for kind "ClusterRole" in version "rbac.authorization.k8s.io/v1beta1"
unable to recognize "https://raw.githubusercontent.com/mrlxxx/kube-flannel.yml/master/kube-flannel.yml": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"

可能出现ping quay.io不通的情况,检测DNS是否正常
> $ vim /etc/resolv.conf
> nameserver 114.114.114.114
> nameserver 8.8.8.8
> ```
>
> 修改文件后,重新拉取
$ kubectl get pods -n kube-system $ docker pull quay.io/coreos/flannel:v0.12.0-amd64 $ kubectl logs kube-flannel-ds-amd64-2zf8n -n kube-system I0118 17:20:12.219469 1 main.go:518] Determining IP address of default interface I0118 17:20:12.234351 1 main.go:531] Using interface with name enp0s3 and address 10.0.2.15 I0118 17:20:12.234463 1 main.go:548] Defaulting external address to interface address (10.0.2.15) W0118 17:20:12.234530 1 client_config.go:517] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work. E0118 17:20:12.434664 1 main.go:243] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-amd64-2zf8n': pods "kube-flannel-ds-amd64-2zf8n" is forbidden: User "system:serviceaccount:kube-system:flannel" cannot get resource "pods" in API group "" in the namespace "kube-system"
原因：`DaemonSet`、`Deployment`、`StatefulSet` 和 `ReplicaSet` 在 v1.16 中将不再从 `extensions/v1beta1`、`apps/v1beta1` 或 `apps/v1beta2` 提供服务

解决方法是：

>  将yml配置文件内的api接口修改为 `apps/v1`  导致原因为之间使用的kubernetes 版本是`1.15.x`版本，`1.16.x 及以上版本`放弃部分API支持 

所以有两种办法

1. 使用旧版本的k8s
2. 修改配置

即上面配置中的

```javascript
apiVersion: extensions/v1beta1

修改为

apiVersion: apps/v1

把

apiVersion: rbac.authorization.k8s.io/v1beta1

修改为

apiVersion: rbac.authorization.k8s.io/v1

加入节点

加入节点前调整docker的运行模式,修改/etc/docker/daemon.json文件: { "exec-opts": ["native.cgroupdriver=systemd"] }

重启docker

   systemctl daemon-reload && systemctl restart docker

加入节点 kubeadm join 10.0.0.10:6443 --token m48gg6.5kg9bpix6r6d2fiv --discovery-token-ca-cert-hash sha256:af4ac296852c87c30fa9a5064ff580d8d43e794334bf9eaedc417386d7a3cbf0

如果出现以下错误:
   > [ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
   > [ERROR Port-10250]: Port 10250 is in use
   > [ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
   > ```
   >
   > 执行以下命令后再重新加入则可
   >
   > ```
   > kubeadm reset
   > ```
   >
   > 结果出现
   >
   > ```
   > Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
   > ```
   >
   > 表示加入成功,可以到master节点中执行`kubectl get nodes`查看节点状态

#### 问题
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s [kubelet-check] Initial timeout of 40s passed. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get Unfortunately, an error has occurred: timed out waiting for the condition

    This error is likely caused by:
            - The kubelet is not running
            - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
            - 'systemctl status kubelet'
            - 'journalctl -xeu kubelet'

    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.

    Here is one example how you may list all Kubernetes containers running in docker:
            - 'docker ps -a | grep kube | grep -v pause'
            Once you have found the failing container, you can inspect its logs with:
            - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster To see the stack trace of this error execute with --v=5 or higher


> 执行`journalctl -xeu kubelet`找到错误日志:
>
> `"Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""`

修改`/etc/docker/daemon.json`文件:

{ "exec-opts": ["native.cgroupdriver=systemd"] }


重启docker

systemctl daemon-reload && systemctl restart docker ```