
This post is a guide to setting up a K8s cluster on Ubuntu 22.04. It covers installing and configuring containerd as the container runtime, using WireGuard to build a K8s cluster across VPCs/cloud providers, installing the K8s components, initializing the control plane and joining nodes with kubeadm, installing a cluster network plugin, and deploying the Kubernetes Dashboard.

For reasons of space, many commands are not explained in detail. Corrections are welcome!

0. Notes 😃

  • Environment: Ubuntu 22.04, K8s v1.26.3, containerd 1.6.19
  • Aliyun mirrors are used throughout; feel free to swap in the upstream sources if you don't need them
  • To avoid littering the text with sudo, all commands in this post are assumed to run as root

1. [All nodes] Container runtime 🐳

Installation 🛠️

Before anything else, we need to install a container runtime. We'll go with containerd here; Docker isn't chosen because it does not implement the CRI (Container Runtime Interface), so it would require installing and configuring an extra shim service (cri-dockerd), which is more hassle.

# Install the prerequisite packages
apt update && apt install -y apt-transport-https ca-certificates curl gnupg lsb-release
# Install the repository signing key
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Add the repository
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install containerd
apt update && apt install -y containerd.io
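
A quick sanity check that the binary is in place before moving on:

containerd --version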

Configuration ⚙️

Create the file /etc/systemd/system/cri-containerd.service:

[Unit]
Description=containerd container runtime for kubernetes
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/bin/containerd --config /etc/cri-containerd/config.toml

Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment out TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target

Create the file /etc/cri-containerd/config.toml:

version = 2
# persistent data location
root = "/var/lib/cri-containerd"
# runtime state information
state = "/run/cri-containerd"
plugin_dir = ""
disabled_plugins = []
required_plugins = []
# set containerd's OOM score
oom_score = 0

[grpc]
  address = "/run/cri-containerd/cri-containerd.sock"
  tcp_address = ""
  tcp_tls_cert = ""
  tcp_tls_key = ""
  # socket uid
  uid = 0
  # socket gid
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[debug]
  address = ""
  format = "json"
  uid = 0
  gid = 0
  level = ""

[metrics]
  address = "127.0.0.1:1338"
  grpc_histogram = false

[cgroup]
  path = ""

[timeouts]
  "io.containerd.timeout.shim.cleanup" = "5s"
  "io.containerd.timeout.shim.load" = "5s"
  "io.containerd.timeout.shim.shutdown" = "3s"
  "io.containerd.timeout.task.state" = "2s"

[plugins]
  [plugins."io.containerd.gc.v1.scheduler"]
    pause_threshold = 0.02
    deletion_threshold = 0
    mutation_threshold = 100
    schedule_delay = "0s"
    startup_delay = "100ms"
  [plugins."io.containerd.grpc.v1.cri"]
    disable_tcp_service = true
    stream_server_address = "127.0.0.1"
    stream_server_port = "0"
    stream_idle_timeout = "4h0m0s"
    enable_selinux = false
    selinux_category_range = 1024
    sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9"
    stats_collect_period = 10
    # systemd_cgroup = false
    enable_tls_streaming = false
    max_container_log_line_size = 16384
    disable_cgroup = false
    disable_apparmor = false
    restrict_oom_score_adj = false
    max_concurrent_downloads = 3
    disable_proc_mount = false
    unset_seccomp_profile = ""
    tolerate_missing_hugetlb_controller = true
    disable_hugetlb_controller = true
    ignore_image_defined_volumes = false
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      no_pivot = false
      disable_snapshot_annotations = false
      discard_unpacked_layers = false
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          pod_annotations = []
          container_annotations = []
          privileged_without_host_devices = false
          base_runtime_spec = ""
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            # SystemdCgroup enables systemd cgroups.
            SystemdCgroup = true
            # BinaryName is the binary name of the runc binary.
            # BinaryName = "runc"
            # BinaryName = "crun"
            # NoPivotRoot disables pivot root when creating a container.
            # NoPivotRoot = false

            # NoNewKeyring disables new keyring for the container.
            # NoNewKeyring = false

            # ShimCgroup places the shim in a cgroup.
            # ShimCgroup = ""

            # IoUid sets the I/O's pipes uid.
            # IoUid = 0

            # IoGid sets the I/O's pipes gid.
            # IoGid = 0

            # Root is the runc root directory.
            Root = ""

            # CriuPath is the criu binary path.
            # CriuPath = ""

            # CriuImagePath is the criu image path
            # CriuImagePath = ""

            # CriuWorkPath is the criu work path.
            # CriuWorkPath = ""
    [plugins."io.containerd.grpc.v1.cri".cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"
      max_conf_num = 1
      conf_template = ""
    [plugins."io.containerd.grpc.v1.cri".registry]
      config_path = "/etc/cri-containerd/certs.d"
      [plugins."io.containerd.grpc.v1.cri".registry.headers]
        # Foo = ["bar"]
    [plugins."io.containerd.grpc.v1.cri".image_decryption]
      key_model = ""
    [plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
      tls_cert_file = ""
      tls_key_file = ""
  [plugins."io.containerd.internal.v1.opt"]
    path = "/opt/cri-containerd"
  [plugins."io.containerd.internal.v1.restart"]
    interval = "10s"
  [plugins."io.containerd.metadata.v1.bolt"]
    content_sharing_policy = "shared"
  [plugins."io.containerd.monitor.v1.cgroups"]
    no_prometheus = false
  [plugins."io.containerd.runtime.v2.task"]
    platforms = ["linux/amd64"]
  [plugins."io.containerd.service.v1.diff-service"]
    default = ["walking"]
  [plugins."io.containerd.snapshotter.v1.devmapper"]
    root_path = ""
    pool_name = ""
    base_image_size = ""
    async_remove = false

Start 🚀

systemctl daemon-reload
systemctl enable cri-containerd
systemctl start cri-containerd
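
It's worth checking that the runtime is up and answering on its socket before continuing (ctr ships with the containerd.io package):

systemctl status cri-containerd
ctr --address /run/cri-containerd/cri-containerd.sock version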

2. [Optional] Network configuration 🌐

If all of your nodes are already on the same private network, congratulations, you can jump straight to the next step. If, like mine, your nodes are not on one network and you want to build a K8s cluster across VPCs/cloud providers, follow the steps below to put all the nodes on a single network with WireGuard.

[All nodes] Installation 🛠️

apt update && apt install -y wireguard
wg genkey | tee /etc/wireguard/private.key # generate the private key
cat /etc/wireguard/private.key | wg pubkey | tee /etc/wireguard/public.key # derive the public key
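
These key files are created with your default umask and may end up world-readable; since they hold secrets, tighten the permissions:

chmod 600 /etc/wireguard/private.key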

[Master] Configuration ⚙️

Create the file /etc/wireguard/wg0.conf:

[Interface]
PrivateKey = xxxxx # the master node's private key goes here
Address = 10.8.0.1/24
ListenPort = 51820
SaveConfig = true
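# Note: the rules below assume the public interface is eth0; adjust if yours differs.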
PostUp = ufw route allow in on wg0 out on eth0
PostUp = iptables -t nat -I POSTROUTING -o eth0 -j MASQUERADE
PreDown = ufw route delete allow in on wg0 out on eth0
PreDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE

Edit /etc/sysctl.conf and add the line below:

net.ipv4.ip_forward=1

Apply the change and enable the WireGuard service:

sysctl -p
systemctl enable wg-quick@wg0
systemctl start wg-quick@wg0

[Other nodes] Configuration ⚙️

Create the file /etc/wireguard/wg0.conf:

[Interface]
PrivateKey = xxxxx # this node's private key goes here
Address = 10.8.0.2/24 # give each node its own address: 10.8.0.2, 10.8.0.3, ...

[Peer]
PublicKey = xxxxx # the master node's public key goes here
AllowedIPs = 10.8.0.0/24
Endpoint = <ip>:51820 # the master node's public IP
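
If a node sits behind NAT, it also helps to add a keepalive to its [Peer] section so the tunnel stays open; 25 seconds is the interval suggested by the WireGuard documentation:

PersistentKeepalive = 25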

Enable the WireGuard service:

systemctl enable wg-quick@wg0
systemctl start wg-quick@wg0

On the master node, add this node as a peer:

wg set wg0 peer <this node's public key> allowed-ips 10.8.0.2

Once that's done, you can run wg on any node to check the connection status.

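A quick way to confirm the tunnel actually carries traffic is to ping the master's tunnel address (10.8.0.1, as configured above) from another node:

ping -c 3 10.8.0.1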

3. [All nodes] K8s components 🛠️

Install kubeadm, kubelet, and kubectl:

curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF | tee /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt update && apt install -y kubelet kubeadm kubectl
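
The official kubeadm docs also recommend pinning these packages so a routine apt upgrade doesn't move the cluster to an unexpected version:

apt-mark hold kubelet kubeadm kubectl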

Adjust the required kernel parameters and apply them:

cat <<EOF | tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
modprobe br_netfilter
sysctl --system
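
Note that modprobe only loads br_netfilter for the current boot. To load it again automatically after a reboot (the official kubeadm setup does the same), register it with systemd-modules-load:

echo br_netfilter | tee /etc/modules-load.d/k8s.conf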

If you use WireGuard for the cluster network, or otherwise need to set the node IP manually, edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and append --node-ip=<IP> to the line beginning ExecStart=/usr/bin/kubelet ...
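
Alternatively, that same drop-in sources /etc/default/kubelet, so you can put the flag there instead of editing the unit file. A minimal sketch, with 10.8.0.2 standing in for this node's WireGuard address:

echo 'KUBELET_EXTRA_ARGS=--node-ip=10.8.0.2' | tee /etc/default/kubelet
systemctl daemon-reload && systemctl restart kubelet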

4. [Master] Control plane 🤖

Initialize the control plane with kubeadm:

kubeadm init \
      --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
      --pod-network-cidr 10.244.0.0/16 \
      --cri-socket /run/cri-containerd/cri-containerd.sock \
      --v 5

If you use WireGuard, or otherwise need to specify the node IP manually, add the --apiserver-advertise-address=<IP> flag.

On success, kubeadm prints a summary that includes the commands for setting up kubectl access as well as the kubeadm join command for the worker nodes.


Following the hints in that output, set up access to the cluster:

# as root
export KUBECONFIG=/etc/kubernetes/admin.conf
# as a regular user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

From here on we can manage the cluster with kubectl!

5. [Other nodes] Join the cluster 👥

To join the cluster, simply run the kubeadm join command printed in the previous step on each of the other nodes, adding the --cri-socket /run/cri-containerd/cri-containerd.sock flag to pin the container runtime. A sketch of what it looks like follows.
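
For reference, the join command has roughly this shape; the token and hash are placeholders, and if you've lost the original output, running kubeadm token create --print-join-command on the master regenerates it:

kubeadm join <master-ip>:6443 --token <token> \
      --discovery-token-ca-cert-hash sha256:<hash> \
      --cri-socket /run/cri-containerd/cri-containerd.sock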

Once they've joined, we can check the cluster's nodes with kubectl get nodes.


We can also inspect the containers actually running on each node with crictl: CONTAINER_RUNTIME_ENDPOINT=unix:///run/cri-containerd/cri-containerd.sock crictl ps


6. [Master] Cluster network 🌐

Even after all of the above, the nodes are still in the NotReady state; that's because we don't have a working network plugin yet.

Here we'll install flannel as an example:

kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

If you use WireGuard, or otherwise need to pin the network interface manually, download kube-flannel.yml first, add - --iface=wg0 after the - --kube-subnet-mgr line, save, and then run kubectl apply -f <filename>.
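
The relevant part of the kube-flannel DaemonSet's container spec then looks roughly like this; only the --iface line is added, the other args ship with the manifest:

args:
- --ip-masq
- --kube-subnet-mgr
- --iface=wg0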

Give it a little while and check the node status again; the nodes should all be Ready now~


And with that, our K8s cluster is up and running! 🎉🎉🎉

7. [Optional] Kubernetes Dashboard 🖥

Dashboard is a web-based Kubernetes UI that makes day-to-day cluster management more convenient, and it is also very easy to deploy!

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

To access the Dashboard:

# SSH to the node, forwarding port 8001
ssh -L localhost:8001:localhost:8001 <destination>
# run the kubectl proxy on the node
kubectl proxy

We can then browse to http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/ to access it.


Choose the Token sign-in option and generate a token with the commands below:

kubectl create sa dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
kubectl -n kube-system create token dashboard-admin
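
The token produced by kubectl create token is short-lived (one hour by default); if you log in often, you can request a longer-lived one:

kubectl -n kube-system create token dashboard-admin --duration=24h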

Successfully logged in to the Dashboard!


8. Miscellaneous 👀

By default, control-plane nodes carry a NoSchedule taint. If you'd like such a node to also schedule and run workloads, remove the taint with this command:

kubectl taint nodes --all node-role.kubernetes.io/control-plane-
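
To confirm the taint is gone, check the Taints field on the nodes:

kubectl describe nodes | grep Taints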

Finally, a word on overhead: on one of my 1-core / 2 GB machines, CPU usage rose by roughly 7% and memory usage by about 25% after deployment, which I find acceptable. Besides, if you're running K8s, your machines are surely better specced than mine, right?


References 📚

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/

https://docker-practice.github.io/zh-cn/kubernetes/setup/kubeadm.html

https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/

https://www.wireguard.com/quickstart/

https://www.digitalocean.com/community/tutorials/how-to-set-up-wireguard-on-ubuntu-20-04