Installing and configuring k8s on Ubuntu 16.04

Software used

1. kubeadm

Installation steps

apt-get update

  

1. Disable all swap partitions

swapoff -a

To keep swap disabled after a reboot, also comment out the swap entry in /etc/fstab.
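
A minimal sketch of commenting out the swap entries in /etc/fstab so that swap stays off across reboots. The function name disable_swap_in_fstab is made up for illustration, and the pattern assumes a conventional fstab layout in which swap entries contain a whitespace-delimited "swap" field. Run it as root, and check the result before rebooting.

```shell
# Comment out every active swap entry in an fstab-style file.
# Usage: disable_swap_in_fstab [FILE]   (FILE defaults to /etc/fstab)
# NOTE: the function name is hypothetical, not part of any tool.
disable_swap_in_fstab() {
    fstab_file="${1:-/etc/fstab}"
    # Match lines that are not already comments and that contain a
    # whitespace-delimited "swap" field, then prefix them with "#".
    # A backup copy is kept next to the file with a .bak suffix.
    sed -i.bak -E \
        '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' \
        "$fstab_file"
}
```

Calling disable_swap_in_fstab with no argument edits /etc/fstab itself; passing a path to a copy first is a safer way to try it.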

Use the free command to check that swap is disabled:

root@gpu-10-0-1-24:~# free
              total        used        free      shared  buff/cache   available
Mem:      528016312     6131652   343432968     6595072   178451692   512492696
Swap:             0           0           0

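2. Stop the firewall (the commands below apply only if firewalld is installed; Ubuntu ships ufw by default)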

systemctl stop firewalld
systemctl disable firewalld

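3. Disable SELinux (setenforce is available only if the SELinux tools are installed; Ubuntu uses AppArmor by default)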

setenforce 0

Install the flannel network plugin

  

kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.1.18 --kubernetes-version=v1.11.1 --ignore-preflight-errors=all # the --skip-preflight-checks option is deprecated

  

Errors

[preflight] Activating the kubelet service
failure loading ca certificate: couldn't load the private key file /etc/kubernetes/pki/ca.key: open /etc/kubernetes/pki/ca.key: no such file or directory

Fix: copy the custom PKI key into the corresponding directory (/etc/kubernetes/pki/).

sudo: unable to resolve host gpu-10-0-1-18

Fix: add a mapping for the hostname to /etc/hosts.
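
A minimal sketch of adding such a hostname mapping without creating duplicates. The function name add_host_mapping is made up for illustration, and the address 10.0.1.18 in the usage line is an assumption inferred from the hostname and the --apiserver-advertise-address above; substitute the host's real address.

```shell
# Append "IP HOSTNAME" to a hosts file unless the hostname is already listed.
# Usage: add_host_mapping IP HOSTNAME [FILE]   (FILE defaults to /etc/hosts)
# NOTE: the function name is hypothetical, not part of any tool.
add_host_mapping() {
    map_ip="$1"
    map_name="$2"
    hosts_file="${3:-/etc/hosts}"
    # Look for the hostname as a whole whitespace-delimited field.
    if ! grep -qE "[[:space:]]${map_name}([[:space:]]|\$)" "$hosts_file"; then
        printf '%s %s\n' "$map_ip" "$map_name" >> "$hosts_file"
    fi
}
```

For the error above, the call would look like add_host_mapping 10.0.1.18 gpu-10-0-1-18, run as root.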

getenforce

Adding a worker node

kubeadm join 10.0.0.39:6443 --token 4g0p8w.w5p29ukwvitim2ti \
    --discovery-token-ca-cert-hash sha256:21d0adbfcb409dca97e655641573b2ee51c77a212f194e20a307cb459e5f77c8
kubeadm token list
kubeadm token create --print-join-command
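
If only the token is at hand, the value passed to --discovery-token-ca-cert-hash can be recomputed on the master from the cluster CA certificate. The sketch below wraps the openssl pipeline commonly used for this; the function name ca_cert_hash is made up for illustration, and it assumes the CA uses an RSA key and sits at /etc/kubernetes/pki/ca.crt unless another path is given.

```shell
# Print the sha256 hash of the public key in a CA certificate, i.e. the part
# that follows "sha256:" in --discovery-token-ca-cert-hash.
# Usage: ca_cert_hash [CERT]   (CERT defaults to /etc/kubernetes/pki/ca.crt)
# NOTE: the function name is hypothetical; an RSA CA key is assumed.
ca_cert_hash() {
    ca_cert="${1:-/etc/kubernetes/pki/ca.crt}"
    # Extract the public key, convert it to DER, hash it, and keep only
    # the hex digest (the text after the last space of openssl's output).
    openssl x509 -pubkey -in "$ca_cert" |
        openssl rsa -pubin -outform der 2>/dev/null |
        openssl dgst -sha256 -hex |
        sed 's/^.* //'
}
```

The output is a 64-character hex string that can be compared with the hash printed by kubeadm token create --print-join-command.
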
apt-get update && apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

  

For a newly added node, kubectl get nodes shows ROLES as <none>.

kubectl get pods -n kube-system | grep flannel

  

kubectl get pods -n kube-system -o wide | grep gpu-10-0-1-24

Reference links

https://tomoyadeng.github.io/blog/2018/10/12/k8s-in-ubuntu18.04/index.html

kubeadm token list returns an empty list:

https://www.serverlab.ca/tutorials/containers/kubernetes/how-to-add-workers-to-kubernetes-clusters/

https://stackoverflow.com/questions/51380934/unable-to-connect-worker-node-to-kubernetes-cluster 

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

join reports success, but the node does not show up in kubectl get nodes:

https://github.com/kubernetes/kubernetes/issues/61224

The connection to the server localhost:8080 was refused - did you specify the right host or port?

https://www.jianshu.com/p/6fa06b9bbf6a
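
A common cause of this message is that kubectl cannot find a kubeconfig and falls back to localhost:8080. A minimal sketch of the usual fix on a kubeadm master, assuming the admin kubeconfig is at /etc/kubernetes/admin.conf; the function name setup_kubeconfig is made up for illustration.

```shell
# Copy the cluster admin kubeconfig to where kubectl looks by default.
# Usage: setup_kubeconfig [SRC] [DEST_DIR]
#   SRC defaults to /etc/kubernetes/admin.conf, DEST_DIR to $HOME/.kube
# NOTE: the function name is hypothetical, not part of any tool.
setup_kubeconfig() {
    kube_src="${1:-/etc/kubernetes/admin.conf}"
    kube_dir="${2:-$HOME/.kube}"
    # Create the directory, copy the file as "config", and restrict its
    # permissions because it contains admin credentials.
    mkdir -p "$kube_dir" &&
        cp "$kube_src" "$kube_dir/config" &&
        chmod 600 "$kube_dir/config"
}
```

When running as a non-root user, the copy needs sudo and the file then has to be chown-ed to that user.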

Attempting to reclaim ephemeral-storage

ImagePullBackOff

kubectl -n kube-system logs kube-flannel-ds-jpp96 -c install-cni

A node being Ready does not mean that the flannel network plugin is working.
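
A minimal sketch of checking the flannel pods directly instead of relying on node status. The function name flannel_not_running is made up for illustration; it assumes the default column layout of kubectl get pods for a single namespace (NAME, READY, STATUS, ...), with the header row suppressed.

```shell
# Read `kubectl get pods -n kube-system --no-headers` output on stdin and
# print the name and status of every flannel pod that is not Running.
# NOTE: the function name is hypothetical, not part of any tool.
flannel_not_running() {
    # Column 1 is the pod name, column 3 is the STATUS field.
    awk '$1 ~ /flannel/ && $3 != "Running" { print $1, $3 }'
}
```

Usage: kubectl get pods -n kube-system --no-headers | flannel_not_running. Empty output means every flannel pod reports Running.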

flannel itself also runs from a container image.

A k8s cluster can have multiple master nodes.

Add a role label to a node

kubectl label node k8s-node1 node-role.kubernetes.io/worker=worker

systemctl restart kubelet triggers pulling images over the network.

root@cpu-10-0-3-9:~# ks init xps-kubeflow
INFO Using context "kubernetes-admin@kubernetes" from kubeconfig file "/root/.kube/config"
INFO Creating environment "default" with namespace "default", pointing to "version:v1.8.0" cluster at address "https://10.0.3.9:6443"
INFO Generating ksonnet-lib data at path '/root/xps-kubeflow/lib/ksonnet-lib/v1.8.0'
root@cpu-10-0-3-9:~/xps-k8s# kubectl create -f xps_crd.yaml
customresourcedefinition.apiextensions.k8s.io/xps.tencent.com created

  

kubectl get crd

  

Pod sandbox changed, it will be killed and re-created.

  

docker run --security-opt=no-new-privileges --cap-drop=ALL --network=none -it -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins nvidia/k8s-device-plugin:1.11

  

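An emptyDir volume is shared only within a single pod, so it is enough to make sure each pod has one container.

By default, k8s does not schedule pods onto the master node. Removing the master taint allows it: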

kubectl taint nodes --all node-role.kubernetes.io/master-

  

List all mxjobs

kubectl get mxjobs.kubeflow.org

  

Assigning pods to nodes:

https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity 

Original post: https://www.cnblogs.com/yangwenhuan/p/11484859.html