[spark] Running Spark 2.4 on Kubernetes

Prerequisites:

Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.3", GitCommit:"ca643a4d1f7bfe34773c74f79527be4afd95bf39", GitTreeState:"clean", BuildDate:"2021-07-15T21:04:39Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:50Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
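For reference, the version information above is the output of the standard command:

kubectl version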

Testing shows that if the server is on a recent version (1.19 or above), compatibility problems appear; those versions are only supported starting with Spark 3.0.

Building the image:

Mainly based on this article: https://www.jianshu.com/p/da20133ecfea

Authentication:

Spark gained Kubernetes support after 2.2, but it only became official in 3.0, so in 2.4 many features are effectively cut down.

The quickest solution is given further down.

In my view, the thorniest problem when running Spark on k8s is authentication. The Spark 2.4 official documentation covers it poorly: http://spark.apache.org/docs/2.4.1/running-on-kubernetes.html

For authentication it offers many parameters, such as spark.kubernetes.authenticate.submission.caCertFile and spark.kubernetes.authenticate.submission.clientKeyFile, and wiring up these certificates is an involved process.
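As a sketch, passing them to spark-submit looks roughly like this (the /root/*.pem paths are placeholders matching the files used later in this post; substitute your own cluster's certs):

# Hypothetical cert paths; take these files from your cluster's PKI
--conf spark.kubernetes.authenticate.submission.caCertFile=/root/ca.pem \
--conf spark.kubernetes.authenticate.submission.clientKeyFile=/root/admin-key.pem \
--conf spark.kubernetes.authenticate.submission.clientCertFile=/root/admin.pem \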

  • Exception:
javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

This happens because the self-signed certificate is not trusted; ca.pem needs to be imported into the keystore (see the sketch below). Reference: https://blog.csdn.net/ljskr/article/details/84570573
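A minimal sketch of the import, assuming a Java 8 layout and the default truststore password (changeit):

# Import the cluster's self-signed CA into the JVM truststore used by spark-submit
keytool -importcert -alias k8s-ca -file /root/ca.pem \
  -keystore $JAVA_HOME/jre/lib/security/cacerts -storepass changeit -noprompt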

  • Exception:
2020/10/15 11:09:49.147 WARN WatchConnectionManager : Exec Failure: HTTP 403, Status: 403 - pods "spark-pi-1602731387162-driver" is forbidden: User "system:anonymous" cannot watch resource "pods" in API group "" in the namespace "default"

This means the client itself is not trusted. One fix is to add these parameters to spark-submit:

--conf spark.kubernetes.authenticate.submission.clientKeyFile=/root/admin-key.pem \
--conf spark.kubernetes.authenticate.submission.clientCertFile=/root/admin.pem

Or, alternatively, run:

kubectl create clusterrolebinding test:anonymous --clusterrole=cluster-admin --user=system:anonymous

Get any one of these certificates wrong and the whole thing breaks.

So the best configuration approach is:

Copy the .kube directory into the $HOME directory, as sketched below.
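A minimal sketch, assuming the admin kubeconfig lives under /root/.kube on the master node (knode1 here, matching the master URL used later):

# Copy the cluster admin kubeconfig to the submitting machine
scp -r root@knode1:/root/.kube $HOME/
# Sanity check: the client should now reach the API server
kubectl get pods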

Why this works:

Spark uses the io.fabric8 kubernetes-client library. Although Spark exposes a pile of parameters, by default the library still falls back to looking for ~/.kube/config.

The relevant code paths:

https://github.com/apache/spark/blob/branch-2.4/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala#L66

https://github.com/fabric8io/kubernetes-client/blob/74cc63df9b6333d083ee24a6ff3455eaad0a6da8/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/Config.java#L538

Note that this authentication only takes effect at spark-submit time. Once the command has been submitted and the driver pod is created, the pem files no longer serve any purpose.

Driver-phase authentication:

The rough flow of Spark on k8s is that the driver pod creates and destroys the executor pods, so the driver pod needs fairly broad permissions. That means configuring RBAC; see http://spark.apache.org/docs/2.4.1/running-on-kubernetes.html#rbac (commands sketched below).
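Concretely, the linked RBAC section comes down to creating a service account and granting it the edit role (default namespace assumed):

kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit \
  --serviceaccount=default:spark --namespace=default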

Once RBAC is configured, the driver needs to be told which identity to run under, so one more parameter is required:

--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark

Finally

Version compatibility between the k8s client and server is poor.

A k8s 1.15 server needs roughly version 4.6 of the client library (fabric8 kubernetes-client), and Spark only bumped that dependency in 2.4.7.

参考: https://www.waitingforcode.com/apache-spark/setting-up-apache-spark-kubernetes-microk8s/read#unknownhostexception_kubernetes_default_svc_try_again

So pay particular attention to client/server version mismatches when testing.

The final submit command:

/Users/zhouwenyang/Desktop/tmp/spark/bin/spark-submit \
--master k8s://https://knode1:6443 --deploy-mode cluster \
--name spark-pi --class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=vm2173:5000/spark:2.4.7 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.submission.waitAppCompletion=false \
local:///opt/spark/examples/jars/spark-examples_2.11-2.4.7.jar
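With spark.kubernetes.submission.waitAppCompletion=false the command returns immediately, so the job can be followed via kubectl; driver pods follow the spark-pi-<timestamp>-driver naming pattern seen in the 403 error above:

# Watch driver and executor pods come and go
kubectl get pods -w
# Tail the driver log (substitute your actual driver pod name)
kubectl logs -f spark-pi-1602731387162-driver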
Original post: https://www.cnblogs.com/zhouwenyang/p/15070754.html