Spark on K8S

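Configure the spark user: a ServiceAccount, plus a Role and RoleBinding that grant it access to resources in the default namespace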

apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: spark-role
rules:
# All verbs on all resources in the core API group (pods, services, configmaps, ...)
- apiGroups: [""]
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-role-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: spark
  namespace: default
roleRef:
  kind: Role
  name: spark-role
  apiGroup: rbac.authorization.k8s.io
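
Once these resources are created, it can be worth confirming that the binding took effect. The following is a minimal sketch, assuming kubectl is installed, points at the cluster, and that your own user is allowed to impersonate ServiceAccounts. The helper names sa_subject and check_permission are made up for illustration; only kubectl auth can-i and its --as flag come from kubectl itself.

```shell
#!/usr/bin/env bash
# Names taken from the manifests above.
NAMESPACE="default"
SERVICE_ACCOUNT="spark"

# Build the user name that Kubernetes assigns to a ServiceAccount.
# (Hypothetical helper, not part of kubectl.)
sa_subject() {
  local namespace="$1" account="$2"
  printf 'system:serviceaccount:%s:%s\n' "$namespace" "$account"
}

# Ask the API server whether the ServiceAccount may run VERB on RESOURCE.
# kubectl prints "yes" or "no".
check_permission() {
  local verb="$1" resource="$2"
  kubectl auth can-i "$verb" "$resource" \
    --namespace "$NAMESPACE" \
    --as "$(sa_subject "$NAMESPACE" "$SERVICE_ACCOUNT")"
}

# Example, to be run manually against a live cluster:
#   check_permission create pods
#   check_permission delete pods
```

In client mode the driver creates and deletes executor pods, so pods is the resource most worth checking here.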

Configure the Spark container. Spark applications are submitted from inside this container in client mode, so the container also acts as the driver.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-client
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-client
      component: spark-client
  template:
    metadata:
      labels:
        app: spark-client
        component: spark-client
    spec:
      containers:
      - name: spark-client
        image: spark-py:2.4.6
        workingDir: /opt/spark
        # Keep the container running so that spark-submit can be run inside it later
        command: ["/bin/bash", "-c", "while true;do echo hello;sleep 6000;done"]
      serviceAccountName: spark
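
The steps for creating these resources and getting a shell inside the pod could look roughly like the sketch below. The manifest file names are hypothetical (use whatever names the YAML above was saved under), the run wrapper and start_client function are illustrative helpers, and the Deployment is assumed to live in the default namespace. By default the script only prints the kubectl commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Hypothetical file names: adjust to wherever the manifests were saved.
RBAC_MANIFEST="spark-rbac.yaml"
CLIENT_MANIFEST="spark-client.yaml"

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the resources, wait for the Deployment to become ready,
# then open an interactive shell in its pod.
start_client() {
  run kubectl apply -f "$RBAC_MANIFEST"
  run kubectl apply -f "$CLIENT_MANIFEST"
  run kubectl rollout status deployment/spark-client --namespace default
  run kubectl exec -it deployment/spark-client --namespace default -- /bin/bash
}

# Example:
#   start_client              # prints the four commands
#   DRY_RUN=0; start_client   # actually runs them
```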

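Configure a headless Service so that Spark executors can connect to the Spark driver. Any free port will do, provided the same value is passed as spark.driver.port when submitting.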

apiVersion: v1
kind: Service
metadata:
  namespace: default
  name: spark-client-service
spec:
  selector:
    app: spark-client
  ports:
    - protocol: TCP
      port: 7321
      targetPort: 7321
  clusterIP: None
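
One way to confirm that the Service really selects the driver pod is to compare the Service's endpoint addresses with the pod's IP. The sketch below assumes kubectl access from outside the pod and a single spark-client pod; the function names are hypothetical helpers, while the jsonpath expressions follow the standard Endpoints and Pod object layouts.

```shell
#!/usr/bin/env bash
# Return success when the space-separated endpoint IP list contains the pod IP.
# Padding with spaces avoids partial matches such as 10.1.0.5 vs 10.1.0.50.
endpoint_matches_pod() {
  local endpoints="$1" pod_ip="$2"
  case " $endpoints " in
    *" $pod_ip "*) return 0 ;;
    *) return 1 ;;
  esac
}

# IPs currently registered behind the Service.
service_endpoint_ips() {
  kubectl get endpoints spark-client-service --namespace default \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# IP of the first pod carrying the app=spark-client label.
client_pod_ip() {
  kubectl get pods --namespace default -l app=spark-client \
    -o jsonpath='{.items[0].status.podIP}'
}

# Example, to be run manually against a live cluster:
#   endpoint_matches_pod "$(service_endpoint_ips)" "$(client_pod_ip)" \
#     && echo "Service points at the driver pod"
```

Because the Service is headless, its DNS name resolves to the pod IP itself, which is why executors can reach the driver on whatever port it listens on.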

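Log in to the Spark container and submit the application in client mode, setting spark.driver.host to the Service name and spark.driver.port to the Service port so that executors can reach the driver.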

bin/spark-submit \
    --master k8s://https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT_HTTPS} \
    --deploy-mode client \
    --name spark-test \
    --conf spark.executor.instances=3 \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.container.image=spark-py:2.4.6 \
    --conf spark.driver.host=spark-client-service \
    --conf spark.driver.port=7321 \
    /opt/spark/examples/src/main/python/wordcount.py \
    /opt/spark/examples/src/main/python/wordcount.py
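
While the job runs, the executors should show up as pods in the default namespace. The sketch below is one way to watch them, assuming kubectl access from outside the pod and relying on the spark-role=executor label that Spark on Kubernetes attaches to executor pods. The helper names list_executors and count_running are made up for illustration.

```shell
#!/usr/bin/env bash
# Label that Spark on Kubernetes attaches to executor pods.
EXECUTOR_SELECTOR="spark-role=executor"

# Show the executor pods with their node and IP.
list_executors() {
  kubectl get pods --namespace default -l "$EXECUTOR_SELECTOR" -o wide
}

# Count Running pods from "kubectl get pods --no-headers" output on stdin.
# The default columns are NAME READY STATUS RESTARTS AGE, so STATUS is field 3.
count_running() {
  awk '$3 == "Running" { n++ } END { print n + 0 }'
}

# Example, to be run manually while the job is active:
#   kubectl get pods --namespace default -l "$EXECUTOR_SELECTOR" --no-headers | count_running
```

With spark.executor.instances=3 the count would be expected to reach 3 once all executors have started; the pods disappear again after the application finishes.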

Original article: https://www.cnblogs.com/moonlight-lin/p/14269919.html