Spark-Standalone

  1. Security: disabled by default
  2. Starting a cluster manually:
    • Run ./sbin/start-master.sh; once started, the master prints a spark://HOST:PORT URL that workers use to connect to it
    • Monitoring: http://localhost:8080/ by default
    • Start a worker and connect it to the master with ./sbin/start-slave.sh <master-spark-URL> (a combined example follows the argument table below)
    • Launch-script arguments (argument: meaning):
      -h HOST, --host HOST: Hostname to listen on
      -i HOST, --ip HOST: Hostname to listen on (deprecated, use -h or --host)
      -p PORT, --port PORT: Port for service to listen on (default: 7077 for master, random for worker)
      --webui-port PORT: Port for web UI (default: 8080 for master, 8081 for worker)
      -c CORES, --cores CORES: Total CPU cores to allow Spark applications to use on the machine (default: all available); only on worker
      -m MEM, --memory MEM: Total amount of memory to allow Spark applications to use on the machine, in a format like 1000M or 2G (default: your machine's total RAM minus 1 GB); only on worker
      -d DIR, --work-dir DIR: Directory to use for scratch space and job output logs (default: SPARK_HOME/work); only on worker
      --properties-file FILE: Path to a custom Spark properties file to load (default: conf/spark-defaults.conf)
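    • Putting the scripts and flags together, a minimal sketch of starting a one-master, one-worker cluster by hand; the hostname master-host and the resource caps are placeholder assumptions:
      # on the master machine: start the master bound to an explicit hostname
      ./sbin/start-master.sh --host master-host --port 7077 --webui-port 8080

      # on each worker machine: connect to the URL printed by the master,
      # capping the cores and memory Spark may use on this box
      ./sbin/start-slave.sh spark://master-host:7077 --cores 4 --memory 8G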
    • Set up the conf/slaves file (one worker hostname per line)
    • Set up conf/spark-env.sh (a sample file follows the variable list below)
      • Environment variables (variable: meaning):
        SPARK_MASTER_HOST: Bind the master to a specific hostname or IP address, for example a public one.
        SPARK_MASTER_PORT: Start the master on a different port (default: 7077).
        SPARK_MASTER_WEBUI_PORT: Port for the master web UI (default: 8080).
        SPARK_MASTER_OPTS: Configuration properties that apply only to the master, in the form "-Dx=y" (default: none). See below for a list of possible options.
        SPARK_LOCAL_DIRS: Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system. It can also be a comma-separated list of multiple directories on different disks.
        SPARK_WORKER_CORES: Total number of cores to allow Spark applications to use on the machine (default: all available cores).
        SPARK_WORKER_MEMORY: Total amount of memory to allow Spark applications to use on the machine, e.g. 1000m, 2g (default: total memory minus 1 GB); note that each application's individual memory is configured using its spark.executor.memory property.
        SPARK_WORKER_PORT: Start the Spark worker on a specific port (default: random).
        SPARK_WORKER_WEBUI_PORT: Port for the worker web UI (default: 8081).
        SPARK_WORKER_DIR: Directory to run applications in, which will include both logs and scratch space (default: SPARK_HOME/work).
        SPARK_WORKER_OPTS: Configuration properties that apply only to the worker, in the form "-Dx=y" (default: none). See below for a list of possible options.
        SPARK_DAEMON_MEMORY: Memory to allocate to the Spark master and worker daemons themselves (default: 1g).
        SPARK_DAEMON_JAVA_OPTS: JVM options for the Spark master and worker daemons themselves, in the form "-Dx=y" (default: none).
        SPARK_DAEMON_CLASSPATH: Classpath for the Spark master and worker daemons themselves (default: none).
        SPARK_PUBLIC_DNS: The public DNS name of the Spark master and workers (default: none).
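      • A minimal conf/spark-env.sh sketch using a few of the variables above; the hostname, directory, and resource values are illustrative assumptions, not recommendations:
        # conf/spark-env.sh -- sourced by the launch scripts on every node
        export SPARK_MASTER_HOST=master-host      # bind the master to this hostname
        export SPARK_MASTER_PORT=7077             # master RPC port (default)
        export SPARK_MASTER_WEBUI_PORT=8080       # master web UI port (default)
        export SPARK_WORKER_CORES=4               # cores Spark may use on each worker
        export SPARK_WORKER_MEMORY=8g             # memory Spark may use on each worker
        export SPARK_WORKER_DIR=/data/spark/work  # scratch space and application logs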
    • SPARK_MASTER_OPTS
      • Supported properties (property name, default, meaning):
        spark.deploy.retainedApplications (default: 200): The maximum number of completed applications to display. Older applications will be dropped from the UI to maintain this limit.
        spark.deploy.retainedDrivers (default: 200): The maximum number of completed drivers to display. Older drivers will be dropped from the UI to maintain this limit.
        spark.deploy.spreadOut (default: true): Whether the standalone cluster manager should spread applications out across nodes or try to consolidate them onto as few nodes as possible. Spreading out is usually better for data locality in HDFS, but consolidating is more efficient for compute-intensive workloads.
        spark.deploy.defaultCores (default: infinite): Default number of cores to give to applications in Spark's standalone mode if they don't set spark.cores.max. If not set, applications always get all available cores unless they configure spark.cores.max themselves. Set this lower on a shared cluster to prevent users from grabbing the whole cluster by default.
        spark.deploy.maxExecutorRetries (default: 10): Limit on the maximum number of back-to-back executor failures that can occur before the standalone cluster manager removes a faulty application. An application will never be removed if it has any running executors. If an application experiences more than spark.deploy.maxExecutorRetries failures in a row, no executors successfully start running in between those failures, and the application has no running executors, then the standalone cluster manager will remove the application and mark it as failed. To disable this automatic removal, set spark.deploy.maxExecutorRetries to -1.
        spark.worker.timeout (default: 60): Number of seconds after which the standalone deploy master considers a worker lost if it receives no heartbeats.
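      • For example, a sketch applying a couple of these master-only properties via SPARK_MASTER_OPTS in conf/spark-env.sh; the values shown are assumptions for illustration:
        # master-only tuning, passed to the master JVM as system properties
        export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=4 \
          -Dspark.deploy.retainedApplications=100"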
    • SPARK_WORKER_OPTS
      • Supported properties (property name, default, meaning):
        spark.worker.cleanup.enabled (default: false): Enable periodic cleanup of worker/application directories. Note that this only affects standalone mode, as YARN works differently. Only the directories of stopped applications are cleaned up.
        spark.worker.cleanup.interval (default: 1800, i.e. 30 minutes): Controls the interval, in seconds, at which the worker cleans up old application work dirs on the local machine.
        spark.worker.cleanup.appDataTtl (default: 604800, i.e. 7 days): The number of seconds to retain application work directories on each worker. This is a time to live and should depend on the amount of available disk space you have. Application logs and jars are downloaded to each application work dir. Over time, the work dirs can quickly fill up disk space, especially if you run jobs very frequently.
        spark.storage.cleanupFilesAfterExecutorExit (default: true): Enable cleanup of non-shuffle files (such as temporary shuffle blocks, cached RDD/broadcast blocks, spill files, etc.) in worker directories after executors exit. Note that this doesn't overlap with spark.worker.cleanup.enabled, as this enables cleanup of non-shuffle files in the local directories of a dead executor, while spark.worker.cleanup.enabled enables cleanup of all files/subdirectories of a stopped and timed-out application. This only affects standalone mode; support for other cluster managers can be added in the future.
        spark.worker.ui.compressedLogFileLengthCacheSize (default: 100): For compressed log files, the uncompressed file size can only be computed by uncompressing the files. Spark caches the uncompressed file size of compressed log files. This property controls the cache size.
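      • Similarly, a sketch enabling the periodic cleanup described above via SPARK_WORKER_OPTS in conf/spark-env.sh; the interval and TTL values simply repeat the defaults and are shown for illustration:
        # worker-only settings: clean up old application work dirs automatically
        export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
          -Dspark.worker.cleanup.interval=1800 \
          -Dspark.worker.cleanup.appDataTtl=604800"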
  3. Connecting to the cluster:
    ./bin/spark-shell --master spark://IP:PORT
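    Besides spark-shell, an application can be submitted against the same master URL with spark-submit; a minimal sketch in which the main class and jar path are placeholders:
    # submit an application to the standalone master (client deploy mode by default)
    ./bin/spark-submit \
      --master spark://IP:PORT \
      --class com.example.MyApp \
      path/to/my-app.jar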
  4. Restarting on failure: in standalone mode, pass --supervise to spark-submit so the master restarts the driver if it exits with a non-zero exit code; to kill an application that keeps failing, use the following command:
    ./bin/spark-class org.apache.spark.deploy.Client kill <master url> <driver ID>
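    A sketch of the whole flow; the hostname, class, jar path, and driver ID are placeholders:
    # submit in cluster deploy mode with driver supervision
    ./bin/spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode cluster \
      --supervise \
      --class com.example.MyApp \
      path/to/my-app.jar

    # if the driver keeps failing, kill it by the driver ID shown on the master web UI
    ./bin/spark-class org.apache.spark.deploy.Client kill spark://master-host:7077 driver-20190801123456-0001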
  5. Resource configuration
    1. Application (default cores per application):
      export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=<value>"
    2. Executor: spark.executor.cores
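    An application's resources can also be capped at submit time; a sketch with illustrative values (the class, jar path, and hostname are placeholders):
      # limit this application to 8 cores in total, 2 cores and 2 GB per executor
      ./bin/spark-submit \
        --master spark://master-host:7077 \
        --conf spark.cores.max=8 \
        --conf spark.executor.cores=2 \
        --executor-memory 2g \
        --class com.example.MyApp \
        path/to/my-app.jar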
  6. Monitoring: master web UI on port 8080 by default
  7. Logs: under SPARK_HOME/work by default, split into stdout and stderr files per executor
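    For example, on a worker machine the executor logs sit in one directory per application and executor; a sketch assuming the default work dir (the application ID is a placeholder):
    # stdout/stderr of executor 0 of one application
    ls $SPARK_HOME/work/app-20190801123456-0000/0/
    # -> stderr  stdout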
  8. Interacting with Hadoop: use an hdfs:// URL, e.g. hdfs://<namenode>:9000/path
  9. HA: run standby masters coordinated through ZooKeeper (a sketch follows)
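    A sketch of the usual ZooKeeper-based setup: every master points at the same ZooKeeper ensemble via SPARK_DAEMON_JAVA_OPTS, and applications list all masters in the master URL; the hostnames are placeholder assumptions:
    # on every master node, in conf/spark-env.sh
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
      -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
      -Dspark.deploy.zookeeper.dir=/spark"

    # start a master on each of the master nodes, then connect listing them all
    ./sbin/start-master.sh
    ./bin/spark-shell --master spark://master1:7077,master2:7077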
Original post: https://www.cnblogs.com/liudingchao/p/11269606.html