Apache-spark – Spark resources not fully allocated on Amazon EMR

apache-spark, emr, hadoop-yarn

I'm trying to maximize cluster usage for a simple task.

The cluster is 1+2 x m3.xlarge, running Spark 1.3.1, Hadoop 2.4, Amazon AMI 3.7.

The task reads all lines of a text file and parses them as CSV.

When I spark-submit a task in yarn-cluster mode, I get one of the following results:

0 executors: the job waits indefinitely until I manually kill it
1 executor: the job underutilizes resources, with only 1 machine working
OOM when I do not assign enough memory to the driver

What I would have expected:

The Spark driver runs on the cluster master with all memory available, plus 2 executors with 9404 MB each (as defined by the install-spark script).

Sometimes, when I get a "successful" execution with 1 executor, cloning and restarting the step ends up with 0 executors.

I created my cluster using this command:

aws emr --region us-east-1 create-cluster --name "Spark Test" \
--ec2-attributes KeyName=mykey \
--ami-version 3.7.0 \
--use-default-roles \
--instance-type m3.xlarge \
--instance-count 3 \
--log-uri s3://mybucket/logs/ \
--bootstrap-actions Path=s3://support.elasticmapreduce/spark/install-spark,Args=["-x"] \
--steps Name=Sample,Jar=s3://elasticmapreduce/libs/script-runner/script-runner.jar,Args=[/home/hadoop/spark/bin/spark-submit,--master,yarn,--deploy-mode,cluster,--class,my.sample.spark.Sample,s3://mybucket/test/sample_2.10-1.0.0-SNAPSHOT-shaded.jar,s3://mybucket/data/],ActionOnFailure=CONTINUE

With some step variations including:

--driver-memory 8G --driver-cores 4 --num-executors 2

The install-spark script with -x produces the following spark-defaults.conf:

$ cat spark-defaults.conf
spark.eventLog.enabled  false
spark.executor.extraJavaOptions         -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
spark.driver.extraJavaOptions         -Dspark.driver.log.level=INFO
spark.executor.instances        2
spark.executor.cores    4
spark.executor.memory   9404M
spark.default.parallelism       8

Update 1

I get the same behavior with a generic JavaWordCount example:

/home/hadoop/spark/bin/spark-submit --verbose --master yarn --deploy-mode cluster --driver-memory 8G --class org.apache.spark.examples.JavaWordCount /home/hadoop/spark/lib/spark-examples-1.3.1-hadoop2.4.0.jar s3://mybucket/data/

However, if I remove '--driver-memory 8G', the task gets assigned 2 executors and finishes correctly.

So, why does setting driver-memory prevent my task from getting executors?

Should the driver be executed on the cluster's master node, alongside the YARN master container, as explained here?

How do I give more memory to my Spark job driver (where collect and some other useful operations happen)?

Best Answer

The way to maximize cluster usage is to drop the '-x' parameter when installing Spark on EMR and to set executor memory and cores by hand.
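A rough way to see why the -x defaults can starve the job once --driver-memory 8G is added is to estimate the YARN container sizes involved. The sketch below is an illustration, not the scheduler's actual logic. It assumes the Spark 1.3 overhead rule (7% of the requested memory, with a 384 MB minimum), assumes YARN rounds each request up to a multiple of 256 MB, and uses the 11520 MB NodeManager capacity quoted further down in this answer for an m3.xlarge. The helper name container_mb is made up for this example.

```python
import math

NODE_CAPACITY_MB = 11520   # yarn.nodemanager.resource.memory-mb on m3.xlarge (from this answer)
INCREMENT_MB = 256         # yarn.scheduler.minimum-allocation-mb (from this answer)
OVERHEAD_FACTOR = 0.07     # assumption: Spark 1.3 default overhead factor
OVERHEAD_MIN_MB = 384      # assumption: Spark 1.3 minimum overhead


def container_mb(requested_mb):
    """Estimate the YARN container size for a Spark memory request (hypothetical helper)."""
    overhead = max(OVERHEAD_MIN_MB, int(requested_mb * OVERHEAD_FACTOR))
    total = requested_mb + overhead
    # Assumption: YARN rounds the request up to the next allocation increment.
    return math.ceil(total / INCREMENT_MB) * INCREMENT_MB


executor = container_mb(9404)      # spark.executor.memory written by install-spark -x
driver = container_mb(8 * 1024)    # --driver-memory 8G

print("executor container:", executor, "MB")
print("driver container:", driver, "MB")
print("driver and executor fit on one node:", driver + executor <= NODE_CAPACITY_MB)
```

Under these assumptions, the node that hosts the 8G driver has no room left for a 9404M executor, which is consistent with only one executor starting.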

This post gives a pretty good explanation of how resource allocation is done when running Spark on YARN.

One important thing to remember is that all executors must have the same resources allocated! At the time of writing, Spark does not support heterogeneous executors. (Some work is currently being done to support GPUs, but that's another topic.)

So, in order to give the driver as much memory as possible while also maximizing executor memory, I should split my nodes like this (this slideshare gives good screenshots on page 25):

Node 0 - Master (Yarn resource manager)
Node 1 - NodeManager(Container(Driver) + Container(Executor))
Node 2 - NodeManager(Container(Executor) + Container(Executor))

NOTE: Another option would be to spark-submit with --master yarn --deploy-mode client from the master node 0. Are there any counterexamples showing this is a bad idea?

In my example, I can have at most 3 executors of 2 vcores with 4736 MB each, plus a driver with the same specs.

The 4736 MB figure is derived from the value of yarn.nodemanager.resource.memory-mb defined in /home/hadoop/conf/yarn-site.xml. On an m3.xlarge, it is set to 11520 MB (see here for the values associated with each instance type).

Then, we get:

(11520 - 1024) / 2 (executors per node) = 5248 => 5120 (rounded down to a 256 MB increment, as defined in yarn.scheduler.minimum-allocation-mb)

7% * 5120 ≈ 358, raised to 384 (the memory overhead, which has a 384 MB minimum; the 7% factor becomes 10% in Spark 1.4)

5120 - 384 = 4736
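The arithmetic above can be reproduced with a short Python sketch. The input values come from this answer. The function name, the modelling of the overhead as "7% with a 384 MB floor", and the resulting spark-submit flag list are illustrative assumptions rather than anything generated by EMR or Spark.

```python
def executor_memory_mb(node_mb=11520, reserved_mb=1024, containers_per_node=2,
                       increment_mb=256, overhead_factor=0.07, overhead_min_mb=384):
    """Follow the steps of the derivation above (hypothetical helper)."""
    share = (node_mb - reserved_mb) // containers_per_node
    # Round the per-container share down to the YARN allocation increment.
    container = (share // increment_mb) * increment_mb
    # Assumption: overhead is 7% of the container with a 384 MB minimum.
    overhead = max(overhead_min_mb, int(container * overhead_factor))
    return container, overhead, container - overhead


container, overhead, heap = executor_memory_mb()

worker_nodes = 2
slots = worker_nodes * 2       # two containers per NodeManager
num_executors = slots - 1      # one slot is taken by the driver

flags = [
    "--driver-memory", f"{heap}M", "--driver-cores", "2",
    "--executor-memory", f"{heap}M", "--executor-cores", "2",
    "--num-executors", str(num_executors),
]
print(container, overhead, heap)
print(" ".join(flags))
```

The printed flags are one way to pass the hand-tuned values to spark-submit in place of the defaults written by install-spark -x.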

Related Solutions

Apache-spark – Apache Spark: The number of cores vs. the number of executors

To hopefully make all of this a little more concrete, here’s a worked example of configuring a Spark app to use as much of the cluster as possible: Imagine a cluster with six nodes running NodeManagers, each equipped with 16 cores and 64GB of memory. The NodeManager capacities, yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores, should probably be set to 63 * 1024 = 64512 (megabytes) and 15 respectively. We avoid allocating 100% of the resources to YARN containers because the node needs some resources to run the OS and Hadoop daemons. In this case, we leave a gigabyte and a core for these system processes. Cloudera Manager helps by accounting for these and configuring these YARN properties automatically.

The likely first impulse would be to use --num-executors 6 --executor-cores 15 --executor-memory 63G. However, this is the wrong approach because:

63GB + the executor memory overhead won’t fit within the 63GB capacity of the NodeManagers.
The application master will take up a core on one of the nodes, meaning that there won’t be room for a 15-core executor on that node.
15 cores per executor can lead to bad HDFS I/O throughput.

A better option would be to use --num-executors 17 --executor-cores 5 --executor-memory 19G. Why?

This config results in three executors on all nodes except for the one with the AM, which will have two executors. --executor-memory was derived as (63/3 executors per node) = 21. 21 * 0.07 = 1.47. 21 – 1.47 ~ 19.
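The same numbers can be checked with a few lines of Python. This is only a sketch of the arithmetic quoted above: the variable names are made up for the example, and truncating the result to whole gigabytes is an assumption about how "~ 19" was reached.

```python
nodes = 6
node_memory_gb = 63        # NodeManager memory capacity, in GB
node_vcores = 15           # NodeManager vcore capacity
cores_per_executor = 5

executors_per_node = node_vcores // cores_per_executor
# One executor slot is given up on the node that hosts the application master.
num_executors = nodes * executors_per_node - 1

share_gb = node_memory_gb / executors_per_node
overhead_gb = share_gb * 0.07
# Assumption: truncate to whole gigabytes.
executor_memory_gb = int(share_gb - overhead_gb)

print(f"--num-executors {num_executors} "
      f"--executor-cores {cores_per_executor} "
      f"--executor-memory {executor_memory_gb}G")
```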

The explanation was given in an article in Cloudera's blog, How-to: Tune Your Apache Spark Jobs (Part 2).

Hibernate – Can’t get Hibernate Validator working with Spring MessageSource

This had me stumped for a while, but the problem is that you need to register with Spring the Validator used to validate @Controller methods (thanks to this answer for that insight!)

So if you are using XML config do something along these lines:

<bean id="validator" class="org.springframework.validation.beanvalidation.LocalValidatorFactoryBean">
    <property name="validationMessageSource" ref="messageSource"/>
</bean>

<mvc:annotation-driven validator="validator"/>

And if you are using javaconfig, do something like this:

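import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.MessageSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.validation.Validator;
import org.springframework.validation.beanvalidation.LocalValidatorFactoryBean;
import org.springframework.web.servlet.config.annotation.EnableWebMvc;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurerAdapter;

@EnableWebMvc
@Configuration
public class MyWebAppContext extends WebMvcConfigurerAdapter {

    // Assumes a MessageSource bean is defined elsewhere in the context
    @Autowired
    private MessageSource messageSource;

    @Bean
    public LocalValidatorFactoryBean validator() {
        LocalValidatorFactoryBean validatorFactoryBean = new LocalValidatorFactoryBean();
        validatorFactoryBean.setValidationMessageSource(messageSource);
        return validatorFactoryBean;
    }

    // Register the validator that Spring MVC uses for @Controller methods
    @Override
    public Validator getValidator() {
        return validator();
    }
}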

(see Spring Web MVC framework documentation)
