300 Questions for OREILLY DataBricks Apache Spark Developer Certification + 5 Page Revision notes

Question : 7 You executed the following Python code in the Spark shell (PySpark):

>>> lines = sc.textFile("hadoopexam.txt")
>>> lines.count()
127
>>> lines.first()
u'# Apache Spark'

Which of the following is the driver program?

1.     SparkContext 

2.     Spark Shell itself 

3.     lines 

4.     None of the above 

Correct Answer : 2 Exp : At a high level, every Spark application consists of a driver program that launches various parallel operations on a cluster. The driver program contains your application's main function and defines distributed datasets on the cluster, then applies operations to them.

In the example above, the driver program is the Spark shell itself: you simply type in the operations you want to run.

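Driver programs access Spark through a SparkContext object, which represents a connection to a computing cluster. In the shell, a SparkContext is automatically created for you as the variable called sc. So sc (option 1) is only the driver's connection to the cluster, and lines (option 3) is a distributed dataset that the driver defined; neither is the driver itself.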
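Outside the shell, the same roles can be seen in a standalone script. The following is a minimal sketch, not part of the original question: the function names first_look and run_driver, the application name "DriverSketch", and the local[*] master are illustrative assumptions. Here the script itself plays the part of the driver program, and it has to create the SparkContext that the shell would otherwise provide as sc.

```python
def first_look(sc, path):
    """Repeat the two actions from the question using an existing SparkContext."""
    lines = sc.textFile(path)  # defines a distributed dataset (RDD) of text lines
    return lines.count(), lines.first()  # actions: results come back to the driver


def run_driver(path):
    """Hypothetical entry point: this function acts as the driver program."""
    # Imported here so that PySpark is only required when the driver actually runs.
    from pyspark import SparkConf, SparkContext

    # Assumed settings: run locally on all cores under an illustrative app name.
    conf = SparkConf().setMaster("local[*]").setAppName("DriverSketch")
    sc = SparkContext(conf=conf)  # the shell does this step for you, as `sc`
    try:
        count, first = first_look(sc, path)
        print(count, first)
    finally:
        sc.stop()  # release the connection to the cluster
```

Calling run_driver("hadoopexam.txt") from a script requires a working PySpark installation and the input file. In the Spark shell only first_look(sc, "hadoopexam.txt") would be needed, because sc already exists there.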

