
Q39. You have been given the following code written in Scala and Spark.


Below is the content of the IBM.csv file







Now you have written the following code in the interactive shell


val myRDD = sc.textFile("data.csv")

val splittedRDD =","))

val distinctRDD =>(x[0],1)).distinct()

val priceDataRDD =>(x[1]))


In the above program, which of the following RDDs should be cached?


A. myRDD

B. splittedRDD

C. distinctRDD

D. priceDataRDD


Ans : A

Exp : If the same RDD is used again and again, it is advisable to cache or persist it. A cached RDD has already been computed and its data is already in memory, so later operations can reuse it without recomputing it from its source.
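The general idea can be illustrated with a minimal sketch. It is not the question's own code: the object name `CacheReusedRdd`, the RDD names `parent`, `childA` and `childB`, and the in-memory sample rows are all hypothetical, and a local Spark installation is assumed. It only shows that an RDD feeding more than one downstream RDD is the natural candidate for `cache()`.

```scala
import org.apache.spark.sql.SparkSession

object CacheReusedRdd {
  // Returns (number of distinct keys, number of price values).
  def run(): (Long, Long) = {
    val spark = SparkSession.builder()
      .appName("cache-sketch")   // hypothetical application name
      .master("local[*]")        // assumption: local mode, for illustration only
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical rows standing in for a CSV file of "symbol,price" lines.
    val lines = sc.parallelize(Seq("IBM,100.5", "IBM,101.0", "AAPL,150.2"))

    // parent is used by two downstream RDDs, so it is marked for caching.
    // cache() is lazy: the data is stored when the first action runs.
    val parent = lines.map(_.split(",")).cache()

    val childA = parent.map(x => (x(0), 1)).distinct()
    val childB = parent.map(x => x(1))

    // The first action computes and caches parent; the second reuses it.
    val distinctKeys = childA.count()
    val priceCount = childB.count()

    parent.unpersist() // release the cached blocks once they are no longer needed
    spark.stop()
    (distinctKeys, priceCount)
  }

  def main(args: Array[String]): Unit = {
    val (distinctKeys, priceCount) = run()
    println(s"distinct keys: $distinctKeys, prices: $priceCount")
  }
}
```

Without the `cache()` call, each action would re-run the `split` step on every line. With it, the second action reads the already computed partitions from memory.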
