www.HadoopExam.com (a widely used platform for BigData/Hadoop/NoSQL training and certification preparation) has reported that on 7th March 2017 Cloudera changed the syllabus (exam objectives) of its most popular certification, CCA175: Hadoop and Spark Developer. The complete analysis by www.HadoopExam.com is in the table below, which clearly segregates what remains from the old syllabus and which objectives are newly added. HadoopExam also reports that its technical team is in the process of adding new practice scenarios for the newly added objectives to its existing CCA175 Hadoop and Spark Developer Simulator. If you have not subscribed for updates, please do so here: Subscribe for updates
| New Syllabus | Old Syllabus | Remarks |
|---|---|---|
| **Data Ingest:** The skills to transfer data between external systems and your cluster. This includes the following: | Data Ingest: The skills to transfer data between external systems and your cluster. This includes the following: | Objective remains the same |
| Import data from a MySQL database into HDFS using Sqoop | Import data from a MySQL database into HDFS using Sqoop | Remains the same |
| Export data to a MySQL database from HDFS using Sqoop | Export data to a MySQL database from HDFS using Sqoop | Remains the same |
| Change the delimiter and file format of data during import using Sqoop | Change the delimiter and file format of data during import using Sqoop | Remains the same |
| Ingest real-time and near-real-time streaming data into HDFS | Ingest real-time and near-real-time (NRT) streaming data into HDFS using Flume | You can use Flume, Spark Streaming, or any other streaming tool |
| Process streaming data as it is loaded onto the cluster | | New objective |
| Load data into and out of HDFS using the Hadoop File System commands | Load data into and out of HDFS using the Hadoop File System (FS) commands | Remains the same |
| **Transform, Stage, and Store:** Convert a set of data values in a given format stored in HDFS into new data values or a new data format and write them into HDFS. | Transform, Stage, Store: Convert a set of data values in a given format stored in HDFS into new data values and/or a new data format and write them into HDFS. This includes writing Spark applications in both Scala and Python: | Objective remains the same |
| Load RDD data from HDFS for use in Spark applications | Load data from HDFS and store results back to HDFS using Spark | Objective has been broken into two parts |
| Write the results from an RDD back into HDFS using Spark | | |
| | Join disparate datasets together using Spark | Remains: moved to the objective below |
| | Calculate aggregate statistics (e.g., average or sum) using Spark | Remains: moved to the objective below |
| | Filter data into a smaller dataset using Spark | Remains: moved to the objective below |
| | Write a query that produces ranked or sorted data using Spark | Remains: moved to the objective below |
| Read and write files in a variety of file formats | | New |
| Perform standard extract, transform, load (ETL) processes on data | | New |
| **Data Analysis:** Use Spark SQL to interact with the metastore programmatically in your applications. Generate reports by using queries against loaded data. | Data Analysis: Use Data Definition Language (DDL) to create tables in the Hive metastore for use by Hive and Impala. | Objective technology changed from Hive/Impala to Spark; the focus will be on Spark SQL |
| Use metastore tables as an input source or an output sink for Spark applications | | New |
| Understand the fundamentals of querying datasets in Spark | | New |
| Filter data using Spark | | Remains the same |
| Write queries that calculate aggregate statistics | | Remains the same |
| Join disparate datasets using Spark | | Remains the same |
| Produce ranked or sorted data | | Remains the same |
| | Read and/or create a table in the Hive metastore in a given schema | Removed |
| | Extract an Avro schema from a set of datafiles using avro-tools | Removed |
| | Create a table in the Hive metastore using the Avro file format and an external schema file | Removed |
| | Improve query performance by creating partitioned tables in the Hive metastore | Removed |
| **Configuration:** This is a practical exam and the candidate should be familiar with all aspects of generating a result, not just writing code. | Evolve an Avro schema by changing JSON files | New: objective introduced (the old Avro schema evolution objective is removed) |
| Supply command-line options to change your application configuration, such as increasing available memory | | New |