Don't forget to create account on our site to get access to more material made only for free registered user.  

CCD-410 Certifcation CCA-500 Hadoop Administrator Exam HBase Certifcation CCB-400 Data Science Certifcation Hadoop Training with Hands On Lab

Problem scenario 2 :  You have been given below information Avro schema as below

Please find all 83 Solved scenarios for CCA-175

Cloudera Hadoop and Spark Developer Certifications

{

  "type" : "record",

  "name" : "orders_partition4",

  "doc" : "Hive partitioned table schema ",

  "fields" : [ {

    "name" : "order_id",

    "type" : [ "int", "null" ]

  }, {

    "name" : "order_date",

    "type" : [ "long", "null" ]

  }, {

    "name" : "order_customer_id",

    "type" : [ "int", "null" ]

  }, {

    "name" : "order_status",

    "type" : [ "string", "null" ]

  } ],

  "tableName" : "orders_partition4"

}

 

1. Create a managed hive table based on above schema and two additional columns as below

 

order_value with default value as -9999 (This column can not be null)

order_description with default value as "Not Defined" (This column can store null)

 

2. While creating table use the data from below location.

hdfs:///user/hive/warehouse/retail_stage.db/orders

 

__________________________________________________________________________________________________________________________________________________________

Solution :

 

Step 1 : Create a schema file named order.avsc locally and modiy it for two additional column as below also add the dfault value for these new columns.

 

{

  "type" : "record",

  "name" : "orders_partition1",

  "doc" : "Hive partitioned table schema ",

  "fields" : [ {

    "name" : "order_id",

    "type" : [ "int", "null" ]

  }, {

    "name" : "order_date",

    "type" : [ "long", "null" ]

  }, {

    "name" : "order_customer_id",

    "type" : [ "int", "null" ]

  }, {

    "name" : "order_status",

    "type" : [ "string", "null" ]

  }, {

    "name" : "order_value",

    "type" : "int",

    "default" : -9999

  } , {

    "name" : "order_description",

    "type" : [ "string", "null" ],

    "default" : "Not Defined"

  }  ],

  "tableName" : "orders_partition2"

}

 

Step 2 : Create an avsc file in hdfs at below location (You can use either hue or create file locally and upload to hdfs)

hdfs://quickstart.cloudera/user/cloudera/retail_stage1/sqoop_import_orders2.avsc

 

Step 3 : 

 

Create a managed table

 

CREATE TABLE orders_partition4 (

order_id int,

order_date bigint,

order_customer_id int,

order_status string,

order_value int,

order_description string

)

STORED AS AVRO

LOCATION 'hdfs:////user/hive/warehouse/retail_stage.db/orders'

TBLPROPERTIES ('avro.schema.url'='hdfs://quickstart.cloudera/user/cloudera/retail_stage1/sqoop_import_orders2.avsc');

 

Step 4 : Check the tables;

select * from orders_partition4;

 

describe orders_partition4;

 

</q>

<img>CCA175.JPG</img>

<as>

<a>

Video URL (In Progress) : 

http://cca175cloudera.training4exam.com/ 

 

Visit www.hadoopexam.com for updates

-------------------------------------------

 

{

  "type" : "record",

  "name" : "orders_partition4",

  "doc" : "Hive partitioned table schema ",

  "fields" : [ {

    "name" : "order_id",

    "type" : [ "int", "null" ]

  }, {

    "name" : "order_date",

    "type" : [ "long", "null" ]

  }, {

    "name" : "order_customer_id",

    "type" : [ "int", "null" ]

  }, {

    "name" : "order_status",

    "type" : [ "string", "null" ]

  } ],

  "tableName" : "orders_partition4"

}