Problem scenario 2 : You have been given below information Avro schema as below
Please find all 83 Solved scenarios for CCA-175
Cloudera Hadoop and Spark Developer Certifications
{
"type" : "record",
"name" : "orders_partition4",
"doc" : "Hive partitioned table schema ",
"fields" : [ {
"name" : "order_id",
"type" : [ "int", "null" ]
}, {
"name" : "order_date",
"type" : [ "long", "null" ]
}, {
"name" : "order_customer_id",
"type" : [ "int", "null" ]
}, {
"name" : "order_status",
"type" : [ "string", "null" ]
} ],
"tableName" : "orders_partition4"
}
1. Create a managed hive table based on above schema and two additional columns as below
order_value with default value as -9999 (This column can not be null)
order_description with default value as "Not Defined" (This column can store null)
2. While creating table use the data from below location.
hdfs:///user/hive/warehouse/retail_stage.db/orders
__________________________________________________________________________________________________________________________________________________________
Solution :
Step 1 : Create a schema file named order.avsc locally and modiy it for two additional column as below also add the dfault value for these new columns.
{
"type" : "record",
"name" : "orders_partition1",
"doc" : "Hive partitioned table schema ",
"fields" : [ {
"name" : "order_id",
"type" : [ "int", "null" ]
}, {
"name" : "order_date",
"type" : [ "long", "null" ]
}, {
"name" : "order_customer_id",
"type" : [ "int", "null" ]
}, {
"name" : "order_status",
"type" : [ "string", "null" ]
}, {
"name" : "order_value",
"type" : "int",
"default" : -9999
} , {
"name" : "order_description",
"type" : [ "string", "null" ],
"default" : "Not Defined"
} ],
"tableName" : "orders_partition2"
}
Step 2 : Create an avsc file in hdfs at below location (You can use either hue or create file locally and upload to hdfs)
hdfs://quickstart.cloudera/user/cloudera/retail_stage1/sqoop_import_orders2.avsc
Step 3 :
Create a managed table
CREATE TABLE orders_partition4 (
order_id int,
order_date bigint,
order_customer_id int,
order_status string,
order_value int,
order_description string
)
STORED AS AVRO
LOCATION 'hdfs:////user/hive/warehouse/retail_stage.db/orders'
TBLPROPERTIES ('avro.schema.url'='hdfs://quickstart.cloudera/user/cloudera/retail_stage1/sqoop_import_orders2.avsc');
Step 4 : Check the tables;
select * from orders_partition4;
describe orders_partition4;
</q>
<img>CCA175.JPG</img>
<as>
<a>
Video URL (In Progress) :
http://cca175cloudera.training4exam.com/
Visit www.hadoopexam.com for updates
-------------------------------------------
{
"type" : "record",
"name" : "orders_partition4",
"doc" : "Hive partitioned table schema ",
"fields" : [ {
"name" : "order_id",
"type" : [ "int", "null" ]
}, {
"name" : "order_date",
"type" : [ "long", "null" ]
}, {
"name" : "order_customer_id",
"type" : [ "int", "null" ]
}, {
"name" : "order_status",
"type" : [ "string", "null" ]
} ],
"tableName" : "orders_partition4"
}