Guest User

Untitled

a guest
Nov 30th, 2018
109
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 7.34 KB | None | 0 0
  1. Using sqoop, import products_replica table from MYSQL into hdfs such that fields are separated by a '|' and lines are separated by '\n'. Null values are represented as -1 for numbers and "NOT-AVAILABLE" for strings. Only records with product id greater than or equal to 1 and less than or equal to 1000 should be imported and use 3 mappers for importing. The destination file should be stored as a text file to directory /user/cloudera/problem5/products-text.
  2. Using sqoop, import products_replica table from MYSQL into hdfs such that fields are separated by a '*' and lines are separated by '\n'. Null values are represented as -1000 for numbers and "NA" for strings. Only records with product id less than or equal to 1111 should be imported and use 2 mappers for importing. The destination file should be stored as a text file to directory /user/cloudera/problem5/products-text-part1.
  3. Using sqoop, import products_replica table from MYSQL into hdfs such that fields are separated by a '*' and lines are separated by '\n'. Null values are represented as -1000 for numbers and "NA" for strings. Only records with product id greater than 1111 should be imported and use 5 mappers for importing. The destination file should be stored as a text file to directory /user/cloudera/problem5/products-text-part2.
  4. Using sqoop merge data available in /user/cloudera/problem5/products-text-part1 and /user/cloudera/problem5/products-text-part2 to produce a new set of files in /user/cloudera/problem5/products-text-both-parts
  5.  
  6.  
  7. sqoop-import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username retail_dba --password cloudera --table products_replica --fields-terminated-by "|" --lines-terminated-by "\n" --null-non-string -1 --null-string "NA" --where "product_id >=1 and product_id <=1000" -m 3 --target-dir /user/cloudera/problem5/products-text --outdir /home/cloudera/sqoop1
  8.  
  9. sqoop-import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username retail_dba --password cloudera --table products_replica --fields-terminated-by "*" --lines-terminated-by "\n" --null-non-string -1000 --null-string "NA" --where "product_id <= 1111" -m 2 --target-dir /user/cloudera/problem5/products-text-part1 --outdir /home/cloudera/sqoop2
  10.  
  11. sqoop-import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username retail_dba --password cloudera --table products_replica --fields-terminated-by "*" --lines-terminated-by "\n" --null-non-string -1000 --null-string "NA" --where "product_id >= 1111" -m 5 --target-dir /user/cloudera/problem5/products-text-part2 --outdir /home/cloudera/sqoop3
  12.  
  13. sqoop-merge --class-name products_replica \
  14. --jar-file /tmp/sqoop-cloudera/compile/44e59c363d94f3d1c12f74bd129e206d/products_replica.jar \
  15. --merge-key product_id \
  16. --new-data /user/cloudera/problem5/products-text-part2 \
  17. --onto /user/cloudera/problem5/products-text-part1 \
  18. --target-dir /user/cloudera/problem5/products-text-both-parts
  19.  
  20. Using sqoop do the following. Read the entire steps before you create the sqoop job.
  21. create a sqoop job Import Products_replica table as text file to directory /user/cloudera/problem5/products-incremental. Import all the records.
  22. insert three more records to Products_replica from mysql
  23. run the sqoop job again so that only newly added records can be pulled from mysql
  24. insert 2 more records to Products_replica from mysql
  25. run the sqoop job again so that only newly added records can be pulled from mysql
  26. Validate to make sure the records have not be duplicated in HDFS
  27.  
  28.  
  29. sqoop-job --create firstSqoopJob -- import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username retail_dba --password cloudera --table products_replica --as-textfile --target-dir /user/cloudera/problem5/products-incremental --incremental append --last-value 0 --check-column product_id
  30.  
  31. sqoop-job --exec firstSqoopJob
  32.  
  33. insert into products_replica values (1346,2,'something 1','something 2',300.00,'not avaialble',3,'STRONG');
  34. insert into products_replica values (1347,5,'something 787','something 2',356.00,'not avaialble',3,'STRONG');
  35.  
  36. sqoop-job --exec firstSqoopJob
  37.  
  38. insert into products_replica values (1376,4,'something 1376','something 2',1.00,'not avaialble',3,'WEAK');
  39. insert into products_replica values (1365,4,'something 1376','something 2',10.00,'not avaialble',null,'NOT APPLICABLE');
  40.  
  41. sqoop-job --exec firstSqoopJob
  42.  
  43. Using sqoop do the following. Read the entire steps before you create the sqoop job.
  44. create a hive table in database named problem5 using below command
  45. create table products_hive (product_id int, product_category_id int, product_name string, product_description string, product_price float, product_imaage string,product_grade int, product_sentiment string);
  46. create a sqoop job Import Products_replica table as hive table to database named problem5. name the table as products_hive.
  47. insert three more records to Products_replica from mysql
  48. run the sqoop job again so that only newly added records can be pulled from mysql
  49. insert 2 more records to Products_replica from mysql
  50. run the sqoop job again so that only newly added records can be pulled from mysql
  51. Validate to make sure the records have not been duplicated in Hive table
  52.  
  53.  
  54. sqoop-job --create sencondSqoopJob -- import --connect :jdbc:mysql:quickstart.cloudera:3306/retail_db --username retail_dba --password cloudera --table products_replica --hive-import --hive-database problem5 --hive-table products_hive --hive-overwrite --check-column product_id --incremental append --last-value 0
  55.  
  56. sqoop-job --exec sencondSqoopJob
  57.  
  58. insert into products_replica values (1378,4,'something 1376','something 2',10.00,'not avaialble',null,'NOT APPLICABLE');
  59. insert into products_replica values (1379,4,'something 1376','something 2',10.00,'not avaialble',null,'NOT APPLICABLE');
  60.  
  61. sqoop-job --exec sencondSqoopJob
  62.  
  63. select * from products_hive
  64.  
  65.  
  66. Using sqoop do the following. .
  67. insert 2 more records into products_hive table using hive.
  68. create table in mysql using below command
  69. create table products_external (product_id int(11) primary Key, product_grade int(11), product_category_id int(11), product_name varchar(100), product_description varchar(100), product_price float, product_impage varchar(500), product_sentiment varchar(100));
  70. export data from products_hive (hive) table to (mysql) products_external table.
  71. insert 2 more records to Products_hive table from hive
  72. export data from products_hive table to products_external table.
  73. Validate to make sure the records have not be duplicated in mysql table
  74.  
  75.  
  76. insert into table products_hive values (1380,4,'something 1380','something 2',8.00,'not avaialble',3,'NOT APPLICABLE');
  77. insert into table products_hive values (1381,4,'something 1380','something 2',8.00,'not avaialble',3,'NOT APPLICABLE');
  78.  
  79.  
  80. create table products_external (product_id int(11) primary Key, product_grade int(11), product_category_id int(11), product_name varchar(100), product_description varchar(100), product_price float, product_impage varchar(500), product_sentiment varchar(100));
  81.  
  82. sqoop-export --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username retail_dba --password cloudera --table products_external --export-dir /user/hive/warehouse/problem5.db/products_hive --input-fields-terminated-by '\001' --input-null-non-string 0 --input-null-string 'null' --update-mode allowinsert --update-key product_id --columns "product_id,product_category_id,product_name,product_description,product_price,product_impage,product_grade,product_sentiment"
Add Comment
Please, Sign In to add comment