pmkhlv

Untitled

Mar 23rd, 2022
hive> CREATE TABLE taxi_data
    > (
    > vendor_id INT,
    > tpep_pickup_datetime TIMESTAMP,
    > tpep_dropoff_datetime TIMESTAMP,
    > passenger_count INT,
    > trip_distance DOUBLE,
    > pulocation_id INT,
    > dolocation_id INT,
    > ratecode_id INT,
    > store_and_fwd_flag STRING,
    > payment_type INT,
    > fare_amount DOUBLE,
    > extra DOUBLE,
    > mta_tax DOUBLE,
    > improvement_surcharge DOUBLE,
    > tip_amount DOUBLE,
    > tolls_amount DOUBLE,
    > total_amount DOUBLE
    > )
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY ','
    > LINES TERMINATED BY '\n'
    > STORED AS PARQUET
    > LOCATION 'hdfs:///user/root/2020'
    > TBLPROPERTIES ("skip.header.line.count"="1");
OK
Time taken: 0.08 seconds
hive> SHOW TABLES;
OK
dim_vendor
taxi_data
Time taken: 0.026 seconds, Fetched: 2 row(s)
hive> SELECT 8 FROM taxi_data LIMIT 5;
OK
Failed with exception java.io.IOException:java.lang.RuntimeException: hdfs://rc1b-dataproc-m-st8trv5rgtmo8iwo.mdb.yandexcloud.net/user/root/2020/yellow_tripdata_2020-01.csv is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [44, 48, 13, 10]
Time taken: 0.218 seconds
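
Note on the error above: the table is declared STORED AS PARQUET, but the file under its location is yellow_tripdata_2020-01.csv, a plain CSV. The expected tail bytes [80, 65, 82, 49] are the ASCII characters "PAR1" (the Parquet magic number), while the bytes actually found, [44, 48, 13, 10], are ",0" followed by CR LF, i.e. the end of an ordinary CSV line. The ROW FORMAT DELIMITED clauses do not change how a Parquet table is read. A minimal sketch of one possible fix follows, assuming the data should be read as delimited text in place. The table names taxi_data_csv and taxi_data_parquet are hypothetical, and the column list is copied unchanged from the original statement. Delimited text is read by position, so the column order should be checked against the header line of the CSV. Because the file appears to use CR LF line endings, the last column may also be worth checking for a stray carriage return.

```sql
-- Sketch only. taxi_data_csv is a hypothetical name chosen so that the original
-- taxi_data table does not have to be dropped. EXTERNAL keeps Hive from deleting
-- the files under LOCATION if this table is dropped later.
CREATE EXTERNAL TABLE taxi_data_csv
(
  vendor_id INT,
  tpep_pickup_datetime TIMESTAMP,
  tpep_dropoff_datetime TIMESTAMP,
  passenger_count INT,
  trip_distance DOUBLE,
  pulocation_id INT,
  dolocation_id INT,
  ratecode_id INT,
  store_and_fwd_flag STRING,
  payment_type INT,
  fare_amount DOUBLE,
  extra DOUBLE,
  mta_tax DOUBLE,
  improvement_surcharge DOUBLE,
  tip_amount DOUBLE,
  tolls_amount DOUBLE,
  total_amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE                  -- the files at this location are CSV, not Parquet
LOCATION 'hdfs:///user/root/2020'
TBLPROPERTIES ("skip.header.line.count"="1");

-- Quick check that rows can now be read.
SELECT * FROM taxi_data_csv LIMIT 5;

-- Optional: if a Parquet copy is what was wanted, write one from the text table.
-- taxi_data_parquet is again a hypothetical name.
CREATE TABLE taxi_data_parquet
STORED AS PARQUET
AS SELECT * FROM taxi_data_csv;
```

The original taxi_data table was created without EXTERNAL, so dropping it may also remove the files under hdfs:///user/root/2020. That is the reason the sketch uses a new table name instead of DROP TABLE followed by a re-create.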