- hive> CREATE TABLE taxi_data
- > (
- > vendor_id INT,
- > tpep_pickup_datetime TIMESTAMP,
- > tpep_dropoff_datetime TIMESTAMP,
- > passenger_count INT,
- > trip_distance DOUBLE,
- > pulocation_id INT,
- > dolocation_id INT,
- > ratecode_id INT,
- > store_and_fwd_flag STRING,
- > payment_type INT,
- > fare_amount DOUBLE,
- > extra DOUBLE,
- > mta_tax DOUBLE,
- > improvement_surcharge DOUBLE,
- > tip_amount DOUBLE,
- > tolls_amount DOUBLE,
- > total_amount DOUBLE
- > )
- > ROW FORMAT DELIMITED
- > FIELDS TERMINATED BY ','
- > LINES TERMINATED BY '\n'
- > STORED AS PARQUET
- > LOCATION 'hdfs:///user/root/2020'
- > TBLPROPERTIES ("skip.header.line.count"="1");
- OK
- Time taken: 0.08 seconds
- hive> SHOW TABLES;
- OK
- dim_vendor
- taxi_data
- Time taken: 0.026 seconds, Fetched: 2 row(s)
- hive> SELECT 8 FROM taxi_data LIMIT 5;
- OK
- Failed with exception java.io.IOException:java.lang.RuntimeException: hdfs://rc1b-dataproc-m-st8trv5rgtmo8iwo.mdb.yandexcloud.net/user/root/2020/yellow_tripdata_2020-01.csv is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [44, 48, 13, 10]
- Time taken: 0.218 seconds
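
The exception above says the file under the table location is CSV, not Parquet: the expected tail bytes [80, 65, 82, 49] are the ASCII string "PAR1", while the bytes found, [44, 48, 13, 10], are ",0" followed by CR LF, which is the end of a text line. The ROW FORMAT DELIMITED clauses and skip.header.line.count only make sense for text files, so one likely fix is to declare the table as a text table. The following is a minimal sketch, not a tested fix for this cluster. The table names taxi_data_csv and taxi_data_parquet and the EXTERNAL keyword are assumptions that do not appear in the session above; the column list is copied from it unchanged. Delimited text is mapped to columns by position, so the column order has to match the header of the CSV file.

```sql
-- Sketch: read the existing CSV files in place as delimited text.
-- taxi_data_csv is a hypothetical name; EXTERNAL is an assumption, used so
-- that dropping the table leaves the files in hdfs:///user/root/2020 alone.
CREATE EXTERNAL TABLE taxi_data_csv
(
  vendor_id INT,
  tpep_pickup_datetime TIMESTAMP,
  tpep_dropoff_datetime TIMESTAMP,
  passenger_count INT,
  trip_distance DOUBLE,
  pulocation_id INT,
  dolocation_id INT,
  ratecode_id INT,
  store_and_fwd_flag STRING,
  payment_type INT,
  fare_amount DOUBLE,
  extra DOUBLE,
  mta_tax DOUBLE,
  improvement_surcharge DOUBLE,
  tip_amount DOUBLE,
  tolls_amount DOUBLE,
  total_amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs:///user/root/2020'
TBLPROPERTIES ("skip.header.line.count"="1");

-- Optional: if Parquet storage is the actual goal, copy the rows into a
-- new Parquet-backed table (taxi_data_parquet is also a hypothetical name).
CREATE TABLE taxi_data_parquet
STORED AS PARQUET
AS SELECT * FROM taxi_data_csv;
```

The files end lines with CR LF (bytes 13, 10), so with LINES TERMINATED BY '\n' the last column of each row may keep a trailing carriage return; it is worth checking that column after the first SELECT.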