from pyspark.sql import HiveContext

query = """select dt
from default.content_publisher_events_log
where dt between '20170415' and '20170419'
"""
hive_context = HiveContext(sc)
user_data = hive_context.sql(query)
user_data.count()
# returns 0

>>> sqlContext.sql("show tables").show()
+--------+--------------------+-----------+
|database|           tableName|isTemporary|
+--------+--------------------+-----------+
| default|content_publisher...|      false|
| default|  feed_installer_log|      false|
| default|keyword_based_ads...|      false|
| default|search_providers_log|      false|
+--------+--------------------+-----------+

>>> user_data.printSchema()
root
 |-- dt: string (nullable = true)