Untitled

a guest, Mar 22nd, 2019
from pyspark.sql import SparkSession

# Reuse the active session if one exists (e.g. in the pyspark shell); otherwise create one.
# The application name is arbitrary.
spark = SparkSession.builder.appName("json-example").getOrCreate()
sc = spark.sparkContext

# A JSON dataset is pointed to by path.
# The path can be either a single text file or a directory storing text files.
# By default, each line must contain one complete, self-contained JSON object.
path = "examples/src/main/resources/people.json"
peopleDF = spark.read.json(path)

# The inferred schema can be printed with the printSchema() method
peopleDF.printSchema()
# root
#  |-- age: long (nullable = true)
#  |-- name: string (nullable = true)

# Create a temporary view from the DataFrame so it can be queried by name
peopleDF.createOrReplaceTempView("people")

# SQL statements can be run with the sql() method of the SparkSession
teenagerNamesDF = spark.sql("SELECT name FROM people WHERE age BETWEEN 13 AND 19")
teenagerNamesDF.show()
# +------+
# |  name|
# +------+
# |Justin|
# +------+

# Alternatively, a DataFrame can be created for a JSON dataset represented by
# an RDD[String] storing one JSON object per string
jsonStrings = ['{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}']
otherPeopleRDD = sc.parallelize(jsonStrings)
otherPeople = spark.read.json(otherPeopleRDD)
otherPeople.show()
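
The snippet above reads a file in which every line is its own JSON object. The following is a minimal sketch in plain Python (no Spark needed) of what that layout looks like and of what the BETWEEN query selects. The sample records and the helper names write_json_lines, read_json_lines and teenager_names are assumptions made for illustration. The paste does not show the actual contents of people.json, and this is not how Spark implements the read.

```python
import json
import os
import tempfile

# Hypothetical sample records, assumed for illustration only; the real
# contents of people.json are not shown in the paste.
records = [
    {"name": "Michael"},
    {"name": "Andy", "age": 30},
    {"name": "Justin", "age": 19},
]


def write_json_lines(rows, path):
    """Write one JSON object per line (the JSON Lines layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_json_lines(path):
    """Parse each non-empty line as an independent JSON object."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def teenager_names(rows):
    """Plain-Python counterpart of:
    SELECT name FROM people WHERE age BETWEEN 13 AND 19
    BETWEEN is inclusive on both ends, and rows without an age are skipped."""
    return [
        r["name"]
        for r in rows
        if r.get("age") is not None and 13 <= r["age"] <= 19
    ]


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "people.json")
    write_json_lines(records, sample_path)
    loaded = read_json_lines(sample_path)

print(teenager_names(loaded))  # ['Justin']
```

A record with no age field is dropped by the filter, just as a null age fails the BETWEEN predicate in SQL. That is why only one name comes back here.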