Untitled

a guest
Mar 22nd, 2019
from pyspark.sql import SparkSession

# Reuse the active SparkSession, or create one (the app name here is arbitrary)
spark = SparkSession.builder.appName("JsonDatasetExample").getOrCreate()
sc = spark.sparkContext

# A JSON dataset is pointed to by path.
# The path can be either a single text file or a directory storing text files.
# By default, each line must hold one complete, self-contained JSON object.
path = "examples/src/main/resources/people.json"
peopleDF = spark.read.json(path)

# The inferred schema can be visualized using the printSchema() method
peopleDF.printSchema()
# root
#  |-- age: long (nullable = true)
#  |-- name: string (nullable = true)

# Creates a temporary view using the DataFrame
peopleDF.createOrReplaceTempView("people")

# SQL statements can be run by using the sql method provided by spark
teenagerNamesDF = spark.sql("SELECT name FROM people WHERE age BETWEEN 13 AND 19")
teenagerNamesDF.show()
# +------+
# |  name|
# +------+
# |Justin|
# +------+

# Alternatively, a DataFrame can be created for a JSON dataset represented by
# an RDD[String] storing one JSON object per string
jsonStrings = ['{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}']
otherPeopleRDD = sc.parallelize(jsonStrings)
otherPeople = spark.read.json(otherPeopleRDD)
otherPeople.show()
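
The snippet above relies on two ideas: the input holds one JSON object per line (or per string), and the schema is inferred from the fields that appear across records. The following is a minimal pure-Python sketch of those ideas, not Spark's implementation. The helper names `infer_fields` and `teenager_names` and the sample records are hypothetical and chosen only for illustration; the sketch reports Python type names rather than Spark SQL types.

```python
import json


def infer_fields(json_lines):
    """Map each top-level field name to the set of Python type names seen for it."""
    fields = {}
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # one complete JSON object per line
        for key, value in record.items():
            fields.setdefault(key, set()).add(type(value).__name__)
    return fields


def teenager_names(json_lines):
    """Rough analogue of: SELECT name FROM people WHERE age BETWEEN 13 AND 19."""
    names = []
    for line in json_lines:
        record = json.loads(line)
        age = record.get("age")  # a missing field behaves like a null
        if age is not None and 13 <= age <= 19:  # BETWEEN is inclusive
            names.append(record["name"])
    return names


# Hypothetical sample data, one JSON object per string
sample_lines = [
    '{"name": "Michael"}',
    '{"name": "Andy", "age": 30}',
    '{"name": "Justin", "age": 19}',
]

fields = infer_fields(sample_lines)
for name in sorted(fields):
    print(name, sorted(fields[name]))
# age ['int']
# name ['str']

print(teenager_names(sample_lines))
# ['Justin']
```

A record that lacks a field still contributes to the overall field set, which is why an inferred column such as `age` is reported as nullable in the schema printed earlier.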