Untitled

a guest
Jun 26th, 2019
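import json

# `sc` is assumed to be an existing SparkContext (as in the PySpark shell).
# textFile() yields one string per line, so the file must hold one JSON object per line.
dataset_json = sc.textFile("data/my_data.json")
dataset = dataset_json.map(json.loads)
dataset.persist()
dataset.take(2)
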
[{'movie': 'movie_name1',
  'release_date': '2011-01-11T10:26:12Z',
  'actor': 'actor_name1'},
 {'movie': 'movie_name2',
  'release_date': '2010-04-08T04:14:23Z',
  'actor': 'actor_name2'}]

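The load step relies on the input being in JSON Lines form: each line is parsed on its own by json.loads. A minimal plain-Python sketch of that per-line parsing follows. It does not use Spark; the `lines` list is a hypothetical stand-in for what sc.textFile would yield, built from the two sample records shown above.

```python
import json

# Hypothetical stand-in for the lines that sc.textFile() would produce.
lines = [
    '{"movie": "movie_name1", "release_date": "2011-01-11T10:26:12Z", "actor": "actor_name1"}',
    '',  # blank lines are common at the end of such files
    '{"movie": "movie_name2", "release_date": "2010-04-08T04:14:23Z", "actor": "actor_name2"}',
]

# Parse each non-blank line independently; a blank line would make json.loads raise.
parsed = [json.loads(line) for line in lines if line.strip()]

print(parsed[0]['movie'])
print(len(parsed))
```

On an RDD the same guard could be written as dataset_json.filter(lambda s: s.strip()).map(json.loads).
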
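# Keep only the records that have a 'release_date' key.
# (Each record is a plain dict, which has no lookup() method.)
dataset2 = dataset.filter(lambda line: 'release_date' in line)
dataset2.first()
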
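A filter step only decides which records survive; pulling out the release_date values themselves needs a map step. The sketch below is a plain-Python stand-in rather than Spark code: it assumes the built-in filter and map play the same predicate and transform roles as RDD.filter and RDD.map, and it reuses the two sample records. The names `records`, `with_date`, and `dates` are hypothetical.

```python
# Plain-Python stand-in for the RDD pipeline (no Spark needed to run this).
records = [
    {'movie': 'movie_name1', 'release_date': '2011-01-11T10:26:12Z', 'actor': 'actor_name1'},
    {'movie': 'movie_name2', 'release_date': '2010-04-08T04:14:23Z', 'actor': 'actor_name2'},
]

# filter: the lambda is a yes/no test; matching records pass through unchanged.
with_date = list(filter(lambda rec: 'release_date' in rec, records))

# map: the lambda's return value replaces the record.
dates = list(map(lambda rec: rec['release_date'], with_date))

print(with_date[0])
print(dates)
```

With an RDD, the equivalent chain would be dataset.filter(lambda rec: 'release_date' in rec).map(lambda rec: rec['release_date']).
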
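# filter() uses the lambda only as a truth test. x.keys() is non-empty (truthy)
# for every record, so all records pass through unchanged.
attributes = dataset.filter(lambda x: x.keys())
attributes.take(2)
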
[{'movie': 'movie_name1',
  'release_date': '2011-01-11T10:26:12Z',
  'actor': 'actor_name1'},
 {'movie': 'movie_name2',
  'release_date': '2010-04-08T04:14:23Z',
  'actor': 'actor_name2'}]
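
The output above is the full records, not their attribute names, because filter never changes a record. If the goal is the list of keys, map is the likely fix. The sketch below shows the difference in plain Python, assuming the built-in filter and map behave like their RDD counterparts for this purpose; `records`, `kept`, `keys_per_record`, and `all_keys` are hypothetical names.

```python
# Plain-Python stand-in for the RDD (no Spark needed to run this).
records = [
    {'movie': 'movie_name1', 'release_date': '2011-01-11T10:26:12Z', 'actor': 'actor_name1'},
    {'movie': 'movie_name2', 'release_date': '2010-04-08T04:14:23Z', 'actor': 'actor_name2'},
]

# filter treats x.keys() as a truth test: non-empty keys are truthy, so nothing is dropped.
kept = list(filter(lambda x: x.keys(), records))

# map replaces each record with the list of its keys.
keys_per_record = list(map(lambda x: list(x.keys()), records))

# Union of keys across all records, sorted for a stable order.
all_keys = sorted({key for rec in records for key in rec})

print(keys_per_record)
print(all_keys)
```

On the RDD this would correspond to dataset.map(lambda x: list(x.keys())).take(2), or dataset.flatMap(lambda x: x.keys()).distinct().collect() for the union of keys across all records.
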