Untitled

a guest Jun 26th, 2019 47 Never
import json

# `sc` is the SparkContext provided by the pyspark shell or notebook.
# Each line of the file is expected to hold one JSON object.
dataset_json = sc.textFile("data/my_data.json")
dataset = dataset_json.map(json.loads)
dataset.persist()
dataset.take(2)

# Output:
# [{'movie': 'movie_name1',
#   'release_date': '2011-01-11T10:26:12Z',
#   'actor': 'actor_name1'},
#  {'movie': 'movie_name2',
#   'release_date': '2010-04-08T04:14:23Z',
#   'actor': 'actor_name2'}]

# Each record is a plain dict, and a dict has no lookup() method,
# so test for the key with `in` instead.
dataset2 = dataset.filter(lambda line: 'release_date' in line)
dataset2.first()

# filter() keeps or drops whole records. x.keys() is truthy for any
# non-empty dict, so this returns the records unchanged, not their keys.
attributes = dataset.filter(lambda x: x.keys())
attributes.take(2)

# Output:
# [{'movie': 'movie_name1',
#   'release_date': '2011-01-11T10:26:12Z',
#   'actor': 'actor_name1'},
#  {'movie': 'movie_name2',
#   'release_date': '2010-04-08T04:14:23Z',
#   'actor': 'actor_name2'}]
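
The snippet below is a minimal sketch of the two operations the paste seems to be after: keeping only records that have a `release_date` key, and extracting each record's field names. It runs on a plain Python list rather than an RDD so it can be tried without Spark. The helper names `has_release_date` and `field_names` are hypothetical, and the third sample record is invented purely to show a record being filtered out.

```python
import json

# Sample lines standing in for data/my_data.json. The first two mirror the
# records shown in the paste; the third is hypothetical and lacks release_date.
lines = [
    '{"movie": "movie_name1", "release_date": "2011-01-11T10:26:12Z", "actor": "actor_name1"}',
    '{"movie": "movie_name2", "release_date": "2010-04-08T04:14:23Z", "actor": "actor_name2"}',
    '{"movie": "movie_name3", "actor": "actor_name3"}',
]
records = [json.loads(line) for line in lines]


def has_release_date(record):
    """Predicate for filter(): True only if the key is present."""
    return "release_date" in record


def field_names(record):
    """Transformation for map(): the record's keys as a sorted list."""
    return sorted(record.keys())


# filter() selects whole records; map() turns each record into something else.
with_date = list(filter(has_release_date, records))
attributes = list(map(field_names, records))

print(len(with_date))
print(attributes[0])
```

On the RDD from the paste, the same functions would be passed as `dataset.filter(has_release_date)` and `dataset.map(field_names)`. The point is that a predicate belongs in `filter` and a per-record transformation belongs in `map`.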