Untitled

a guest
Oct 20th, 2017
from pyspark.serializers import AutoBatchedSerializer, PickleSerializer


class RDD(object):
    """
    A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
    Represents an immutable, partitioned collection of elements that can be
    operated on in parallel.
    """

    def __init__(self, jrdd, ctx, jrdd_deserializer=AutoBatchedSerializer(PickleSerializer())):
        self._jrdd = jrdd  # handle to the underlying Java RDD
        self.is_cached = False
        self.is_checkpointed = False
        self.ctx = ctx  # the context this RDD was created from
        self._jrdd_deserializer = jrdd_deserializer  # used to read data coming back from the JVM
        self._id = jrdd.id()  # id assigned by the Java RDD
        self.partitioner = None
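
The docstring above describes an RDD as an immutable, partitioned collection that is operated on in parallel. The following is a minimal pure-Python sketch of that idea only; it is not how Spark implements RDDs. `ToyRDD`, `from_list`, `num_partitions`, and the thread-pool execution are all hypothetical names and choices made for illustration, and unlike Spark this toy evaluates `map` eagerly rather than lazily.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyRDD:
    """Hypothetical stand-in: an immutable, partitioned collection."""

    def __init__(self, partitions):
        # Store tuples of tuples so no partition can be mutated in place.
        self._partitions = tuple(tuple(p) for p in partitions)

    @classmethod
    def from_list(cls, data, num_partitions):
        # Split the data into contiguous chunks of roughly equal size.
        data = list(data)
        step = max(1, -(-len(data) // num_partitions))  # ceiling division
        return cls(data[i:i + step] for i in range(0, len(data), step))

    def num_partitions(self):
        return len(self._partitions)

    def map(self, fn):
        # Apply fn to every partition in parallel and return a NEW ToyRDD;
        # the original object is left unchanged.
        def apply(part):
            return tuple(fn(x) for x in part)

        with ThreadPoolExecutor() as pool:
            new_parts = tuple(pool.map(apply, self._partitions))
        return ToyRDD(new_parts)

    def collect(self):
        # Flatten the partitions back into one list, preserving order.
        return [x for part in self._partitions for x in part]


if __name__ == "__main__":
    rdd = ToyRDD.from_list(range(10), num_partitions=3)
    doubled = rdd.map(lambda x: x * 2)
    print(rdd.num_partitions(), doubled.collect(), rdd.collect())
```

Because `map` returns a new object instead of modifying the existing one, the original collection can be reused safely by other operations, which is the property the docstring's word "immutable" refers to.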