Untitled

a guest
Feb 21st, 2018
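# Fragment: assumes an existing DataFrame `df` with an 'age' column.
# randint() runs once in Python, so every null 'age' gets the same value.
from random import randint
df.fillna(randint(14, 46), 'age').show()
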
import pyspark.sql.functions as F
from pyspark.sql.types import IntegerType
from random import randint

df = sqlContext.createDataFrame(
    [(1, "a", 23.0), (3, "B", -23.0)], ("x1", "x2", "x3"))

# Add two all-null integer columns to fill.
df = (df
      .withColumn("x4", F.lit(None).cast(IntegerType()))
      .withColumn("x5", F.lit(None).cast(IntegerType()))
      )

# x4: a single random value, drawn once in Python, fills every null.
df.na.fill({'x4': randint(0, 100)}).show()
# x5: F.rand() is evaluated per row, so each null gets its own value (a double).
df.withColumn('x5', F.coalesce(F.col('x5'), F.round(F.rand() * 100))).show()

+---+---+-----+---+----+
| x1| x2|   x3| x4|  x5|
+---+---+-----+---+----+
|  1|  a| 23.0|  9|null|
|  3|  B|-23.0|  9|null|
+---+---+-----+---+----+
+---+---+-----+----+----+
| x1| x2|   x3|  x4|  x5|
+---+---+-----+----+----+
|  1|  a| 23.0|null|44.0|
|  3|  B|-23.0|null| 2.0|
+---+---+-----+----+----+
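
The two tables above differ because the fill value for x4 is computed once before the call, while the x5 expression is computed for each row. The following is a plain-Python model of that difference, not Spark code. The helper names `fill_constant` and `fill_per_row` are hypothetical, and the seed is there only to make the run repeatable.

```python
from random import Random


def fill_constant(rows, column, value):
    """Model of na.fill({column: value}): every null receives the same value."""
    return [{**r, column: value if r[column] is None else r[column]} for r in rows]


def fill_per_row(rows, column, draw):
    """Model of coalesce(col, <per-row expression>): draw() runs for each null."""
    return [{**r, column: draw() if r[column] is None else r[column]} for r in rows]


rng = Random(0)  # seeded only for reproducibility (an assumption of this sketch)
rows = [{"x1": i, "x4": None} for i in range(50)]

# randint is evaluated here, once, before fill_constant ever sees the rows.
constant = fill_constant(rows, "x4", rng.randint(0, 100))

# The lambda defers the draw, so each null row triggers its own randint call.
per_row = fill_per_row(rows, "x4", lambda: rng.randint(0, 100))

distinct_constant = {r["x4"] for r in constant}
distinct_per_row = {r["x4"] for r in per_row}

print(len(distinct_constant))  # 1: one shared value
print(len(distinct_per_row))   # more than one: values vary by row
```

In the Spark code the same split applies: `randint(0, 100)` is an ordinary Python call that returns one integer, whereas `F.rand()` builds a column expression that Spark evaluates row by row.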