Untitled

a guest, Oct 23rd, 2019
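How to trim minutes and seconds from a date field in a PySpark DataFrame.
Different approaches to do that:

Input : 2019-01-31 23:16:28
Output: 2019-01-31 23:00:00

  Not efficient (string manipulation; the result is a string column, not a timestamp):

      from pyspark.sql.functions import concat, lit
      # Keep the first 13 characters ('yyyy-MM-dd HH') and append ':00:00'.
      df.withColumn('tpep_pickup_datetime', concat(df.tpep_pickup_datetime.substr(1, 13), lit(':00:00')))

  More efficient than the one mentioned above (arithmetic on epoch seconds):

      from pyspark.sql.functions import col, floor, unix_timestamp
      # Use floor, not round: round would push 23:46:28 forward to 00:00:00 of the next day.
      # The cast turns the epoch seconds back into a timestamp.
      df.withColumn('tpep_pickup_datetime', (floor(unix_timestamp(col('tpep_pickup_datetime')) / 3600) * 3600).cast('timestamp'))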
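The epoch-seconds arithmetic can be checked outside Spark. The following is a minimal sketch in plain Python, assuming timestamps are interpreted as UTC; the helper names (to_epoch, from_epoch, truncate_to_hour, round_to_hour) are made up for illustration and are not part of PySpark. It shows why dividing by 3600 and rounding only happens to work for the sample input, while flooring always drops the minutes and seconds.

```python
from datetime import datetime, timezone

SECONDS_PER_HOUR = 3600
FMT = "%Y-%m-%d %H:%M:%S"


def to_epoch(ts: str) -> int:
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC (an assumption) and return epoch seconds."""
    return int(datetime.strptime(ts, FMT).replace(tzinfo=timezone.utc).timestamp())


def from_epoch(seconds: int) -> str:
    """Format epoch seconds back into 'YYYY-MM-DD HH:MM:SS' in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(FMT)


def truncate_to_hour(ts: str) -> str:
    # Floor division discards the partial hour, i.e. the minutes and seconds.
    return from_epoch(to_epoch(ts) // SECONDS_PER_HOUR * SECONDS_PER_HOUR)


def round_to_hour(ts: str) -> str:
    # Rounding moves to the NEAREST hour, which can be the next one.
    return from_epoch(round(to_epoch(ts) / SECONDS_PER_HOUR) * SECONDS_PER_HOUR)


if __name__ == "__main__":
    for sample in ("2019-01-31 23:16:28", "2019-01-31 23:46:28"):
        print(sample, "| floor:", truncate_to_hour(sample), "| round:", round_to_hour(sample))
```

For 23:16:28 both helpers agree, but for 23:46:28 rounding lands on the next day. Note that flooring epoch seconds lines up with local hour boundaries only when the session time zone has a whole-hour UTC offset. On Spark 2.3 and later, the built-in pyspark.sql.functions.date_trunc('hour', col) is likely the more direct way to do this truncation.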