import datetime as dt
import numpy as np
import pandas as pd
from pyspark.sql import SparkSession, functions as sf
from pyspark.sql.types import IntegerType

# Reuse the active Spark session, or create one if none exists
spark = SparkSession.builder.getOrCreate()

# Define date range
START_DATE = dt.datetime(2019, 8, 15, 20, 30, 0)
END_DATE = dt.datetime(2019, 8, 16, 15, 43, 0)

# Generate date range with pandas (15-minute steps, end inclusive)
timerange = pd.date_range(start=START_DATE, end=END_DATE, freq='15min')

# Convert nanoseconds since the epoch to whole seconds
timestamps = [int(x) for x in timerange.values.astype(np.int64) // 10 ** 9]

# Create pyspark dataframe from the above timestamps
# (a single-column DataFrame built from an atomic type names its column 'value')
(spark.createDataFrame(timestamps, IntegerType())
    .withColumn('value_date', sf.from_unixtime('value'))
    .drop('value')
    .withColumnRenamed('value_date', 'date').show())

# Output: show() prints the first 20 rows by default, and from_unixtime
# renders in the Spark session time zone (UTC in this run)
# +-------------------+
# |               date|
# +-------------------+
# |2019-08-15 20:30:00|
# |2019-08-15 20:45:00|
# |2019-08-15 21:00:00|
# |2019-08-15 21:15:00|
# |2019-08-15 21:30:00|
# |2019-08-15 21:45:00|
# |2019-08-15 22:00:00|
# |2019-08-15 22:15:00|
# |2019-08-15 22:30:00|
# |2019-08-15 22:45:00|
# |2019-08-15 23:00:00|
# |2019-08-15 23:15:00|
# |2019-08-15 23:30:00|
# |2019-08-15 23:45:00|
# |2019-08-16 00:00:00|
# |2019-08-16 00:15:00|
# |2019-08-16 00:30:00|
# |2019-08-16 00:45:00|
# |2019-08-16 01:00:00|
# |2019-08-16 01:15:00|
# +-------------------+
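
The epoch-second conversion above can be sanity-checked without pandas or Spark. The following is a minimal sketch using only the standard library; the helper name `epoch_seconds_range` is hypothetical (not part of the original paste), and it assumes naive datetimes are interpreted as UTC, which is how pandas treats them when casting to int64.

```python
import datetime as dt


def epoch_seconds_range(start, end, step):
    """Hypothetical helper: epoch seconds from start to end (inclusive).

    Naive datetimes are assumed to be UTC, matching the pandas int64 cast.
    """
    result = []
    current = start
    while current <= end:
        # Attach UTC explicitly so the result does not depend on the local time zone
        result.append(int(current.replace(tzinfo=dt.timezone.utc).timestamp()))
        current += step
    return result


timestamps = epoch_seconds_range(
    dt.datetime(2019, 8, 15, 20, 30, 0),
    dt.datetime(2019, 8, 16, 15, 43, 0),
    dt.timedelta(minutes=15),
)
print(len(timestamps), timestamps[0], timestamps[-1])
```

The range holds 77 values, the last one falling at 2019-08-16 15:30:00 because 15:45 lies past the end bound. The 20 rows in the printed table are therefore only the default `show()` preview, not the whole range.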