- >>> df['profile']
- date
- 2015-01-01 00:00:00 3.000000
- 2015-01-01 01:00:00 3.000143
- 2015-01-01 02:00:00 3.000287
- 2015-01-01 03:00:00 3.000430
- 2015-01-01 04:00:00 3.000574
- ...
- 2015-12-31 20:00:00 2.999426
- 2015-12-31 21:00:00 2.999570
- 2015-12-31 22:00:00 2.999713
- 2015-12-31 23:00:00 2.999857
- Freq: H, Name: profile, Length: 8760
- ### Deviation on monthly basis
- >>> dev_monthly = np.random.uniform(0.5, 1.5, len(df['profile'].groupby(df.index.month).aggregate(np.sum)))
- >>> df['profile_monthly'] = (df['profile'].groupby(df.index.month).aggregate(np.sum) * dev_monthly).reindex(df)
- >>> df['profile_monthly']
- date
- 2015-01-01 00:00:00 NaN
- 2015-01-01 01:00:00 NaN
- 2015-01-01 02:00:00 NaN
- ...
- 2015-12-31 22:00:00 NaN
- 2015-12-31 23:00:00 NaN
- Freq: H, Name: profile_monthly, Length: 8760
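The all-NaN column above is probably an alignment effect: the grouped sums are indexed by month number (1 to 12), while `df` is indexed by hourly timestamps, so no label matches. One possible way around this, sketched here under assumptions, is to look up each row's month in the 12-entry series. The flat `profile` data, the seeded generator `rng`, and the name `scaled` are illustrative stand-ins and do not come from the paste.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the paste's df: one value per hour of 2015.
idx = pd.date_range("2015-01-01", periods=8760, freq="60min")
df = pd.DataFrame({"profile": np.full(len(idx), 3.0)}, index=idx)
df.index.name = "date"

# Monthly sums, indexed by month number 1..12.
monthly_sum = df["profile"].groupby(df.index.month).sum()

# One random deviation factor per month (seeded only for reproducibility).
rng = np.random.default_rng(0)
dev_monthly = rng.uniform(0.5, 1.5, len(monthly_sum))
scaled = monthly_sum * dev_monthly

# Look up every row's month in the 12-entry series, then assign by position
# so that no timestamp alignment takes place.
df["profile_monthly"] = scaled.reindex(df.index.month).to_numpy()
```

Every hour of a given month then carries that month's scaled sum, and no row is left as NaN.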
- In [105]: df = DataFrame({'profile': normal(3, 0.1, size=10000)}, pd.date_range(start='2015-01-01', freq='H', periods=10000))
- In [106]: df['profile_monthly'] = df.profile.resample('M', how='sum')
- In [107]: df
- Out[107]:
- profile profile_monthly
- 2015-01-01 00:00:00 2.8328 NaN
- 2015-01-01 01:00:00 3.0607 NaN
- 2015-01-01 02:00:00 3.0138 NaN
- 2015-01-01 03:00:00 3.0402 NaN
- 2015-01-01 04:00:00 3.0335 NaN
- 2015-01-01 05:00:00 3.0087 NaN
- 2015-01-01 06:00:00 3.0557 NaN
- 2015-01-01 07:00:00 2.9280 NaN
- 2015-01-01 08:00:00 3.1359 NaN
- 2015-01-01 09:00:00 2.9681 NaN
- 2015-01-01 10:00:00 3.1240 NaN
- 2015-01-01 11:00:00 3.0635 NaN
- 2015-01-01 12:00:00 2.9206 NaN
- 2015-01-01 13:00:00 3.0714 NaN
- 2015-01-01 14:00:00 3.0688 NaN
- 2015-01-01 15:00:00 3.0703 NaN
- 2015-01-01 16:00:00 2.9102 NaN
- 2015-01-01 17:00:00 2.9368 NaN
- 2015-01-01 18:00:00 3.0864 NaN
- 2015-01-01 19:00:00 3.2124 NaN
- 2015-01-01 20:00:00 2.8988 NaN
- 2015-01-01 21:00:00 3.0659 NaN
- 2015-01-01 22:00:00 2.7973 NaN
- 2015-01-01 23:00:00 3.0824 NaN
- 2015-01-02 00:00:00 3.0199 NaN
- ... ...
- [10000 rows x 2 columns]
- In [108]: df.dropna()
- Out[108]:
- profile profile_monthly
- 2015-01-31 2.9769 2230.9931
- 2015-02-28 2.9930 2016.1045
- 2015-03-31 2.7817 2232.4096
- 2015-04-30 3.1695 2158.7834
- 2015-05-31 2.9040 2236.5962
- 2015-06-30 2.8697 2162.7784
- 2015-07-31 2.9278 2231.7232
- 2015-08-31 2.8289 2236.4603
- 2015-09-30 3.0368 2163.5916
- 2015-10-31 3.1517 2233.2285
- 2015-11-30 3.0450 2158.6998
- 2015-12-31 2.8261 2228.5550
- 2016-01-31 3.0264 2229.2221
- [13 rows x 2 columns]
- In [110]: df.fillna(method='bfill')
- Out[110]:
- profile profile_monthly
- 2015-01-01 00:00:00 2.8328 2230.9931
- 2015-01-01 01:00:00 3.0607 2230.9931
- 2015-01-01 02:00:00 3.0138 2230.9931
- 2015-01-01 03:00:00 3.0402 2230.9931
- 2015-01-01 04:00:00 3.0335 2230.9931
- 2015-01-01 05:00:00 3.0087 2230.9931
- 2015-01-01 06:00:00 3.0557 2230.9931
- 2015-01-01 07:00:00 2.9280 2230.9931
- 2015-01-01 08:00:00 3.1359 2230.9931
- 2015-01-01 09:00:00 2.9681 2230.9931
- 2015-01-01 10:00:00 3.1240 2230.9931
- 2015-01-01 11:00:00 3.0635 2230.9931
- 2015-01-01 12:00:00 2.9206 2230.9931
- 2015-01-01 13:00:00 3.0714 2230.9931
- 2015-01-01 14:00:00 3.0688 2230.9931
- 2015-01-01 15:00:00 3.0703 2230.9931
- 2015-01-01 16:00:00 2.9102 2230.9931
- 2015-01-01 17:00:00 2.9368 2230.9931
- 2015-01-01 18:00:00 3.0864 2230.9931
- 2015-01-01 19:00:00 3.2124 2230.9931
- 2015-01-01 20:00:00 2.8988 2230.9931
- 2015-01-01 21:00:00 3.0659 2230.9931
- 2015-01-01 22:00:00 2.7973 2230.9931
- 2015-01-01 23:00:00 3.0824 2230.9931
- 2015-01-02 00:00:00 3.0199 2230.9931
- ... ...
- [10000 rows x 2 columns]
- >>> df.fillna(method='bfill')[np.logical_and(df.index.month==12, df.index.day==31)]
- profile profile_monthly
- 2015-12-31 00:00:00 2.926504 2232.288997
- 2015-12-31 01:00:00 3.008543 2234.470731
- 2015-12-31 02:00:00 2.930133 2234.470731
- 2015-12-31 03:00:00 3.078552 2234.470731
- 2015-12-31 04:00:00 3.141578 2234.470731
- 2015-12-31 05:00:00 3.061820 2234.470731
- 2015-12-31 06:00:00 2.981626 2234.470731
- 2015-12-31 07:00:00 3.010749 2234.470731
- 2015-12-31 08:00:00 2.878577 2234.470731
- 2015-12-31 09:00:00 2.915487 2234.470731
- 2015-12-31 10:00:00 3.072721 2234.470731
- 2015-12-31 11:00:00 3.087866 2234.470731
- 2015-12-31 12:00:00 3.089208 2234.470731
- 2015-12-31 13:00:00 2.957047 2234.470731
- 2015-12-31 14:00:00 3.002072 2234.470731
- 2015-12-31 15:00:00 3.106656 2234.470731
- 2015-12-31 16:00:00 3.100891 2234.470731
- 2015-12-31 17:00:00 3.077835 2234.470731
- 2015-12-31 18:00:00 3.032497 2234.470731
- 2015-12-31 19:00:00 2.959838 2234.470731
- 2015-12-31 20:00:00 2.878819 2234.470731
- 2015-12-31 21:00:00 3.041171 2234.470731
- 2015-12-31 22:00:00 3.061970 2234.470731
- 2015-12-31 23:00:00 3.019011 2234.470731
- [24 rows x 2 columns]
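In the output above, the 00:00 row of 2015-12-31 carries a different monthly value from the remaining 23 hours. This appears to happen because the monthly resample labels each sum at midnight of the month's last day, so backfilling gives every later hour of that day the following month's sum. A minimal sketch of one alternative that avoids the edge effect is a grouped `transform`, which broadcasts each month's sum to all of its rows. The random data and the seeded generator `rng` are assumptions made only for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the paste's: 10000 hourly values from 2015-01-01.
idx = pd.date_range("2015-01-01", periods=10000, freq="60min")
rng = np.random.default_rng(0)
df = pd.DataFrame({"profile": rng.normal(3, 0.1, size=len(idx))}, index=idx)

# transform("sum") returns one value per original row, so no resampling,
# reindexing, or backfilling is needed.
keys = [df.index.year, df.index.month]
df["profile_monthly"] = df["profile"].groupby(keys).transform("sum")

# All 24 hours of the last day of December now share the December sum.
last_day = df[(df.index.month == 12) & (df.index.day == 31)]
```

Grouping by both year and month keeps January 2015 and January 2016 apart, which matters here because the series runs past the end of 2015.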
- >>> AA = df.groupby([df.index.year, df.index.month]).mean()
- >>> AA['dev'] = np.random.randn(len(AA))
- >>> df['dev'] = AA['dev'].reindex(pd.MultiIndex.from_arrays([df.index.year, df.index.month])).values