- import numpy as np
- import pandas as pd
- s = pd.Series([.4, .5, .6], list('abc'))
- s
- a 0.4
- b 0.5
- c 0.6
- dtype: float64
- pd.Series(np.ones_like(s.values), s.index, name=s.name)
- a 1.0
- b 1.0
- c 1.0
- dtype: float64
- np.random.seed(42)
- df = pd.DataFrame(np.random.randn(10**6,), columns=['A'])
- # Replace a random half of the rows with NaN
- df.loc[df.sample(frac=0.5).index] = np.nan
- df.shape
- # (1000000, 1)
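- def fill_ones_with_modify():
-     # A shallow copy shares its data with df, so fill(1) also overwrites df['A']
-     # (pandas without Copy-on-Write); use copy() to leave the original DF untouched
-     ser = df['A'].copy(deep=False)
-     ser.values.fill(1)
-     return ser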
- %timeit fill_ones_with_modify()
- 1000 loops, best of 3: 837 µs per loop
- def fill_ones_without_modify():
-     ser = df[['A']].copy(deep=False).squeeze()
-     ser.values.fill(1)
-     return ser
- %timeit fill_ones_without_modify()
- 100 loops, best of 3: 6.4 ms per loop
- >>> a = pd.Series(np.array([0,np.nan,2,3,4]), list('abcde'))
- >>> a
- a 0.0
- b NaN
- c 2.0
- d 3.0
- e 4.0
- dtype: float64
- >>> (a/a).fillna(1)
- a 1.0
- b 1.0
- c 1.0
- d 1.0
- e 1.0
- dtype: float64
- pd.Series(1, s.index, name=s.name)
- s = pd.Series(5, range(int(1e6)))
- %timeit pd.Series(1, s.index, name=s.name)
- %timeit pd.Series(np.ones(s.shape), s.index, name=s.name)
- # Assumes a variant of fill_ones_with_modify that takes the Series as its argument
- %timeit fill_ones_with_modify(s)
- %timeit s.div(s).fillna(1)
- 413 µs ± 2.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
- 375 µs ± 2.04 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
- 369 µs ± 975 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
- 3.47 ms ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
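The snippets above build a Series of ones in three ways that do not mutate anything: from a scalar, from `np.ones_like`, and by dividing the Series by itself. The following is a minimal sketch, not part of the original paste, that runs all three on one small input to show they agree on values, index, and name but may differ in dtype. The sample Series and the variable names `by_scalar`, `by_ones_like`, and `by_division` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample input: includes NaN and 0.0, the cases where s / s yields NaN
s = pd.Series([0.4, np.nan, 0.0], index=list("abc"), name="x")

# Scalar broadcast: the dtype comes from the scalar (an integer here)
by_scalar = pd.Series(1, index=s.index, name=s.name)

# np.ones_like: the dtype follows the original values (float64 here)
by_ones_like = pd.Series(np.ones_like(s.values), index=s.index, name=s.name)

# Division: NaN / NaN and 0 / 0 both give NaN, which fillna(1) replaces
by_division = s.div(s).fillna(1)

results = {"scalar": by_scalar, "ones_like": by_ones_like, "division": by_division}
for label, result in results.items():
    print(label, result.dtype, result.tolist())
```

If the result has to keep the dtype of the original Series, the `np.ones_like` form does that directly, whereas the scalar form would need an explicit `dtype=` argument.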