Question:

I'm trying to replace the NaN values in my DataFrame.

I would like to replace 60% of the NaNs with one value and 40% with another.

I read the documentation for the fillna method but couldn't find a way to do this.

Any ideas?

Thanks

Answer:

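Create a boolean array the same size as the df you're filling, where each entry is True with probability 0.6 and False with probability 0.4. Build a fill DataFrame from it, then use combine_first, which fills the NaN cells of df from the matching cells of the fill DataFrame. Note that the split is 60/40 on average, not exactly.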

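import pandas as pd
import numpy as np

# example frame in which every cell is NaN
df = pd.DataFrame(index=list('ABCDEFGHIJ'), columns=list('abcdefghij'))

# rand() is uniform on [0, 1), so "< 0.6" is True with probability 0.6
mask60 = np.random.rand(*df.shape) < 0.6

# value60 goes to about 60% of the cells, value40 to the rest
value60, value40 = 10, 20

# value60 where mask60 is True, value40 elsewhere
# (equivalent to value60 * mask60 + value40 * (1 - mask60))
fill = value40 + mask60 * (value60 - value40)

fill_df = pd.DataFrame(fill, index=df.index, columns=df.columns)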

The result looks like this (the fill is random, so the exact pattern varies between runs):

print(df.combine_first(fill_df))

    a   b   c   d   e   f   g   h   i   j
A  10  10  20  20  10  10  10  10  10  20
B  10  10  10  10  10  20  20  10  10  10
C  20  10  10  10  10  10  10  20  20  20
D  10  10  10  20  10  10  20  10  10  10
E  20  20  10  10  20  10  10  10  20  10
F  10  20  10  10  20  10  20  10  10  20
G  20  20  10  10  10  10  10  20  20  10
H  10  10  20  20  10  10  10  10  10  10
I  10  10  10  20  20  10  10  10  10  20
J  10  10  10  20  10  10  20  10  10  10
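
The random mask above gives a 60/40 split only on average. If exactly 60% of the NaN cells should receive one value, one option is to shuffle the NaN positions and split them by count. The following is a minimal sketch, not part of the original answer: the helper name fill_nan_split and its parameters are made up for illustration, and it assumes every column is numeric so the frame can be converted to a float array.

```python
import numpy as np
import pandas as pd


def fill_nan_split(df, value_a, value_b, frac_a=0.6, seed=None):
    """Fill round(frac_a * n_nan) randomly chosen NaN cells with value_a
    and the remaining NaN cells with value_b.

    Hypothetical helper; assumes all columns are numeric.
    """
    rng = np.random.default_rng(seed)
    values = df.to_numpy(dtype=float, copy=True)

    # row/column positions of every NaN cell
    nan_rows, nan_cols = np.nonzero(np.isnan(values))
    n_nan = len(nan_rows)
    n_a = int(round(frac_a * n_nan))

    # shuffle the NaN positions, then split them by count
    order = rng.permutation(n_nan)
    a_idx, b_idx = order[:n_a], order[n_a:]
    values[nan_rows[a_idx], nan_cols[a_idx]] = value_a
    values[nan_rows[b_idx], nan_cols[b_idx]] = value_b

    return pd.DataFrame(values, index=df.index, columns=df.columns)


# small example frame with 10 NaN cells and 5 real values
example = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, np.nan, 5.0],
    "b": [np.nan, np.nan, 3.0, np.nan, np.nan],
    "c": [np.nan, 2.0, np.nan, 4.0, np.nan],
})

filled = fill_nan_split(example, 10, 20, frac_a=0.6, seed=0)
print(filled)
```

Here 6 of the 10 NaN cells get 10 and the other 4 get 20, while the existing values are left alone. Which cells get which value depends on the seed.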

Answer:

You can do it this way:

# rows_60 and rows_40 are placeholders for your own boolean row masks
df.loc[rows_60] = df.loc[rows_60].fillna(10)
df.loc[rows_40] = df.loc[rows_40].fillna(20)
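
A runnable sketch of this row-based approach might look like the following. The frame, the mask names rows_60 and rows_40, and the choice of picking rows at random are all assumptions made for illustration; the answer leaves the conditions up to you. Note that this splits by row, so the share of NaN cells that receive each value depends on how the NaNs are distributed across rows.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# small example frame with some NaN cells (made up for illustration)
frame = pd.DataFrame({
    "x": [np.nan, 1.0, np.nan, np.nan, 4.0],
    "y": [0.0, np.nan, np.nan, 3.0, np.nan],
})

# pick 3 of the 5 rows (60%) at random for the first fill value
chosen = rng.permutation(len(frame))[:3]
rows_60 = pd.Series(False, index=frame.index)
rows_60.iloc[chosen] = True
rows_40 = ~rows_60

# fill NaNs with 10 in the chosen rows and with 20 in the remaining rows
result = frame.copy()
result.loc[rows_60] = result.loc[rows_60].fillna(10)
result.loc[rows_40] = result.loc[rows_40].fillna(20)
print(result)
```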