Question:

I have a pandas DataFrame like this:

color  start       end
red    01/01/1980  31/12/1982
blue   01/01/1983  31/12/1988
blue   01/01/1989  31/12/1995
red    01/01/1996  31/12/1997
blue   01/01/1998  31/12/1999
red    01/01/2000  31/12/2004

How do I transform the date intervals into an index, keeping only the year? Like this:

1980 red
1981 red
1982 red
1983 blue
1984 blue
.
.

Answer:

Using set_index, reindex, and ffill (forward fill), you can first build a Series of colors indexed by the start year of each interval:

In [319]: dff = df.set_index(pd.to_datetime(df['start']).dt.year)['color']

In [320]: dff
Out[320]:
start
1980     red
1983    blue
1989    blue
1996     red
1998    blue
2000     red
Name: color, dtype: object

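Then reindex over the full range of years and forward-fill the missing values. Note that the range here ends at the last start year (2000), so the years 2001 to 2004 from the final interval are not included.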

In [321]: dff.reindex(range(dff.index.min(), dff.index.max()+1)).ffill()
Out[321]:
start
1980     red
1981     red
1982     red
1983    blue
1984    blue
1985    blue
1986    blue
1987    blue
1988    blue
1989    blue
1990    blue
1991    blue
1992    blue
1993    blue
1994    blue
1995    blue
1996     red
1997     red
1998    blue
1999    blue
2000     red
Name: color, dtype: object
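
The session above only looks at the start column, so the result stops at 2000. A minimal, self-contained sketch of the same set_index / reindex / ffill idea that runs through the last end year might look like this. The DataFrame is rebuilt from the question's sample, the explicit day/month/year format is an assumption about how the dates are stored, and the names start_year, end_year, by_start and by_year are made up for illustration. It also assumes the intervals are contiguous, since forward filling would otherwise carry a color into any gap between intervals.

```python
import pandas as pd

# Sample data copied from the question (dates assumed to be day/month/year).
df = pd.DataFrame({
    "color": ["red", "blue", "blue", "red", "blue", "red"],
    "start": ["01/01/1980", "01/01/1983", "01/01/1989",
              "01/01/1996", "01/01/1998", "01/01/2000"],
    "end": ["31/12/1982", "31/12/1988", "31/12/1995",
            "31/12/1997", "31/12/1999", "31/12/2004"],
})

# Parse with an explicit format so day and month cannot be swapped.
start_year = pd.to_datetime(df["start"], format="%d/%m/%Y").dt.year
end_year = pd.to_datetime(df["end"], format="%d/%m/%Y").dt.year

# One entry per interval, indexed by the interval's starting year.
by_start = df.set_index(start_year)["color"]

# Reindex through the last *end* year (not the last start year),
# then forward-fill the years that fall inside each interval.
years = range(start_year.min(), end_year.max() + 1)
by_year = by_start.reindex(years).ffill()
by_year.index.name = "year"

print(by_year.tail())
```

With the question's data, by_year has one entry for every year from 1980 through 2004.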
Answer:

Make sure your date columns are datetime objects (if they are not, you can convert them using pd.to_datetime(df['Date'])). After that it's simply:

df['Year'] = df['Date'].dt.year
df2 = df.set_index(['Year'])
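
Applied to the question's data, this might look like the sketch below, with 'start' standing in for the generic 'Date' column. The day/month/year format and the helper names (years and per_year) are assumptions made for illustration. Setting the index this way keeps one row per interval, so the second half shows one possible way to get one row per year instead: build the list of years each interval covers and use DataFrame.explode. That is a different technique from the one in the answer.

```python
import pandas as pd

# Sample data copied from the question (dates assumed to be day/month/year).
df = pd.DataFrame({
    "color": ["red", "blue", "blue", "red", "blue", "red"],
    "start": ["01/01/1980", "01/01/1983", "01/01/1989",
              "01/01/1996", "01/01/1998", "01/01/2000"],
    "end": ["31/12/1982", "31/12/1988", "31/12/1995",
            "31/12/1997", "31/12/1999", "31/12/2004"],
})

# Convert both date columns to datetime objects.
for col in ["start", "end"]:
    df[col] = pd.to_datetime(df[col], format="%d/%m/%Y")

# The answer's approach, using 'start' as the date column:
# the result still has one row per interval.
df["Year"] = df["start"].dt.year
df2 = df.set_index(["Year"])

# Alternative sketch: list the years covered by each interval,
# then explode the lists so that every year gets its own row.
years = [list(range(s, e + 1))
         for s, e in zip(df["start"].dt.year, df["end"].dt.year)]
per_year = (
    df.assign(year=years)
      .explode("year")
      .astype({"year": "int64"})
      .set_index("year")["color"]
)

print(per_year.head())
```

Unlike forward filling, the explode variant only produces years that actually fall inside an interval, so it does not depend on the intervals being contiguous.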