Problem description:

I came across a strange result while playing around with pandas, and I am not sure why it behaves this way. I am wondering whether it is a bug.

`cf = pd.DataFrame({'sc': ['b', 'b', 'c', 'd'], 'nn': [1, 2, 3, 4], 'mvl': [10, 20, 30, 40]})`

`df = cf.groupby('sc').mean()`

`df.loc['b', 'mvl']`

This gives `15.0` as the result.

`cf1 = cf`

`cf1['sc'] = cf1['sc'].astype('category', categories=['b', 'c', 'd'], ordered=True)`

`df1 = cf1.groupby('sc').mean()`

`df1.loc['b', 'mvl']`

This gives a Series as the result:

`sc`

`b    15.0`

`Name: mvl, dtype: float64`

`type(df1.loc['b','mvl'])`

-> `pandas.core.series.Series`

`type(df.loc['b','mvl'])`

-> `numpy.float64`

Why would declaring the variable as categorical change the output of `loc` from a scalar to a Series?

I hope it is not a stupid question. Thanks!

This may be a pandas bug. The difference arises because grouping on a categorical variable gives the result a categorical index (a `CategoricalIndex`) rather than a plain one. You can see it more simply without any groupby:

```
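import pandas

nocat = pandas.Series(['a', 'b', 'c'])
# Old-style call, as in the question; newer pandas expects a CategoricalDtype here.
cat = nocat.astype('category', categories=['a', 'b', 'c'], ordered=True)
xno = pandas.Series([8, 88, 888], index=nocat)   # plain (object) index
xcat = pandas.Series([8, 88, 888], index=cat)    # categorical index

>>> xno.loc['a']
8
>>> xcat.loc['a']
a    8
dtype: int64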
```
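To confirm that the grouped result really does carry a categorical index, you can inspect the index type directly. This is only a sketch on the question's data, and it assumes a newer pandas in which `pd.CategoricalDtype` replaces the older `astype('category', categories=...)` call. The name `cat_type`, the `.copy()`, and the explicit `observed=False` are my additions, not part of the original code.

```python
import pandas as pd

cf = pd.DataFrame({'sc': ['b', 'b', 'c', 'd'],
                   'nn': [1, 2, 3, 4],
                   'mvl': [10, 20, 30, 40]})

# Plain string key: the grouped result gets an ordinary Index.
df = cf.groupby('sc').mean()

# Categorical key, built on a copy so that `cf` itself is left unchanged.
cf1 = cf.copy()
cat_type = pd.CategoricalDtype(categories=['b', 'c', 'd'], ordered=True)  # assumed newer API
cf1['sc'] = cf1['sc'].astype(cat_type)
df1 = cf1.groupby('sc', observed=False).mean()

# Compare the index classes of the two grouped frames.
print(type(df.index).__name__)
print(type(df1.index).__name__)
```

The second line printed should name `CategoricalIndex`, which is the only structural difference between `df` and `df1`.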

The docs note that indexing operations on a CategoricalIndex preserve the categorical index. It appears they even do this if you get only one result, which doesn't exactly contradict the docs but seems like undesirable behavior.
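As a rough illustration of the "preserve the categorical index" rule in the case where it is clearly intended, selecting several labels keeps both the index type and the full set of categories. This sketch builds the index with `pd.CategoricalIndex` directly, which is my choice for brevity rather than the construction used above.

```python
import pandas as pd

idx = pd.CategoricalIndex(['a', 'b', 'c'],
                          categories=['a', 'b', 'c'], ordered=True)
xcat = pd.Series([8, 88, 888], index=idx)

# Selecting a list of labels returns a Series whose index is still categorical.
subset = xcat.loc[['a', 'b']]

print(type(subset.index).__name__)
print(list(subset.index.categories))
```

The surprising part in the question is only that the same rule was also applied to a single-label lookup.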

There is a related pull request that seems to fix this behavior, but it was only recently merged. It looks like the fix should be in pandas 0.18.1.
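If your code has to run on versions both with and without the fix, one option is to normalize the lookup yourself. The helper below is hypothetical (`loc_scalar` is a name I made up, not a pandas function) and is only a minimal sketch: it unwraps a one-element Series and otherwise passes the value through.

```python
import pandas as pd

def loc_scalar(frame, row, col):
    """Hypothetical helper: look up one cell and always return a scalar."""
    value = frame.loc[row, col]
    if isinstance(value, pd.Series):
        # Some pandas versions wrap a single match in a one-element Series.
        if len(value) != 1:
            raise ValueError(f"expected one match for {row!r}, got {len(value)}")
        return value.iloc[0]
    return value

cf = pd.DataFrame({'sc': ['b', 'b', 'c', 'd'], 'mvl': [10, 20, 30, 40]})
cf['sc'] = cf['sc'].astype('category')
df1 = cf.groupby('sc', observed=False).mean()

print(pd.__version__)
print(loc_scalar(df1, 'b', 'mvl'))
```

The helper deliberately raises when a label matches several rows, so a genuine multi-row result is never silently collapsed to its first element.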