Problem description:

This is related to a question I asked a long time ago.

I would like to count the nan values in a list, where each one is a real nan rather than the string "nan", for example:

b = [1.0, nan, nan, 3.5 ...]

From this list, I would like to get the length of each run of consecutive nan values. In the case above, the result would be 2.

My code was:

 v = [len(list(group)) for key, group in groupby(b) if key== np.isnan(key)]

In this case v turns out to be empty.

When I changed the code to:

 v = [len(list(group)) for key, group in groupby(b) if key== np.isnan(b)]

I get the error ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

Any ideas or help would be greatly appreciated.

Isaac

Answer:
v = [len(list(group)) for key, group in groupby(b) if key== np.isnan(key)]

You're comparing key, which is an element of the list, to np.isnan(key), which is a boolean. Moreover, since nan != nan, this might not even group neighbouring nans together.
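
To see why that filter never passes, it can help to evaluate the condition one element at a time. The following is a minimal sketch that uses the standard library's math.isnan as a stand-in for np.isnan (an assumption made only to keep the example free of dependencies); the list b is the example from the question without its elided tail.

```python
import math

# The question's example list, without the elided tail.
b = [1.0, math.nan, math.nan, 3.5]

# For each element, record (is it nan?, does `x == isnan(x)` hold?).
checks = [(math.isnan(x), x == math.isnan(x)) for x in b]
print(checks)
# [(False, False), (True, False), (True, False), (False, False)]

# The only float that passes this filter is 0.0, because 0.0 == False.
print(0.0 == math.isnan(0.0))  # True
```

So the condition is false for every element here, nan or not, which matches the empty result.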

v = [len(list(group)) for key, group in groupby(b) if key== np.isnan(b)]

Now you're comparing key, which is an element of the list, to an entire boolean numpy array. That isn't what you want to do either, and numpy is quite reasonably telling you that there's no canonical way for it to know what you want bool(key == np.isnan(b)) to do, so it can't figure out whether or not to take the if.
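
The error can be reproduced in isolation. This is a small sketch, assuming numpy is installed; the array b is the question's example without its elided tail, and the names comparison and message are only illustrative.

```python
import numpy as np

b = np.array([1.0, np.nan, np.nan, 3.5])

# Comparing one element with a whole boolean array broadcasts elementwise,
# so the result is an array with one entry per element of b.
comparison = b[0] == np.isnan(b)
print(comparison.shape)  # (4,)

# Using that array as an `if` condition forces bool(array), which raises.
try:
    if comparison:
        pass
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```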

Maybe something like

>>> import numpy as np
>>> from itertools import groupby
>>> b = np.array([1, np.nan, np.nan, 2, 3, np.nan, 4])
>>> v = [len(list(group)) for key, group in groupby(b, key=np.isnan) if key]
>>> v
[2, 1]

would work. The individual parts will look something like

>>> vv = [(key, list(group)) for key, group in groupby(b, key=np.isnan)]
>>> vv
[(False, [1.0]), (True, [nan, nan]), (False, [2.0, 3.0]), (True, [nan]), (False, [4.0])]

(With a bit more thought you could probably get a vectorized numpy approach to work too, but let's start with the tools you're familiar with.)


As @user2357112 notes in the comments, since we only care about the length of the nan clusters, we can optimize this by doing the isnan check all in one go:

>>> b
array([  1.,  nan,  nan,   2.,   3.,  nan,   4.])
>>> np.isnan(b)
array([False,  True,  True, False, False,  True, False], dtype=bool)
>>> [len(list(g)) for k,g in groupby(np.isnan(b)) if k]
[2, 1]
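
For completeness, here is one possible shape for the vectorized numpy approach mentioned earlier. It is only a sketch, not necessarily what the answer had in mind: the function name nan_run_lengths is hypothetical, and the idea is to pad the isnan mask with False on both sides and look for rising and falling edges with np.diff.

```python
import numpy as np

def nan_run_lengths(values):
    """Return the lengths of consecutive nan runs (hypothetical helper)."""
    mask = np.isnan(np.asarray(values, dtype=float))
    # Pad with False on both sides so every run has a start and an end edge.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)   # positions where a nan run begins
    ends = np.flatnonzero(edges == -1)    # positions just past where it ends
    return (ends - starts).tolist()

b = np.array([1, np.nan, np.nan, 2, 3, np.nan, 4])
print(nan_run_lengths(b))  # [2, 1]
```

This avoids the Python-level loop in groupby, which may matter for large arrays; for short lists the groupby version is probably easier to read.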

Answer:

You can do something like the following:

>>> from numpy import nan
>>> from itertools import groupby
>>> x = [1.0, nan, nan, 3.5, nan, nan, nan]
>>> [len(list(group)) for key, group in groupby(x) if key is nan]
[2, 3]

This uses groupby to group consecutive equal elements and keeps the length of each group whose key is the nan object.
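
One caveat: the identity test only matches the exact nan object it is compared against. A nan that arrives as a different object (for example from float("nan"), or, as far as I can tell, from iterating over a numpy array) would be skipped. The sketch below uses math.nan and two hypothetical helper names to contrast the identity test with an isnan-based key.

```python
import math
from itertools import groupby

def runs_by_identity(values, sentinel):
    # Keeps only groups whose key is the very same object as `sentinel`.
    return [len(list(g)) for k, g in groupby(values) if k is sentinel]

def runs_by_isnan(values):
    # Groups on the boolean isnan result, so any nan object counts.
    return [len(list(g)) for k, g in groupby(values, key=math.isnan) if k]

# Two separately created nan objects, neither of which is math.nan.
distinct = [1.0, float("nan"), float("nan"), 3.5]

print(runs_by_identity(distinct, math.nan))  # []
print(runs_by_isnan(distinct))               # [2]
```

Grouping on the isnan result does not depend on which nan object is stored in the list, so it is likely the safer choice when the data comes from mixed sources.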
