This is related to a question I asked a long time ago.

I would like to count the nan values in a list. They are not the string `'nan'` but real nan values, such as:

``b = [1.0, nan, nan, 3.5 ...]``

From this list, I would like to count the length of each run of consecutive nan values. In the case above the number would be 2.

My code was:

`` v = [len(list(group)) for key, group in groupby(b) if key== np.isnan(key)]``

In this case `v` turns out to be empty.

When I changed the code to:

`` v = [len(list(group)) for key, group in groupby(b) if key== np.isnan(b)]``

I get the error `ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()`.

I would really appreciate any ideas or help.

Isaac

``````v = [len(list(group)) for key, group in groupby(b) if key== np.isnan(key)]
``````

You're comparing `key`, which is an element of the list, to `np.isnan(key)`, which is a boolean. Moreover, since `nan != nan`, this might not even group neighbouring nans together.

``````v = [len(list(group)) for key, group in groupby(b) if key== np.isnan(b)]
``````

Now you're comparing `key`, which is an element of the list, to an entire boolean numpy array. That isn't what you want to do either, and `numpy` is quite reasonably telling you that there's no canonical way for it to know what you want `bool(key == np.isnan(b))` to do, so it can't figure out whether or not to take the `if`.
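To see both failures in isolation, here is a small sketch. The list `b` and the variable names are illustrative stand-ins for your data, not taken from your actual code:

```python
import numpy as np

b = [1.0, np.nan, np.nan, 3.5]   # illustrative data
key = b[1]                       # one element of the list (a nan)

# First attempt: a float compared to a boolean.
# np.isnan(key) is True, and nan == True is False, so the filter rejects it.
first_condition = key == np.isnan(key)
print(first_condition)

# Second attempt: a float compared to a whole boolean array.
mask = np.isnan(b)               # one boolean per element of b
elementwise = key == mask        # still an array, not a single bool

raised = False
try:
    bool(elementwise)            # this is what `if` has to do
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

The `if` clause needs a single truth value, and an array with several elements does not have one.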

Maybe something like

``````>>> b = np.array([1, np.nan, np.nan, 2, 3, np.nan, 4])
>>> v = [len(list(group)) for key, group in groupby(b, key=np.isnan) if key]
>>> v
[2, 1]
``````

would work. The individual parts will look something like

``````>>> vv = [(key, list(group)) for key, group in groupby(b, key=np.isnan)]
>>> vv
[(False, [1.0]), (True, [nan, nan]), (False, [2.0, 3.0]), (True, [nan]), (False, [4.0])]
``````

(With a bit more thought you could probably get a vectorized numpy approach to work too, but let's start with the tools you're familiar with.)

As @user2357112 notes in the comments, since we only care about the length of the nan clusters, we can optimize this by doing the `isnan` check all in one go:

``````>>> b
array([  1.,  nan,  nan,   2.,   3.,  nan,   4.])
>>> np.isnan(b)
array([False,  True,  True, False, False,  True, False], dtype=bool)
>>> [len(list(g)) for k,g in groupby(np.isnan(b)) if k]
[2, 1]
``````
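For completeness, the vectorized numpy approach mentioned above might look roughly like the following. This is only a sketch under my own assumptions: the helper name `nan_run_lengths` is made up for illustration, and it assumes the input can be converted to a one-dimensional float array.

```python
import numpy as np

def nan_run_lengths(values):
    """Return the lengths of consecutive nan runs (hypothetical helper)."""
    mask = np.isnan(np.asarray(values, dtype=float))
    # Pad with False on both sides so runs touching either end are closed.
    padded = np.concatenate(([False], mask, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)    # False -> True transitions
    stops = np.flatnonzero(edges == -1)    # True -> False transitions
    return (stops - starts).tolist()

b = np.array([1, np.nan, np.nan, 2, 3, np.nan, 4])
print(nan_run_lengths(b))  # [2, 1]
```

This avoids the Python-level loop over groups, which may matter for large arrays; for short lists the `groupby` version is probably easier to read.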

You can do something like the following:

``````>>> from numpy import nan
>>> from itertools import groupby
>>> x = [1.0, nan, nan, 3.5, nan, nan, nan]
>>> [item[1] for item in [(c,len(list(cgen))) for c,cgen in groupby(x)] if item[0] is nan]
[2, 3]
>>>
``````

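This uses `groupby` to split the list into runs of consecutive equal items, records the length of each run, and then keeps only the runs whose key is `nan`.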
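One caveat, as far as I can tell: the `is nan` test is an identity check, so it only matches when every nan in the list is the very same object. Here is a sketch of what happens otherwise, with `math.nan` standing in for `numpy.nan` and the names `fragile` and `robust` chosen just for illustration:

```python
from itertools import groupby
import math

# Each float("nan") call creates a separate object, unlike a shared constant.
x = [1.0, float("nan"), float("nan"), 3.5, float("nan")]

# Identity-based filter: none of these nans *is* math.nan, so nothing matches.
fragile = [len(list(g)) for k, g in groupby(x) if k is math.nan]
print(fragile)  # []

# Grouping on math.isnan does not depend on object identity.
robust = [len(list(g)) for is_nan, g in groupby(x, key=math.isnan) if is_nan]
print(robust)   # [2, 1]
```

If the nans come from a computation or from parsing rather than from a literal `nan` constant, grouping with an `isnan` key function is likely the safer choice.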
