
Problem :

I am receiving the following error:
The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().


2 Answers

0 votes

Solution :

As far as I know, pandas uses the bitwise operators '&' and '|' for element-wise conditions, and each condition should be wrapped in parentheses '()'.

For example, the following works:

data_query = data[(data['year'] >= 2005) & (data['year'] <= 2010)]

But the same query without the inner parentheses will not work, because & binds more tightly than the comparison operators:

data_query = data[(data['year'] >= 2005 & data['year'] <= 2010)]
So try the steps above to resolve your issue.
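
To make the precedence point concrete, here is a minimal sketch. The sample DataFrame is made up for illustration; only the two query expressions come from the answer above.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
data = pd.DataFrame({"year": [2003, 2005, 2008, 2010, 2012]})

# Each comparison is parenthesised, so & combines two boolean Series.
data_query = data[(data["year"] >= 2005) & (data["year"] <= 2010)]
years_kept = data_query["year"].tolist()

# Without the inner parentheses, & binds tighter than >= and <=, so Python
# evaluates `2005 & data["year"]` first and is then left with a chained
# comparison, which needs the truth value of a Series and raises ValueError.
try:
    data[(data["year"] >= 2005 & data["year"] <= 2010)]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(years_kept)
print(error_message)
```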
Comment :
Even after applying this solution, my issue is not resolved. This is my function:
def _transform_to_long_format(df):
    res = []
    dates = pd.to_datetime(df['data_date'])
    filt = (dates <= '2000-01-01') | (dates >= current_Date)
    for col in INDEX_COLS:
        # date values are in same order that rows in the dataframe - enumeration provides correct row indexes
        for i,d in enumerate(dates.values):
            if dates.loc[filt]:
                raise ValueError ("Date {} are not correct".format(k))
            else:
                res.append([d.split(" ")[0], col, df.loc[i,col]])
    return pd.DataFrame(res, columns=["data_date", "index_code", "value"])


It does not enter the if statement where I apply my filter condition; when the condition is true it should simply raise the error.
Any suggestions?
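
One possible way to restructure the function above is sketched below. The likely cause is that `if dates.loc[filt]:` asks for the truth value of a whole Series, so the sketch reduces the mask with .any() before testing it. This is an assumption about the intent, not a confirmed fix: the column names in INDEX_COLS, the sample data and the current_date argument are all hypothetical, and the date check is moved out of the loop because it does not depend on the row.

```python
import pandas as pd

INDEX_COLS = ["idx_a", "idx_b"]  # hypothetical column names


def transform_to_long_format(df, current_date):
    dates = pd.to_datetime(df["data_date"])
    # Element-wise mask marking dates outside the accepted range.
    bad = (dates <= pd.Timestamp("2000-01-01")) | (dates >= current_date)
    # Reduce the mask to a single bool before using it in `if`.
    if bad.any():
        raise ValueError(
            "Dates {} are not correct".format(dates[bad].dt.date.tolist())
        )
    res = []
    for col in INDEX_COLS:
        # Pair each row label with its parsed date.
        for i, d in zip(df.index, dates):
            res.append([d.strftime("%Y-%m-%d"), col, df.loc[i, col]])
    return pd.DataFrame(res, columns=["data_date", "index_code", "value"])


# Hypothetical input, only to exercise the function.
sample = pd.DataFrame({
    "data_date": ["2010-01-05 00:00:00", "2011-02-06 00:00:00"],
    "idx_a": [1.0, 2.0],
    "idx_b": [3.0, 4.0],
})
long_df = transform_to_long_format(sample, pd.Timestamp("2020-01-01"))
print(long_df)
```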
0 votes

Solution:

The Python operators or and and require truth values. For pandas objects these are considered ambiguous, so you should use the "bitwise" | (or) or & (and) operations instead:

result = result[(result['var']>0.25) | (result['var']<-0.25)]
These are overloaded for these kinds of data structures to yield the element-wise or (or and).
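
A small sketch of the difference, using a made-up result frame (the values are hypothetical): the Python keyword or fails, while | combines the two masks element by element.

```python
import pandas as pd

# Hypothetical data, only for illustration.
result = pd.DataFrame({"var": [-0.5, -0.1, 0.0, 0.3, 0.9]})

high = result["var"] > 0.25
low = result["var"] < -0.25

# `or` needs a single truth value for `high`, which is ambiguous.
try:
    high or low
    or_raised = False
except ValueError:
    or_raised = True

# `|` is overloaded to work element-wise on the two boolean Series.
filtered = result[high | low]
print(or_raised)
print(filtered)
```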

Just to add some more explanation to this statement:

The exception is thrown when you want to get the bool of a pandas.Series:

>>> import pandas as pd
>>> x = pd.Series([1])
>>> bool(x)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What you hit was a place where the operator implicitly converted the operands to bool (you used or, but it also happens for and, if and while):

>>> x or x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> x and x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> if x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> while x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Besides these 4 statements, there are several Python functions that hide some bool calls (like any, all, filter, ...). These are normally not problematic with pandas.Series, but I wanted to mention them for completeness.
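
When a single True/False really is what you want in an if or while, the error message itself lists the options. A minimal sketch with made-up values, showing what each reduction returns:

```python
import pandas as pd

x = pd.Series([1, 0, 3])  # hypothetical values

has_any_truthy = bool(x.any())  # at least one element is truthy
all_truthy = bool(x.all())      # every element is truthy (0 is not)
is_empty = x.empty              # the Series has no elements

# item() only works on a Series with exactly one element.
single_value = pd.Series([1]).item()

if x.any():
    print("at least one non-zero value")
```

Which of these is correct depends on the question being asked of the data, so the choice has to be made explicitly.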

As user2357112 mentioned in the comments, you cannot use chained comparisons here. For elementwise comparison you need to use &. That also requires using parentheses so that & wouldn't take precedence.
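
A small sketch of the chained-comparison point on a single made-up column (the values are hypothetical): the chained form expands to an implicit and, so it raises, while the parenthesised & form gives an element-wise mask.

```python
import pandas as pd

# Hypothetical values, only for illustration.
heart_rate = pd.Series([45, 62, 85, 120], name="heart rate")

# `50 < heart_rate < 101` means `(50 < heart_rate) and (heart_rate < 101)`,
# so it needs the truth value of a Series and raises ValueError.
try:
    50 < heart_rate < 101
    chained_ok = True
except ValueError:
    chained_ok = False

# Two separate comparisons, each in parentheses, combined with &.
mask = (50 < heart_rate) & (101 > heart_rate)
print(chained_ok)
print(mask.tolist())
```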

It would go something like this:

mask = ((50  < df['heart rate']) & (101 > df['heart rate']) & (140 < df['systolic...

In order to avoid that, you can build series for lower and upper limits:

low_limit = pd.Series([90, 50, 95, 11, 140, 35], index=df.columns)
high_limit = pd.Series([160, 101, 100, 19, 160, 39], index=df.columns)

Now you can slice it as follows:

mask = ((df < high_limit) & (df > low_limit)).all(axis=1)
df[mask]
Out: 
     dyastolic blood pressure  heart rate  pulse oximetry  respiratory rate  \
17                        136          62              97                15   
69                        110          85              96                18   
72                        105          85              97                16   
161                       126          57              99                16   
286                       127          84              99                12   
435                        92          67              96                13   
499                       110          66              97                15   

     systolic blood pressure  temperature  
17                       141           37  
69                       155           38  
72                       154           36  
161                      153           36  
286                      156           37  
435                      155           36  
499                      149           36  
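
The same limit-Series idea on a tiny made-up frame, as a self-contained sketch. The two columns and all numbers here are hypothetical; the technique relies on the limit Series being indexed by df.columns so that each column is compared against its own bound.

```python
import pandas as pd

# Hypothetical two-column frame, only for illustration.
df = pd.DataFrame({"heart rate": [62, 45, 85], "temperature": [37, 36, 41]})

# One lower and one upper bound per column, aligned on the column labels.
low_limit = pd.Series([50, 35], index=df.columns)
high_limit = pd.Series([101, 39], index=df.columns)

# Compare every cell with its column's bounds, then require that all
# columns of a row pass.
mask = ((df < high_limit) & (df > low_limit)).all(axis=1)
selected = df[mask]
print(selected)
```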

And for assignment you can use np.where:

df['class'] = np.where(mask, 'excellent', 'critical')
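
A runnable sketch of the np.where assignment, again on made-up data (the column values and the single-column mask are hypothetical): rows where the mask is True get the first label, the rest get the second.

```python
import numpy as np
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame({"heart rate": [62, 45, 85]})
mask = (df["heart rate"] > 50) & (df["heart rate"] < 101)

# np.where picks 'excellent' where mask is True and 'critical' elsewhere.
df["class"] = np.where(mask, "excellent", "critical")
print(df)
```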

 
