Overview
- How to fix an error that often appears when filtering a pandas.DataFrame on multiple conditions
df[df['var1'] >= 0 and df['var2'] <= 0.5]
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Solution
- Wrap each condition in parentheses
- Join the conditions with the operators & and | instead of the keywords and/or
df[(df['var1'] >= 0) & (df['var2'] <= 0.5)]
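The two rules can be checked on a small reproducible frame. This is a minimal sketch: the column values are made up for illustration, and the names both and either are hypothetical.

```python
import pandas as pd

# Small fixed frame (illustrative values, not the article's random data)
df = pd.DataFrame({
    "var1": [0.6, -1.1, 0.7, -0.2],
    "var2": [-0.9, 0.0, 0.4, 2.1],
})

# AND: keep rows where both conditions hold
both = df[(df["var1"] >= 0) & (df["var2"] <= 0.5)]

# OR: keep rows where at least one condition holds
either = df[(df["var1"] >= 0) | (df["var2"] <= 0.5)]

print(both.index.tolist())    # [0, 2]
print(either.index.tolist())  # [0, 1, 2]
```

Each parenthesized comparison produces a boolean Series, and & or | combines those Series element by element before the result is used as a row mask.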
Details
Setup
import numpy as np
import pandas as pd
cols = ['var1', 'var2', 'var3', 'var4']
# 4x4 frame of standard-normal values; no seed is set, so the numbers differ on every run
df = pd.DataFrame(np.random.randn(4, 4), columns=cols)
df
       var1      var2      var3      var4
0  0.597118 -0.853204  1.813645  0.694750
1 -1.118426  0.011119 -2.161933  0.792262
2  0.665828  0.384975  1.676278 -0.487037
3 -0.216118  2.084042 -0.279242 -1.785128
Error 1: and without parentheses
df[df['var1'] >= 0 and df['var2'] <= 0.5]
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Error 2: and with parentheses
df[(df['var1'] >= 0) and (df['var2'] <= 0.5)]
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
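Parentheses alone do not help because the problem is the and keyword itself. Python evaluates a and b by first asking for bool(a), and pandas refuses to collapse a multi-element Series into a single True or False. The sketch below reproduces that step in isolation; the Series values and the names left, right and message are made up for illustration.

```python
import pandas as pd

left = pd.Series([True, False, True])
right = pd.Series([True, True, False])

# `left and right` starts by calling bool(left), which pandas rejects
try:
    bool(left)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)

# `&` never calls bool(); it compares the two Series element by element
elementwise = (left & right).tolist()
print(elementwise)  # [True, False, False]
```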
Error 3: & without parentheses
df[df['var1'] >= 0 & df['var2'] <= 0.5]
TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]
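This failure comes from operator precedence: & binds more tightly than >= and <=, so without parentheses Python groups the expression around 0 & df['var2'] and tries to apply a bitwise AND to a float column. The following sketch isolates that sub-expression. It assumes a float64 column like the one in this article, and the exact wording of the TypeError message varies between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "var1": [0.6, -1.1, 0.7, -0.2],
    "var2": [-0.9, 0.0, 0.4, 2.1],
})

# Python reads
#     df['var1'] >= 0 & df['var2'] <= 0.5
# as
#     df['var1'] >= (0 & df['var2']) <= 0.5
# so the middle sub-expression is evaluated first and fails on floats.
try:
    0 & df["var2"]
    failed_with = None
except TypeError as exc:
    failed_with = type(exc).__name__

print(failed_with)  # TypeError
```

Wrapping each comparison in parentheses forces the comparisons to run first, so & only ever sees boolean Series.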
Working version
df[(df['var1'] >= 0) & (df['var2'] <= 0.5)]
       var1      var2      var3      var4
0  0.597118 -0.853204  1.813645  0.694750
2  0.665828  0.384975  1.676278 -0.487037