Selection/filtering via Boolean arrays
Using Boolean lists/arrays (also referred to as masks) is a very common method to select a subset of rows.
How to do it
Let’s create a mask of True/False values alongside a simple pd.Series:
import numpy as np
import pandas as pd

mask = [True, False, True]
ser = pd.Series(range(3))
ser
0    0
1    1
2    2
dtype: int64
Using the mask as an argument to pd.Series[] will return each row where the corresponding mask entry is True:
ser[mask]
0    0
2    2
dtype: int64
pd.Series.loc behaves exactly like pd.Series[] in this particular case:
ser.loc[mask]
0    0
2    2
dtype: int64
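To check that the two forms really agree, the selections above can be compared directly. The following is a minimal sketch that reuses the same mask and ser; the names by_brackets and by_loc are introduced here purely for illustration and do not appear elsewhere in this recipe.

```python
import pandas as pd

mask = [True, False, True]
ser = pd.Series(range(3))

# Both selection styles keep the rows whose mask entry is True.
by_brackets = ser[mask]     # illustrative name, not from the recipe
by_loc = ser.loc[mask]      # illustrative name, not from the recipe

# The kept rows retain their original row labels rather than being renumbered.
print(by_brackets.index.tolist())   # [0, 2]

# The two results hold the same values, labels, and dtype.
print(by_brackets.equals(by_loc))   # True
```

Note that the result keeps the labels 0 and 2 from the original pd.Series, so the mask filters rows without resetting the index.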
Interestingly, whereas pd.DataFrame[] usually tries to select from the columns when provided a list argument, its behavior with a sequence of Boolean values is different. Using the mask we have already created, df[mask] will actually match along the rows rather than the columns:
df = pd.DataFrame(np.arange(6).reshape(3, -1))
df[mask...