Joining DataFrames with pd.DataFrame.join
While pd.merge is the most common way to combine two pd.DataFrame objects, the lesser-used yet functionally similar pd.DataFrame.join method is another viable option. Stylistically, you can think of pd.DataFrame.join as a shortcut for when you want to augment an existing pd.DataFrame with a few more columns: it matches on the row index and defaults to a left join, so every row of the calling pd.DataFrame is kept. By contrast, pd.merge defaults to an inner join, which treats both pd.DataFrame objects with equal importance.
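As a minimal sketch of that difference in defaults, the snippet below joins two small frames on their row index. The frames left and right, along with their column names and values, are made up purely for illustration and are not part of this recipe's data.

```python
import pandas as pd

# Hypothetical frames for illustration only; both are indexed by "key".
left = pd.DataFrame(
    {"x": [1, 2, 3]},
    index=pd.Index(["a", "b", "c"], name="key"),
)
right = pd.DataFrame(
    {"y": [10, 30]},
    index=pd.Index(["a", "c"], name="key"),
)

# join matches on the row index and keeps every row of the caller (left join).
joined = left.join(right)

# Spelling out the same operation with pd.merge takes more arguments.
merged = pd.merge(left, right, left_index=True, right_index=True, how="left")

# Without how="left", pd.merge falls back to its default inner join,
# which drops the unmatched row "b".
inner = pd.merge(left, right, left_index=True, right_index=True)
```

Here joined and merged hold the same three rows, with a missing value in y for key "b", while inner keeps only the two keys present in both frames.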
How to do it
To drive home the point about pd.DataFrame.join being a shortcut to augment an existing pd.DataFrame, let’s imagine a sales table where the row index corresponds to a salesperson but uses a surrogate key instead of a natural key:
sales = pd.DataFrame(
    [[1000], [2000], [4000]],
    columns=["sales"],
    index=pd.Index([42, 555, 9000], name="salesperson_id"),
)
sales = sales.convert_dtypes(dtype_backend="numpy_nullable")
sales
sales
salesperson_id
42 ...