NumPy type system, the object type, and pitfalls
As mentioned in the introduction to this chapter, pandas still defaults to types that are sub-optimal for general data analysis, at least in the 2.x and 3.x series. You will undoubtedly come across these types in code from peers or in online snippets, however, so understanding how they work, where their pitfalls lie, and how to avoid them will remain important for years to come.
How to do it
Let’s look at the default construction of a pd.Series from a sequence of integers:
pd.Series([0, 1, 2])
0 0
1 1
2 2
dtype: int64
Given this argument, pandas gave us back a pd.Series with an int64 data type. That seems normal, so what is the big deal? Well, let’s go ahead and see what happens when you introduce missing values:
pd.Series([0, None, 2])
0 0.0
1 NaN
2 2.0
dtype: float64
Huh? We provided integer data but now we got back a floating point type. Surely specifying the dtype= argument will help...
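Before trying that, it can help to confirm the upcast programmatically rather than by reading the printed output. The following is a minimal sketch, not part of the original recipe. The variable names ints and with_missing are purely illustrative. It relies on the fact that the default NumPy-backed integer type has no way to represent a missing value, so pandas stores None as NaN, which is itself a floating point value.

```python
import numpy as np
import pandas as pd

# Illustrative names; the data mirrors the two examples above
ints = pd.Series([0, 1, 2])
with_missing = pd.Series([0, None, 2])

# Inspect the inferred data types directly
print(ints.dtype)
print(with_missing.dtype)

# None was stored as NaN, and NaN is a float, which forces the upcast
print(isinstance(np.nan, float))
print(with_missing.isna().tolist())
```

Checking .dtype and .isna() like this is a quick way to spot where a missing value has silently changed the type of your data.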