Common Issues
Support for ExtensionArrays in pandas is still under development. As a result, there are some common issues that pint-pandas users may encounter.
This page provides guidance on how to resolve them.
Units in Cells
The most common issue pint-pandas users encounter is a DataFrame with columns that aren’t PintArrays.
An obvious indicator is unit strings showing in the cells when viewing the DataFrame.
This can happen because several pandas operations return NumPy arrays of Quantity objects instead of PintArrays.
In [1]: df
Out[1]:
length
0 2.0 meter
1 3.0 meter
To confirm the DataFrame does not contain PintArrays, check the dtypes.
In [2]: df.dtypes
Out[2]:
length object
dtype: object
Pint-pandas provides an accessor that fixes this issue by converting the non-PintArray columns to PintArrays.
In [3]: df.pint.convert_object_dtype()
Out[3]:
length
0 2.0
1 3.0
Creating DataFrames from Series
By default, the pandas pd.concat function concatenates row-wise. When given a list of Series, each backed by a PintArray, this inefficiently converts all the PintArrays to arrays of object dtype, concatenates the Series into a DataFrame with that many rows, and leaves it up to you to convert that DataFrame back into column-wise PintArrays. A much more efficient approach is to concatenate the Series column-wise:
In [4]: list_of_series = [pd.Series([1.0, 2.0], dtype="pint[m]") for _ in range(10)]
In [5]: df = pd.concat(list_of_series, axis=1)
In [6]: df
Out[6]:
0 1 2 3 4 5 6 7 8 9
0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
This preserves the PintArray that backs each Series, so every column of the resulting DataFrame keeps its pint dtype.