Initializing data
There are several ways to initialize a PintArray in a DataFrame. Here are the most common methods.
In [1]: df = pd.DataFrame(
...: {
...: "Ser1": pd.Series([1, 2], dtype="pint[m]"),
...: "Ser2": pd.Series([1, 2]).astype("pint[m]"),
...: "Ser3": pd.Series([1, 2], dtype="pint[m][Int64]"),
...: "Ser4": pd.Series([1, 2]).astype("pint[m][Int64]"),
...: "PArr1": PintArray([1, 2], dtype="pint[m]"),
...: "PArr2": PintArray([1, 2], dtype="pint[m][Int64]"),
...: "PArr3": PintArray([1, 2], dtype="m"),
...: "PArr4": PintArray([1, 2], dtype=ureg.m),
...: "PArr5": PintArray(Quantity([1, 2], ureg.m)),
...: "PArr6": PintArray([1, 2],"m"),
...: }
...: )
...:
In [2]: df
Out[2]:
Ser1 Ser2 Ser3 Ser4 PArr1 PArr2 PArr3 PArr4 PArr5 PArr6
0 1.0 1.0 1 1 1 1 1 1 1 1
1 2.0 2.0 2 2 2 2 2 2 2 2
In the first two Series examples above (Ser1 and Ser2), no subdtype was given, so the data was converted to Float64.
In [3]: df.dtypes
Out[3]:
Ser1 pint[meter][Float64]
Ser2 pint[meter][Float64]
Ser3 pint[meter][Int64]
Ser4 pint[meter][Int64]
PArr1 pint[meter][Int64]
PArr2 pint[meter][Int64]
PArr3 pint[meter][Int64]
PArr4 pint[meter][Int64]
PArr5 pint[meter][Int64]
PArr6 pint[meter][Int64]
dtype: object
To avoid this conversion when constructing a Series, specify the subdtype (the dtype of the magnitudes) in the dtype string, for example "pint[m][Int64]". The default subdtype that pint-pandas converts to can be changed by modifying pint_pandas.pint_array.DEFAULT_SUBDTYPE.
When the dtype does not specify a subdtype, PintArray infers it from the data passed in. PintArray also accepts a pint Unit or a unit string as the dtype.
Note
"pint[unit]" or "pint[unit][subdtype]" must be used for the Series or DataFrame constructor.
Non-native pandas dtypes
PintArray uses an ExtensionArray to hold its data, including ExtensionArrays from other libraries that extend pandas.
For example, an UncertaintyArray can be used.
In [4]: from uncertainties_pandas import UncertaintyArray, UncertaintyDtype
In [5]: from uncertainties import ufloat, umath, unumpy
In [6]: ufloats = [ufloat(i, abs(i) / 100) for i in [4.0, np.nan, -5.0]]
In [7]: uarr = UncertaintyArray(ufloats)
In [8]: uarr
Out[8]:
<UncertaintyArray>
[4.0+/-0.04, <NA>, -5.0+/-0.05]
Length: 3, dtype: UncertaintyDtype
In [9]: PintArray(uarr,"m")
Out[9]:
<PintArray>
[4.00+/-0.04, <NA>, -5.00+/-0.05]
Length: 3, dtype: pint[meter][UncertaintyDtype]
In [10]: pd.Series(PintArray(uarr,"m")*2)
Out[10]:
0 8.00+/-0.08
1 nan
2 -10.00+/-0.10
dtype: pint[meter][UncertaintyDtype]