astype errors with Categorical

import pandas as pd
cat = pd.Categorical(['CA', 'AL'])
df = pd.DataFrame([['CA', 'CA'], ['AL', 'CA']], index=['foo', 'bar'], columns=range(2))
df[1].astype(cat)

With pandas 0.20.1, this raises:

    if dtype == CategoricalDtype():
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Passing a Categorical instance directly to astype may not be the intended usage. However, .astype("category", categories=cat) also fails, whereas .astype("category", categories=cat.categories) works.
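For comparison, here is a minimal sketch of the same conversion using a parametrized dtype. It assumes pandas 0.21 or later, where CategoricalDtype accepts categories; that is newer than the 0.20.1 used in this report, so it shows a possible workaround, not the behavior reported here. The names via_dtype and via_explicit are only illustrative.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

cat = pd.Categorical(['CA', 'AL'])
df = pd.DataFrame([['CA', 'CA'], ['AL', 'CA']], index=['foo', 'bar'], columns=range(2))

# Pass the dtype carried by the Categorical, not the Categorical itself.
# Assumption: pandas >= 0.21, where cat.dtype includes the categories.
via_dtype = df[1].astype(cat.dtype)

# Alternatively, build the dtype explicitly from the category labels.
via_explicit = df[1].astype(CategoricalDtype(categories=cat.categories))

print(via_dtype.cat.categories)
print(via_explicit.dtype)
```

Both calls should give a categorical column whose categories match those of cat, without comparing an array against a dtype.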

I suspect this is related to similar errors when trying to identify which columns of a DataFrame are categorical (possibly a duplicate of #16659):

df.dtypes[colname] == 'category' evaluates as True for categorical columns and raises TypeError: data type "category" not understood for np.float64 columns.

df.dtypes == pd.Categorical raises TypeError: Could not compare <type 'type'> type with Series
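One way to sidestep both comparisons is to test the dtype object by type, or to let select_dtypes do the filtering. The sketch below assumes a recent pandas release; the example frame and the names categorical_cols and selected_cols are made up for illustration.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical frame with one categorical column and one float column.
df = pd.DataFrame({
    "state": pd.Categorical(["CA", "AL"]),
    "value": [1.0, 2.0],
})

# Check each column's dtype object by type instead of comparing it to a string,
# so numpy dtypes such as float64 never need to interpret "category".
categorical_cols = [
    col for col, dtype in df.dtypes.items()
    if isinstance(dtype, CategoricalDtype)
]

# select_dtypes performs the same filtering without inspecting dtypes by hand.
selected_cols = list(df.select_dtypes(include="category").columns)

print(categorical_cols, selected_cols)
```

Neither approach compares a numpy dtype against the string 'category', so the float column should not trigger the TypeError described above.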

Also related: #15078

Author: Fantashit
