Mapping Categorical Data in pandas
Unlike R, core Python has no built-in factor type for representing categorical data. Factors in R are stored as vectors of integer values, each of which can carry a label.
If our data is in a Series or DataFrame, we can convert a column to a categorical type by calling the pandas Series astype method and specifying 'category'. pandas then stores each value as an integer code that points to its category label.
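A minimal sketch of the conversion, assuming a small made-up Series named colours (the name and values are illustrative only). The integer codes behind the labels can be read through the .cat.codes accessor:

```python
import pandas as pd

# Hypothetical data, used only to illustrate the conversion.
colours = pd.Series(["red", "green", "blue", "green"])

# Convert the object Series to the pandas categorical dtype.
colours_cat = colours.astype("category")

# Each value is stored as an integer code pointing at its category label.
codes = colours_cat.cat.codes

print(colours_cat.dtype)
print(codes.tolist())
```

The labels themselves remain available through colours_cat.cat.categories, so the original values can always be recovered from the codes.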
Nominal categories have no inherent order, e.g. colours, sex, nationality.
In the example below we categorise the Series vertebrates of the df DataFrame into their individual categories.
By default the categories are sorted alphabetically, which is why in the example below Amphibian is represented by the code zero.
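A minimal sketch of this behaviour, assuming a small df whose vertebrates column holds made-up class names (the rows and the vertebrates_code column name are illustrative assumptions, not a real dataset):

```python
import pandas as pd

# Hypothetical DataFrame; the rows are invented for illustration.
df = pd.DataFrame(
    {"vertebrates": ["Bird", "Mammal", "Amphibian", "Fish", "Reptile", "Bird"]}
)

# Convert the column to the categorical dtype.
df["vertebrates"] = df["vertebrates"].astype("category")

# Categories are sorted alphabetically by default.
categories = list(df["vertebrates"].cat.categories)

# Store the integer code for each row in a new (hypothetical) column.
df["vertebrates_code"] = df["vertebrates"].cat.codes

print(categories)
print(df)
```

Because Amphibian sorts first among these labels, it receives code 0, Bird receives 1, and so on down the sorted list.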