Pandas - convert list to integer

Hope someone can help with this.

I have a column of data that is in a pandas dataframe.

I need to convert the strings in this column to integers - but since the strings repeat, I want a unique number assigned to each distinct string, e.g. banana = 0, apple = 1, etc. However, I do not want to specify the assignment myself, since there may be up to 200 unique strings in a dataset of 175,000 rows.

Column    Desired result (numbers allocated automatically to each string name)
banana    replaced by 0
apple     replaced by 1
pear      replaced by 2
banana    replaced by 0
grape     replaced by 3
apple     replaced by 1
pear      replaced by 2
etc.

So it's automatic assignment of a number to each string - without needing to know in advance how many distinct strings/numbers are in the column.

Thanks in advance.

Hi Gerry,
Use pandas.factorize() to encode the desired column as integers, one per unique string.
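
As a rough sketch of how that could look (the sample data and the column names "fruit" and "fruit_code" are made up for illustration, not taken from your dataset): pandas.factorize() returns a pair, the integer codes and the unique values, with codes handed out in order of first appearance.

```python
import pandas as pd

# Hypothetical sample data; the column name "fruit" is an assumption.
df = pd.DataFrame(
    {"fruit": ["banana", "apple", "pear", "banana", "grape", "apple", "pear"]}
)

# factorize returns (codes, uniques); each distinct string gets one integer,
# numbered in the order the strings first appear in the column.
codes, uniques = pd.factorize(df["fruit"])
df["fruit_code"] = codes

print(df)
print(list(uniques))  # position in this list == the code assigned to that string
```

Keeping `uniques` around lets you translate a code back to its string later, e.g. `uniques[3]` for code 3.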

Thanks for replying.

I think I have found that the answer lies in the following code for in-situ updates. I have tried it and it seems to work.

# proto is the column name

protos = df.proto.unique()

proto_dict = dict(zip(protos, range(len(protos))))

df["proto"] = df["proto"].map(proto_dict)  # map only the proto column; applymap would touch every column in df
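
For anyone who wants to try the dictionary approach on its own, here is a small self-contained sketch. The values ("tcp", "udp", "icmp") and the extra "note" column are invented for the example; only the column name proto comes from my data. It uses Series.map on the one column, so a matching string elsewhere in the dataframe is left alone.

```python
import pandas as pd

# Hypothetical sample data; only the column name "proto" is from the post.
df = pd.DataFrame(
    {
        "proto": ["tcp", "udp", "tcp", "icmp"],
        "note": ["udp", "x", "y", "z"],  # contains "udp" on purpose
    }
)

# Build a string -> integer lookup in order of first appearance.
protos = df["proto"].unique()
proto_dict = {name: code for code, name in enumerate(protos)}

# Replace the strings in the proto column only; other columns are untouched.
df["proto"] = df["proto"].map(proto_dict)

print(df)
print(proto_dict)
```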

Thanks again for taking the time to reply.