Graph Classification via Random Forest

Gianmarco_Luchetti_S · February 8, 2022, 4:59pm

Hello everybody!
I’m a medicinal chemistry undergraduate student preparing my dissertation. My idea is to build a classifier that labels anticancer drugs as active or inactive and further sorts the active ones into three classes, with each molecule described as a graph. My supervisor suggested using a random forest classifier, and to do this I need to convert each graph into a vector while keeping as many of its characteristics as possible.

I start from a molecular graph dataset like this:

Data(x=[9, 9], edge_index=[2, 18], edge_attr=[18, 2], y=[0], smiles='COC(=O)C=CN1CC1')

where x, edge_index and edge_attr are torch tensors, and y is the label (0 is inactive, 1 is activity of class one, 2 is activity of class two, …). To run a random forest classifier I think I must convert each graph into a fixed-length vector such as an np.array, and to do that I think I must compute a kernel for my graphs, but I have no idea how to do it. Has anyone had experience with this task?

anantguptadbl · February 8, 2022, 7:17pm

@Gianmarco_Luchetti_S You should refer to this paper for your project

I have tried it, and it will give you ideas on how to approach your problem.
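
For a concrete starting point, here is a minimal sketch of one common way to turn a graph into a fixed-length vector: hashed Weisfeiler-Lehman subtree counts. It is not necessarily the approach from the paper mentioned above. The function names, `n_iter` and `n_buckets` are hypothetical choices for illustration, and the sketch assumes each node carries one discrete label (for example the atom type), which may or may not match how your `x` is encoded. It also ignores `edge_attr`.

```python
import hashlib
from collections import Counter


def _stable_hash(text, n_buckets):
    """Map a string to a bucket index that is stable across Python runs."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def wl_feature_vector(node_labels, edge_index, n_iter=2, n_buckets=64):
    """Hashed Weisfeiler-Lehman subtree counts for one graph.

    node_labels: one discrete label per node (assumption: e.g. atom type).
    edge_index:  [sources, targets], the same layout as the edge_index above.
    """
    n = len(node_labels)
    neighbors = [[] for _ in range(n)]
    for src, dst in zip(edge_index[0], edge_index[1]):
        neighbors[src].append(dst)

    labels = [str(label) for label in node_labels]
    counts = Counter(labels)  # iteration 0: the raw node labels
    for _ in range(n_iter):
        # New label = own label plus the sorted multiset of neighbor labels
        labels = [
            labels[i] + "|" + ",".join(sorted(labels[j] for j in neighbors[i]))
            for i in range(n)
        ]
        # Compress the growing strings into fixed-size digests
        labels = [hashlib.md5(s.encode("utf-8")).hexdigest() for s in labels]
        counts.update(labels)

    # Fold all label counts into a fixed number of buckets
    vector = [0.0] * n_buckets
    for label, count in counts.items():
        vector[_stable_hash(label, n_buckets)] += count
    return vector


# Toy path graph C-O-C, with both directions of every bond listed
labels_a = ["C", "O", "C"]
edges_a = [[0, 1, 1, 2], [1, 0, 2, 1]]
vec_a = wl_feature_vector(labels_a, edges_a)

# The same molecule with the nodes renumbered (O is now node 0)
labels_b = ["O", "C", "C"]
edges_b = [[0, 1, 0, 2], [1, 0, 2, 0]]
vec_b = wl_feature_vector(labels_b, edges_b)
```

Both toy graphs give the same vector, because the result depends only on the graph structure and labels, not on how the nodes are numbered. For a dataset like yours, one option would be to pass `data.edge_index.tolist()` as `edge_index` and, if `x` happens to be one-hot, `data.x.argmax(dim=1).tolist()` as `node_labels`. Stacking one vector per molecule with `np.array` then gives a matrix `X` that scikit-learn's `RandomForestClassifier` can take in `fit(X, y)`.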