Ensembling neural networks to improve performance

Hello everybody,
I am new to Geometric Deep Learning and have a question that I hope someone can answer. I am currently working on models that classify drugs as highly active, active, or inactive (labels: 0, 1, 2) using different graph neural networks (such as a Convolutional Graph Neural Network or a Graph Isomorphism Network). The problem is that no single model reaches high performance. However, I’ve noticed that each network makes different mistakes, so I thought I’d combine them into an ensemble.
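To make the idea concrete, here is a minimal sketch of what I have in mind: soft voting, where the class probabilities of the individual networks are averaged and the most probable class is picked. The function name `soft_vote` and the probability values are made up for illustration; they are not real outputs of my models.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities over models and return one label per sample.

    prob_sets[m][i][c] is the probability that model m assigns to class c
    for sample i. All models must score the same samples in the same order.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    predictions = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        # Mean probability of each class over all models for sample i.
        mean_probs = [
            sum(prob_sets[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        # Index of the class with the highest mean probability.
        predictions.append(max(range(n_classes), key=mean_probs.__getitem__))
    return predictions


# Hypothetical softmax outputs of two networks on three molecules
# (columns: highly active, active, inactive).
gcn_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.4, 0.35, 0.25]]
gin_probs = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6], [0.1, 0.6, 0.3]]

ensemble_labels = soft_vote([gcn_probs, gin_probs])
print(ensemble_labels)
```

In this toy example the two networks disagree on the second and third molecules, and the averaged probabilities decide between them. I assume weighted averaging or stacking would be variations on the same theme.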
Now the real questions are:

  1. Do you think ensembling networks is a good strategy to increase performance?
  2. Would it be better to train the networks on different datasets or on the same dataset before ensembling them?

Could you also recommend some ensembling techniques, articles, or tutorials? Thank you all!