Loading the Trainable activation function

Siva · October 27, 2022, 12:45am

Hi all,
I am using an adaptive activation function to train a neural network defined in a class module. I defined the activation function, with its trainable parameter, in one (parent) class and instantiate it inside a second (child) class where the network layers are defined. I could optimize the activation function and reduce the loss to the desired value. I save and load the model using the state_dict method.

Activation function with trainable parameters:
{Here, ‘n’ is a constant scaling factor, and ‘a’ is a trainable parameter}
class My_SiLU(nn.Module):  # Swish function
    def __init__(self, n, a):
        super().__init__()
        self.n = n  # constant scaling factor
        self.a = a  # trainable parameter

    def forward(self, x):
        output = nn.SiLU()
        return output(self.a * self.n * x)

Neural network:
class NN(nn.Module):
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            nn.Linear(input, neurons),
            My_SiLU(n, a),
            nn.Linear(neurons, neurons),
            My_SiLU(n, a),
            nn.Linear(neurons, neurons),
            nn.Linear(neurons, 1),
        )

    def forward(self, x):
        output = self.main(x)
        return output

model = NN()
ac_func=My_SiLU(n, a)
After some epochs of optimization, I save the network using model.state_dict() and the activation function using ac_func.state_dict().

In another program, when I try to load both state_dict files, it throws this error:
in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for NN:
    Unexpected key(s) in state_dict: "main.1.a", "main.3.a"

Could anyone suggest how to fix this?
Thanks in advance!

ptrblck · October 27, 2022, 5:43am

Your code is not formatted properly, so it is quite hard to read.
However, based on the error message it seems your model definition in the new script differs from the old one and the state_dict has additional trainable parameters called main.1.a and main.3.a, which seem to come from the My_SiLU modules in the nn.Sequential container.
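
The mismatch described above can be reproduced with a minimal sketch. It assumes, which the thread does not confirm, that the training script registered `a` as an `nn.Parameter` while the loading script passes a plain number. The names `TrainableSiLU`, `PlainSiLU`, `Net`, and the layer sizes are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn


class TrainableSiLU(nn.Module):
    """SiLU(a * n * x) with a fixed scale n and a trainable a (hypothetical name)."""

    def __init__(self, n, a):
        super().__init__()
        self.n = n  # constant scaling factor
        # Assumption: wrapping `a` in nn.Parameter registers it, so it
        # appears in state_dict() under the key "<prefix>.a".
        self.a = nn.Parameter(torch.tensor(float(a)))
        self.silu = nn.SiLU()

    def forward(self, x):
        return self.silu(self.a * self.n * x)


class PlainSiLU(nn.Module):
    """Same forward pass, but `a` is a plain float and is not registered."""

    def __init__(self, n, a):
        super().__init__()
        self.n = n
        self.a = a

    def forward(self, x):
        return nn.functional.silu(self.a * self.n * x)


class Net(nn.Module):
    """Small stand-in for the NN class in the question (sizes are made up)."""

    def __init__(self, act_cls, in_features=2, neurons=4, n=10.0, a=0.1):
        super().__init__()
        self.main = nn.Sequential(
            nn.Linear(in_features, neurons),
            act_cls(n, a),
            nn.Linear(neurons, neurons),
            act_cls(n, a),
            nn.Linear(neurons, 1),
        )

    def forward(self, x):
        return self.main(x)


# "Old" script: the activation parameter is registered and therefore saved.
trained = Net(TrainableSiLU)
saved = trained.state_dict()
param_keys = [k for k in saved if k.endswith(".a")]
print(param_keys)

# "New" script with a different definition: the saved keys have no target.
mismatched = Net(PlainSiLU)
try:
    mismatched.load_state_dict(saved)
    error_message = ""
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# Same definition as at save time: loading succeeds with no leftover keys.
restored = Net(TrainableSiLU)
result = restored.load_state_dict(saved)
print(result)
```

If this assumption matches the real scripts, the fix would be to define the activation module in the loading script exactly as it was defined when the state_dict was saved. In that case the separate `ac_func.state_dict()` file may not be needed, since the activation parameters are already stored under the `main.1.a` and `main.3.a` keys of the model's own state_dict.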