My BERT model raises a StopIteration error when it runs next(self.parameters()).dtype.
I first checked the model simply by:
print(list(self.parameters()))
# []
# []
# [][]
# []
This means the parameters are empty.
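For context, next() on an exhausted iterator raises StopIteration, so the error itself only says that the parameter generator yields nothing; a minimal standalone illustration:

import torch.nn as nn

empty = nn.Module()              # a bare module with no registered parameters
print(list(empty.parameters()))  # []
next(empty.parameters())         # raises StopIteration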
My code looks like this:
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, hyper_param):
        super(MyModel, self).__init__()
        self.bert = Model(hyper_param)

    def forward(self, batch):
        # check attributes -- (2)
        input_ids = batch["input_ids"]
        attention_mask = batch["attention_mask"]
        cx = batch["cx"]
        cy = batch["cy"]
        height = batch["height"]
        # next(self.parameters()).dtype is called inside self.bert
        output = self.bert(input_ids, attention_mask, cx, cy, height)
        return output

model = MyModel(hyper_param)
batch = next(dataset_iter)
# check attributes -- (1)
loss = model(batch)
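I can avoid the crash itself with the default argument of next() (module_dtype below is a hypothetical helper of mine, and the float32 fallback is my own assumption, not anything the model guarantees), but that only hides the symptom:

import torch
import torch.nn as nn

def module_dtype(module: nn.Module, default=torch.float32):
    # Return the dtype of the first parameter, or `default`
    # when the module exposes no parameters at all.
    return next((p.dtype for p in module.parameters()), default)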
I further checked all the protected attributes of the model at positions (1) and (2), marked in the comments in the code above, and got the following results.
At position (1):
_backward_hooks = {OrderedDict: 0} OrderedDict()
_buffers = {OrderedDict: 0} OrderedDict()
_forward_hooks = {OrderedDict: 0} OrderedDict()
_forward_pre_hooks = {OrderedDict: 0} OrderedDict()
_is_full_backward_hook = {NoneType} None
_load_state_dict_pre_hooks = {OrderedDict: 0} OrderedDict()
_modules = {OrderedDict: 0} OrderedDict()
_non_persistent_buffers_set = {set: 0} set()
_parameters = {OrderedDict: 1} OrderedDict([('weight', Parameter containing:\ntensor([[-0.0394, 0.0065, 0.0195, ..., -0.0082, -0.0145, 0.0046],\n [ 0.0124, 0.0060, 0.0094, ..., -0.0219, 0.0053, -0.0220],\n [ 0.0086, -0.0022, 0.0252, ..., -0.0312, -0.0307, 0.0214],\n ...,\n [ 0.0200, 0.0115, 0.0103, ..., -0.0153, 0.0163, -0.0371],\n [ 0.0301, -0.0143, 0.0047, ..., -0.0138, -0.0130, 0.0120],\n [-0.0115, 0.0102, -0.0111, ..., -0.0081, -0.0122, -0.0312]],\n device='cuda:0', requires_grad=True))])
'weight' = {Parameter: (1000, 768)} Parameter containing:\ntensor([[-0.0394, 0.0065, 0.0195, ..., -0.0082, -0.0145, 0.0046],\n [ 0.0124, 0.0060, 0.0094, ..., -0.0219, 0.0053, -0.0220],\n [ 0.0086, -0.0022, 0.0252, ..., -0.0312, -0.0307, 0.0214],\n ...,\n [ 0.0200, 0.0115, 0.0103, ..., -0.0153, 0.0163, -0.0371],\n [ 0.0301, -0.0143, 0.0047, ..., -0.0138, -0.0130, 0.0120],\n [-0.0115, 0.0102, -0.0111, ..., -0.0081, -0.0122, -0.0312]],\n device='cuda:0', requires_grad=True)
__len__ = {int} 1
_state_dict_hooks = {OrderedDict: 0} OrderedDict()
_version = {int} 1
At position (1), the parameters seem normal.
And at position (2):
_backward_hooks = {OrderedDict: 0} OrderedDict()
_buffers = {OrderedDict: 0} OrderedDict()
_former_parameters = {OrderedDict: 1} OrderedDict([('weight', tensor([[-0.0394, 0.0065, 0.0195, ..., -0.0082, -0.0145, 0.0046],\n [ 0.0124, 0.0060, 0.0094, ..., -0.0219, 0.0053, -0.0220],\n [ 0.0086, -0.0022, 0.0252, ..., -0.0312, -0.0307, 0.0214],\n ...,\n [
'weight' = {Tensor: (1000, 768)} tensor([[-0.0394, 0.0065, 0.0195, ..., -0.0082, -0.0145, 0.0046],\n [ 0.0124, 0.0060, 0.0094, ..., -0.0219, 0.0053, -0.0220],\n [ 0.0086, -0.0022, 0.0252, ..., -0.0312, -0.0307, 0.0214],\n ...,\n [ 0.0200, 0.0115, 0.0103, ..., -0.0153, 0.0163, -0.0371],\n [ 0.0301, -0.0143, 0.0047, ..., -0.0138, -0.0130, 0.0120],\n [-0.0115, 0.0102, -0.0111, ..., -0.0081, -0.0122, -0.0312]],\n device='cuda:7', grad_fn=<BroadcastBackward>)
__len__ = {int} 1
_forward_hooks = {OrderedDict: 0} OrderedDict()
_forward_pre_hooks = {OrderedDict: 0} OrderedDict()
_is_replica = {bool} True
_is_full_backward_hook = {NoneType} None
_load_state_dict_pre_hooks = {OrderedDict: 0} OrderedDict()
_modules = {OrderedDict: 0} OrderedDict()
_non_persistent_buffers_set = {set: 0} set()
_parameters = {OrderedDict: 0} OrderedDict()
__len__ = {int} 0
_state_dict_hooks = {OrderedDict: 0} OrderedDict()
_version = {int} 1
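To reproduce this inspection without a debugger, the dumps above boil down to a small helper (these attributes are private and may change between PyTorch versions, so this is only a diagnostic sketch):

def dump_module_state(m):
    # Print the private attributes that differ between positions (1) and (2).
    print(type(m).__name__,
          "| _is_replica:", getattr(m, "_is_replica", False),
          "| _parameters:", list(m._parameters.keys()),
          "| _former_parameters:",
          list(getattr(m, "_former_parameters", {}).keys()))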
What I can see is that _parameters simply becomes empty after the forward() call, and an additional _former_parameters attribute appears, but I can't find this attribute in the nn.Module source code. How can this be explained?