Hi there,
I am writing a PyTorch implementation of Logic Tensor Networks for Semantic Image Interpretation, which has open-source TensorFlow code.
I managed to get the network together and it trains. I believe I am correctly copying the optimiser hyperparameters, and I have also checked that the underlying maths is correct, so I am fairly certain everything is set up properly. However, I have noticed two things that I am struggling to explain:
- The network trains much more slowly than the TensorFlow implementation: the TensorFlow version reaches an average train performance of about 95% within 1000 steps, whereas my code requires ~3000 steps.
- The train performance suddenly flips once it has climbed high.
I'll give some more information about each below.
Slow Network Training
I thought this could be down to the optimiser hyperparameters. The paper uses RMSProp. I notice that the TensorFlow version has the following hyperparameters:
- learning_rate: A Tensor or a floating point value. The learning rate.
- decay: Discounting factor for the history/coming gradient
- momentum: A scalar tensor.
- epsilon: Small value to avoid zero denominator.
- use_locking: If True use locks for update operation.
- centered: If True, gradients are normalized by the estimated variance of the gradient; if False, by the uncentered second moment. Setting this to True may help with training, but is slightly more expensive in terms of computation and memory. Defaults to False.
- name: Optional name prefix for the operations created when applying gradients. Defaults to “RMSProp”
The PyTorch version has:
- params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
- lr (float, optional) – learning rate (default: 1e-2)
- momentum (float, optional) – momentum factor (default: 0)
- alpha (float, optional) – smoothing constant (default: 0.99)
- eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
- centered (bool, optional) – if True, compute the centered RMSProp, the gradient is normalized by an estimation of its variance
- weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
Following the paper, I use these values for the PyTorch RMSProp hyperparameters:
- LR = 0.01
- REGULARISATION = 1e-15
- ALPHA = 0.9
- EPSILON = 1e-10
I am assuming that:
- alpha is the equivalent of the TensorFlow decay parameter
- weight_decay is the regularisation, which TensorFlow requires to be added to the loss externally
In the paper's code, the optimiser is initialised here (I can't hyperlink as I am limited to two links per post): logictensornetworks.py#L21. The regularisation is implemented here: logictensornetworks.py#L53. The relevant hyperparameters are partly defined when the optimiser is initialised and also here: pascalpart.py#L8
(Note: the hyperparameters get defined a few times in different places, which is confusing, but I checked, and the reference I am giving is where the final values are set.)
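For concreteness, here is roughly how I initialise the optimiser with the mapping above (the Linear model here is just a stand-in for the actual LTN parameters):

```python
import torch

# Stand-in model; in my code this is the set of LTN predicate parameters.
model = torch.nn.Linear(10, 1)

LR = 0.01
ALPHA = 0.9          # assumed equivalent of TensorFlow's `decay`
EPSILON = 1e-10
REGULARISATION = 1e-15

optimizer = torch.optim.RMSprop(
    model.parameters(),
    lr=LR,
    alpha=ALPHA,
    eps=EPSILON,
    weight_decay=REGULARISATION,  # L2 penalty applied inside the optimiser,
                                  # rather than added to the loss as in TF
)
```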
Performance Flipping
I use the harmonic mean to calculate average performance, which is very sensitive to low values. But that still doesn't explain why performance would suddenly flip. I know that a bad learning rate can lead to divergence, but I am seeing a steady (if slow) performance increase followed by a sudden change once performance is high. Moreover, I am copying the paper's learning rate, and I have tested their code, which both trains fast and doesn't show this erratic behaviour. Here is a trace of the output showing how the performance changes:
15:00:38 [root ] [DEBUG ] : ======Iteration: 1799======
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_bottle Score: 0.992330014705658
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_bottle Score: 0.992296040058136
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_body Score: 0.9187692403793335
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_body Score: 0.9190800189971924
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_cap Score: 0.9014581441879272
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_cap Score: 0.9029223918914795
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_pottedplant Score: 0.9162445664405823
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_pottedplant Score: 0.9166520237922668
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_plant Score: 0.8137131333351135
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_plant Score: 0.8152325749397278
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_pot Score: 0.9676965475082397
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_pot Score: 0.9682645201683044
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_tvmonitor Score: 0.8954532146453857
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_tvmonitor Score: 0.895622968673706
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_screen Score: 0.927297055721283
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_screen Score: 0.9281954765319824
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_chair Score: 0.9460874795913696
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_chair Score: 0.9469890594482422
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_sofa Score: 0.8196842074394226
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_sofa Score: 0.820656418800354
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_isOfType_diningtable Score: 0.889803409576416
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_diningtable Score: 0.8908553123474121
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_partOf Score: 0.9138585925102234
15:00:38 [root ] [DEBUG ] : ==Clause: Clause_neg_partOf Score: 0.9138901829719543
15:00:38 [root ] [DEBUG ] : ==Score: 0.9060626029968262==
15:00:38 [root ] [DEBUG ] : ===Setting up data subsets===
15:00:39 [root ] [DEBUG ] : ======Iteration: 1800======
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_bottle Score: 0.990585446357727
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_bottle Score: 0.9916159510612488
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_body Score: 0.8670791387557983
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_body Score: 0.7959716320037842
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_cap Score: 0.8995224833488464
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_cap Score: 0.8619083166122437
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_pottedplant Score: 0.9155776500701904
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_pottedplant Score: 0.7972513437271118
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_plant Score: 0.8212026357650757
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_plant Score: 0.7968321442604065
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_pot Score: 0.9789061546325684
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_pot Score: 0.981188952922821
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_tvmonitor Score: 0.8427695631980896
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_tvmonitor Score: 0.7958213090896606
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_screen Score: 0.7628504037857056
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_screen Score: 0.013370582833886147
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_chair Score: 0.941319465637207
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_chair Score: 0.11598806828260422
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_sofa Score: 0.7932454943656921
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_sofa Score: 0.8461207747459412
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_diningtable Score: 0.8798862099647522
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_diningtable Score: 0.8369185924530029
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_partOf Score: 0.8789774179458618
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_partOf Score: 0.7778722643852234
15:00:39 [root ] [DEBUG ] : ==Score: 0.22021272778511047==
15:00:39 [root ] [DEBUG ] : ======Iteration: 1801======
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_bottle Score: 0.9906085729598999
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_bottle Score: 0.9916187524795532
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_body Score: 0.8675155639648438
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_body Score: 0.7966839671134949
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_cap Score: 0.8989664316177368
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_cap Score: 0.8625910878181458
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_pottedplant Score: 0.9151699542999268
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_pottedplant Score: 0.797529935836792
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_plant Score: 0.8211848735809326
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_plant Score: 0.7974907159805298
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_pot Score: 0.9789376258850098
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_pot Score: 0.9812048077583313
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_tvmonitor Score: 0.8427335023880005
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_tvmonitor Score: 0.7968791127204895
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_screen Score: 0.0010666713351383805
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_screen Score: 0.9761099815368652
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_chair Score: 0.9172378778457642
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_chair Score: 0.22136850655078888
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_sofa Score: 0.7935158014297485
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_sofa Score: 0.8462190628051758
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_diningtable Score: 0.8791512846946716
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_diningtable Score: 0.8383476138114929
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_partOf Score: 0.8790765404701233
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_partOf Score: 0.7791911959648132
15:00:39 [root ] [DEBUG ] : ==Score: 0.02481084130704403==
15:00:39 [root ] [DEBUG ] : ======Iteration: 1802======
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_bottle Score: 0.9906294345855713
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_bottle Score: 0.991621196269989
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_body Score: 0.867911159992218
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_body Score: 0.797331690788269
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_cap Score: 0.8984541893005371
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_cap Score: 0.8632128238677979
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_pottedplant Score: 0.9147946834564209
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_pottedplant Score: 0.797791600227356
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_plant Score: 0.8211624026298523
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_plant Score: 0.798086941242218
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_pot Score: 0.9789655208587646
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_pot Score: 0.9812188744544983
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_tvmonitor Score: 0.8426967859268188
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_tvmonitor Score: 0.7978458404541016
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_screen Score: 9.303313163400162e-06
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_screen Score: 0.9993413686752319
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_chair Score: 0.8807621598243713
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_chair Score: 0.37268978357315063
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_sofa Score: 0.7937597632408142
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_sofa Score: 0.846305787563324
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_isOfType_diningtable Score: 0.8784731030464172
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_isOfType_diningtable Score: 0.8396425843238831
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_partOf Score: 0.8791543841362
15:00:39 [root ] [DEBUG ] : ==Clause: Clause_neg_partOf Score: 0.7804265022277832
15:00:39 [root ] [DEBUG ] : ==Score: 0.00022322138829622418==
The score then continues to fall to ~0.
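To illustrate the sensitivity I mentioned, here is a minimal sketch of the harmonic mean I use for the averages (the helper name and the eps guard are mine, not from the paper): one near-zero clause score, like the Clause_neg_isOfType_screen value at iteration 1800, is enough to collapse the overall mean.

```python
import torch

def harmonic_mean(scores: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    # n / sum(1/x_i); eps guards against division by exactly zero
    return scores.numel() / (1.0 / (scores + eps)).sum()

good = torch.tensor([0.99, 0.92, 0.90])      # all clauses healthy
one_bad = torch.tensor([0.99, 0.92, 0.0002])  # one clause collapses

print(harmonic_mean(good))     # stays near the individual scores
print(harmonic_mean(one_bad))  # dragged close to zero by the single outlier
```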
Each score in the trace is a predicate, calculated via the code below. Then, following the paper, the harmonic mean of the batch of outputs is taken per predicate, and the final score is a further harmonic mean over all predicate scores.
def compute(self, inference, input_):
    """
    Compute the predicate grounding for input_.
    """
    stacked_inputs = input_                        # (batch, d)
    # Bilinear tensor term: x^T W x for each of the k tensor slices.
    batch_h = torch.bmm(
        torch.einsum('bi,ijk->bkj', (stacked_inputs, self.W)),
        stacked_inputs.unsqueeze(-1)
    ).squeeze(-1)                                  # (batch, k)
    # Linear term Vx + b; self.B broadcasts over the batch.
    mx_plus_b = torch.matmul(stacked_inputs, self.V) + self.B
    non_linear = self.tanh(batch_h + mx_plus_b)    # (batch, k)
    # Project the k slices down to one truth value per example.
    output = self.sigmoid(torch.matmul(non_linear, self.U))
    return output
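In case it helps, here is a standalone shape sanity check of the same computation with random weights. The dimensions (d input features, k tensor slices) are illustrative assumptions of mine, not the paper's values:

```python
import torch

b, d, k = 5, 10, 6
W = torch.randn(d, d, k)   # bilinear tensor, labelled 'ijk' in the einsum
V = torch.randn(d, k)      # linear term
B = torch.zeros(k)         # bias, broadcast over the batch
U = torch.randn(k, 1)      # output projection

x = torch.randn(b, d)                      # stand-in for input_
batch_h = torch.bmm(
    torch.einsum('bi,ijk->bkj', x, W),     # (b, k, d)
    x.unsqueeze(-1)                        # (b, d, 1)
).squeeze(-1)                              # (b, k)
mx_plus_b = x @ V + B                      # (b, k)
out = torch.sigmoid(torch.tanh(batch_h + mx_plus_b) @ U)  # (b, 1)
```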
I've been working on this for a while and am quite confused. Any help would be great, and if any more info is needed, let me know.
Thanks!