Deep Infomax pytorch


(Duane Nielsen) #1

Hi Pytorch community,

I’ve just coded up a demo of Deep InfoMax, a recently released algorithm from Yoshua Bengio’s research team: https://github.com/DuaneNielsen/DeepInfomaxPytorch

From this paper: Learning deep representations by mutual information estimation and maximization https://arxiv.org/abs/1808.06670

I borrowed heavily from https://github.com/rcalland/deep-INFOMAX, a chainer implementation.

It seems like a promising algorithm for creating compressed latent spaces. The basic idea is that you jointly train loss functions that learn to estimate the mutual information between the latent space and the image. Very cool!
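For anyone curious about what "jointly training a loss that estimates mutual information" looks like in practice, here is a minimal sketch of the global, JSD-style objective from the paper. All names (`GlobalDiscriminator`, `jsd_mi_loss`, the layer sizes) are illustrative assumptions, not the actual implementation: a small discriminator scores matched vs. mismatched (feature, latent) pairs, and the Jensen-Shannon lower bound on mutual information is maximized by pushing positive scores up and negative scores down.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalDiscriminator(nn.Module):
    """Toy score function T(features, latent); sizes are illustrative."""
    def __init__(self, feat_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, features, latent):
        return self.net(torch.cat([features, latent], dim=1))

def jsd_mi_loss(disc, features, latent):
    # Negatives: pair each latent with features from a *different*
    # image by rolling the batch by one position.
    shuffled = torch.roll(features, shifts=1, dims=0)
    pos = disc(features, latent)   # scores for matched pairs
    neg = disc(shuffled, latent)   # scores for mismatched pairs
    # JSD lower bound: E[-softplus(-T_pos)] - E[softplus(T_neg)]
    mi_estimate = -F.softplus(-pos).mean() - F.softplus(neg).mean()
    return -mi_estimate            # minimize the negative MI estimate

# Quick check on random tensors standing in for encoder outputs:
disc = GlobalDiscriminator(feat_dim=128, latent_dim=64)
feats = torch.randn(8, 128)
z = torch.randn(8, 64)
loss = jsd_mi_loss(disc, feats, z)
loss.backward()
```

In the full method this term is combined with the local (per-feature-map) objective and a prior-matching term, but the pattern above is the core of the MI estimation.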

Enjoy!


(Thomas V) #2

Thank you for sharing this! (And kudos for giving credit to the chainer implementation.)

I know I always like the fun pictures best when implementing things, but did you, by any chance, try to reproduce some of their evaluation metric results, e.g. Tables 1 and 2 on page seven? (Again, if you didn’t, it’s totally cool, and I’m grateful that you shared what you have.)

Best regards

Thomas


(Duane Nielsen) #3

Hi Tom, Thanks for the comments.

I had not tried to reproduce the results yet, but I’ll definitely take a look at that. Thanks for the suggestion!

I want to use this implementation for an AI project I’m working on, so the more testing to validate the code the better!

Cheers

Duane


(Duane Nielsen) #4

Just an update on this: I ran the classification test on the 64-dimensional latent space. In the paper, that’s the result in Table 1, row DIM(L) (NCE), column CIFAR10, Y(64).

I’ll put the results up, but I got a classification accuracy of 60.43%, whereas they report 69.13% in the paper.

I’m not able to account for the difference yet, as they haven’t released their code, so it’s possible they’re doing something different from what I’m doing in the model.

Having said that, even 60% is better than what they report for the other encoding methods, so it’s not all bad news.
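For anyone wanting to run the same check, the evaluation in the paper amounts to a probe: freeze the trained encoder, then fit a small classifier on its latent codes and report test accuracy. Here is a hedged sketch of that protocol; the `encoder` below is just a placeholder standing in for the trained DIM encoder, and the layer sizes are assumptions.

```python
import torch
import torch.nn as nn

# Placeholder encoder producing 64-d latents; in the real test this
# would be the trained DIM encoder with its weights loaded.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))
for p in encoder.parameters():
    p.requires_grad = False        # representation stays frozen

probe = nn.Linear(64, 10)          # 10 CIFAR10 classes
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in batch; real evaluation iterates over the CIFAR10 loaders.
images = torch.randn(16, 3, 32, 32)
labels = torch.randint(0, 10, (16,))

with torch.no_grad():
    z = encoder(images)            # frozen 64-d latent codes
logits = probe(z)
loss = loss_fn(logits, labels)
opt.zero_grad()
loss.backward()
opt.step()
```

Note the paper also evaluates non-linear probes (a small MLP on the latents); which probe is used changes the numbers, so it’s worth matching that detail when comparing against Table 1.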


(Devon) #5

Hi, we’ve had an implementation up for a while now, though the NCE version with the “deep bilinear” model isn’t up yet. We are in the process of updating the repo.

That said, 60% is quite low: even with JSD and the concat-style scoring function, you should get at least 64-65% with the global vector (IIRC) and over 71% with the local feature maps. Give our current repo a try (and the updated version once it’s finished), and let me know if you run into the same problems.


(Duane Nielsen) #6

Thanks, I’ll definitely take a look. It’d be good to get a baseline and maybe figure out where I’m going wrong.

Thanks!

Duane


(Duane Nielsen) #7

Also, I put a link to your repo in the readme. Hope you don’t mind.


(Devon) #8

Great! The first big batch of updates has been made to the repo. Enjoy!