Denoising Autoencoder for Multiclass Classification

This is a follow-up to the question I asked a week ago. Thanks to @ptrblck, I followed his advice to use Approach 2 from my question and I am getting better results. However, there still seem to be a few issues.

Here’s a quick intro: I am training an autoencoder for a multiclass classification problem in which 16 equiprobable messages are transmitted through a noisy channel and decoded at the receiver. Specifically, I am trying to reproduce (a modification of) the result in Fig. 3b of https://arxiv.org/pdf/1702.00832.pdf; please refer to Fig. 2 in that paper for the model.

Please have a look at my code here: https://gist.github.com/kirtyvedula/9698a7a728d1484b3bffb0394c1191d7
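For context, here is a rough outline of the kind of model the paper describes and that I am trying to follow. The layer sizes, names, and the n = 7, k = 4 setup are just my reading of the paper, not my actual code (that is in the gist):

```python
import torch
import torch.nn as nn

class ChannelAutoencoder(nn.Module):
    # Sketch of a (n=7, k=4) channel autoencoder: 16 one-hot messages in, 7 channel uses out.
    def __init__(self, n_channel=7, n_messages=16):
        super().__init__()
        self.n_channel = n_channel
        self.transmitter = nn.Sequential(
            nn.Linear(n_messages, n_messages), nn.ReLU(),
            nn.Linear(n_messages, n_channel))
        self.receiver = nn.Sequential(
            nn.Linear(n_channel, n_messages), nn.ReLU(),
            nn.Linear(n_messages, n_messages))  # logits for CrossEntropyLoss

    def normalize(self, x):
        # average power constraint: ||x||^2 = n_channel per codeword
        return self.n_channel ** 0.5 * x / x.norm(dim=-1, keepdim=True)

    def forward(self, x, noise_std):
        x = self.normalize(self.transmitter(x))
        x = x + noise_std * torch.randn_like(x)  # AWGN channel
        return self.receiver(x)
```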

I have a few questions, which I list below. The code runs, but it is very sensitive to the number of training and test examples, the batch size, and the learning rate. I am also having problems running it on a GPU (NVIDIA 1060 Ti). I have a Keras implementation that works perfectly (I can upload it if necessary), but the PyTorch version behaves strangely: it gives me “better” results than it should. For example, the error rates drop far lower than the expected curve, which should not happen.

Eb/N0: 0.0 | test BLER: 0.1104
Eb/N0: 0.5 | test BLER: 0.0932
Eb/N0: 1.0 | test BLER: 0.0752
Eb/N0: 1.5 | test BLER: 0.0611
Eb/N0: 2.0 | test BLER: 0.0504
Eb/N0: 2.5 | test BLER: 0.0389
Eb/N0: 3.0 | test BLER: 0.0312
Eb/N0: 3.5 | test BLER: 0.0222
Eb/N0: 4.0 | test BLER: 0.0162
Eb/N0: 4.5 | test BLER: 0.0121
Eb/N0: 5.0 | test BLER: 0.0082
Eb/N0: 5.5 | test BLER: 0.0058
Eb/N0: 6.0 | test BLER: 0.0031
Eb/N0: 6.5 | test BLER: 0.0023
Eb/N0: 7.0 | test BLER: 0.0012
Eb/N0: 7.5 | test BLER: 0.0007
Eb/N0: 8.0 | test BLER: 0.0002
Eb/N0: 8.5 | test BLER: 0.0001
Eb/N0: 9.0 | test BLER: 0.0000
Eb/N0: 9.5 | test BLER: 0.0000
Eb/N0: 10.0 | test BLER: 0.0000
Eb/N0: 10.5 | test BLER: 0.0000

Sorry if this is too much detail; it is mostly boilerplate digital communications stuff. Please have a look at the general flow of the program and see if I am doing something wrong here.

  1. Do the training and testing loops make sense?
  2. Do the choices of batch size and the number of training and test examples make sense?
  3. Am I printing the right losses? I checked @ptrblck’s suggestion last time and it was not an overflow problem.

Any ideas would be most welcome. Thank you!

  1. The training loss looks alright. I used the provided shapes to check the input and target, and I think y = (y.long()).view(-1) should be unnecessary. Could you verify that?
    The test loss might be alright. I’m not too familiar with your use case, but note that you are not applying the normalization in the test loop. You’ve added the noise addition inside the loop, which should be fine, but I’m not sure if you also need the normalization there. If so, consider writing it as another class method you can call, similar to model.transmitter (see the sketch after this list).

  2. Not sure about that one; it should be use-case dependent.

  3. Also looks alright.
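Something along these lines might work; the names and layer sizes here are just placeholders you would adapt to your model:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_channel=7, n_messages=16):
        super().__init__()
        self.n_channel = n_channel
        self.transmitter = nn.Linear(n_messages, n_channel)
        self.receiver = nn.Linear(n_channel, n_messages)

    def normalization(self, x):
        # same power normalization as used during training
        return self.n_channel ** 0.5 * x / x.norm(dim=-1, keepdim=True)

model = Autoencoder()
test_data = torch.eye(16)   # 16 one-hot test messages (placeholder)
noise_std = 0.5             # placeholder noise level

with torch.no_grad():
    coded = model.normalization(model.transmitter(test_data))
    received = coded + noise_std * torch.randn_like(coded)
    pred = model.receiver(received).argmax(dim=1)
```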

What kind of issues are you seeing using PyTorch?
Is this regarding using the GPU or some other errors?


Thanks a lot, @ptrblck! I think the normalization might be it. That seems to be the cause of the inconsistency.

One other issue is that I am unable to run it on the GPU. I did all the usual checks and made sure my machine (Windows, GeForce GTX 1060) runs standard example code on the GPU. There must be something in my script where I am falling back to the CPU, or something like that. Could you please have a look at my code and check whether I am doing something wrong with the .to(device) calls?
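For reference, this is the device-handling pattern I believe I am following; the model and loader here are just placeholders to show where I call .to(device):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)                              # should print "cuda" if the GPU is visible

model = nn.Linear(16, 16).to(device)       # move the model parameters once
data = TensorDataset(torch.eye(16), torch.arange(16))
loader = DataLoader(data, batch_size=4)

for x, y in loader:
    x, y = x.to(device), y.to(device)      # move every batch inside the loop
    out = model(x)                         # all tensors now live on the same device
```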

Your code runs fine on my GPU without any changes.
Do you get any error message or is the utilization at 0%?

Oh, that’s good to hear. It does not give me any errors, but it doesn’t look like it is utilizing my GPU fully. The process does show up in nvidia-smi, though. Am I reading it wrong, or does my GPU seem to be working? I am concerned because my Keras implementation of the same algorithm behaves differently when it runs on the GPU.

Based on the output of nvidia-smi it seems your GPU is working.

Regarding the task manager output: there should be a “compute” tab you could activate somehow to see the actual compute workload. I’m not that familiar with Windows, but we had similar issues here in the forum, where the Windows task manager didn’t show the actual GPU usage.

PS: You could speed up your code by:

  • Precomputing what appears to be fixed in your model. E.g. the sqrt calculation seems to be fixed at sqrt(7), so you could register it as a buffer in __init__ via self.register_buffer('sqrt', torch.sqrt(torch.tensor(7.))).
  • Using attributes instead of globals: self.k instead of k (same for n_channel).
  • Using multiple workers (num_workers > 0) in your DataLoader.

I haven’t profiled the code, but these changes seem to yield a speedup just from looking at the outputs.
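To make the first points concrete, a rough sketch (the k = 4 / n_channel = 7 values are guesses based on your description):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class Transmitter(nn.Module):
    def __init__(self, k=4, n_channel=7):
        super().__init__()
        self.k = k                      # store as attributes instead of using globals
        self.n_channel = n_channel
        self.fc = nn.Linear(2 ** k, n_channel)
        # precompute the constant once instead of calling torch.sqrt in every forward pass
        self.register_buffer('sqrt', torch.sqrt(torch.tensor(float(n_channel))))

    def forward(self, x):
        x = self.fc(x)
        # the buffer replaces a torch.sqrt call that would otherwise run every iteration
        return self.sqrt * x / x.norm(dim=-1, keepdim=True)

loader = DataLoader(TensorDataset(torch.eye(16), torch.arange(16)),
                    batch_size=4, num_workers=2)   # multiple worker processes
```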


These worked, in addition to putting my code in a dedicated main function and calling it. Thank you so much for your help! However, I am still not able to see any activity on the compute_0 and compute_1 tabs in Task Manager. That’s okay for now, I think, since I do see some improvement in speed.
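In case it helps someone else on Windows: the num_workers > 0 suggestion only worked for me once everything was wrapped in a main function, roughly like this:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def main():
    loader = DataLoader(TensorDataset(torch.eye(16), torch.arange(16)),
                        batch_size=4, num_workers=2)
    for x, y in loader:      # worker processes are spawned here
        pass                 # training / evaluation goes here

# On Windows, DataLoader workers are started via spawn, so the entry point
# must be protected by this guard or the script re-executes itself.
if __name__ == '__main__':
    main()
```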