# How to prevent inf while working with exponential

Hi there,

I’m trying to create a function in a network with trainable parameters. In my function I have an exponential that for large tensor values goes to infinity. What would the best way to avoid this be?

The function is as follows:

``````step1 = Pss-(k*Pvv)
step2 = step1*s
step3 = torch.exp(step2)
step4 = torch.log10(1+step3)
step5 = step4/s

#or equivalently
# train_curve = torch.log(1+torch.exp((Pss-k*Pvv)*s))/s
``````

If it makes it easier to understand, the basic function is log10(1+e^(x-const)*10)/10. The exponential inside the log gets too big and goes to inf.

I think I might have to normalize my tensor x, and this would mean normalizing the constants and the rest of the function also. Would someone have any thoughts on the best way to go about this?

Thanks so much.

Hi Clement!

(Also, of course, although related, `log10()` is not the same as
`log()` (“natural log,” log base-e").

Best.

K. Frank

I tried this but couldn’t get it to work for my application. I’m going to give it another go, and possibly modify the source code to use base 10. Should it be as simple as replacing torch.log with `torch.` `logsumexp()`?

Hello Clement!

Please feel free to post a short, complete, runnable script
that has the original code you posted (that fails when the
tensor values become to large), together with your improved
attempt (using `logsumexp()` or whatever). If you show us
what fails or what the discrepancy is, someone here will

In isolation, such a direct replacement won’t work.

First, note that `log10 (x) = log (x) / log (10)`.

Second, you wish to calculate
`torch.log (1 + torch.exp (step2))`. Because
`torch.exp (0) = 1`, this can be written as
`torch.log (torch.exp (0) + torch.exp (step2))`,
for which you can use `torch.logsumexp()`.

Since you are working with tensors, I imagine that you would
add a new dimension of length 2 to your tensor. Along this
dimension, the first element would be that of your original
tensor, and the second element would be 0. You would then
apply `torch.logsumexp()` along the new dimension, e.g.,
`torch.logsumexp (my_new_tensor, dim = 0)`, where
the extra new dimension added to create `my_new_tensor` is,
in this example, `dim = 0`.

Good luck!

K. Frank

Hi K. Frank,

Thank you very much for your response. I’ve actually learned a bit from some of your other responses on the forum in previous work. I’ll give some more context to what I’m doing and more code. You can save time by skipping to the end to see what I’ve attempted. I think it could be too much to share everything needed to run a test of this module as an image and other modules are involved.

I’m training an autoencoder to denoise an image by

1. Using the autoencoder/UNET to output a blurry mask/residual image.
2. Subtracting this mask from the original image to get a noisy texture map.
3. Filtering this noisy texture map in the frequency domain with a Wiener filter.
4. Adding the output of (3) back onto the output of the autoencoder to restore detail to the final image. Basically, a two-part denoiser where the network implicitly learns to creates a mask over some parts of the image and lets the wiener filter take over.

I’m trying to make the Wiener filter learn a mapping for attenuating certain frequency magnitudes, instead of just making it a function of noise variance` Pvv` to signal power spectrum `Pss`.

Here is what I had before, where H is the filter vector to be applied to each value in the frequency block tensor:

``````            freq_block = torch.rfft(win_data_block, win_data_block.ndim, onesided=False)
Pss = torch.abs(freq_block)**2
Pss = torch.sum(Pss, 2)
Pss = Pss.double()
H = torch.max((Pss-Pvv), torch.zeros(Pss.size(), dtype=torch.double)) / Pss

H = H.unsqueeze(2).repeat(1, 1, 2)

filt_freq_block = H*freq_block
``````

This is comparing the power spectrum magnitude of the signal to the noise power and penalises the signal for a high noise. Pss represents a 16x16 window (lets say between 0 and 30) and Pvv is the constant noise power (lets say 1.5). I want to make a new frequency coring function based on a function with trainable parameters Here is the filter I am working with, with the problem we are discussing. Most of the code is dealing with indexing and figuring out boundaries so I’ll put the part I’m struggling with in bold:

``````

import numpy as np
import torch

noise = torch.randn(img.size()) * (std)
noisy_image = img + noise
return noisy_image

def wiener_3d(I, noise_std, block_size,k,s):

width = I.shape[1]
height = I.shape[0]
IR = torch.zeros(height, width, dtype=torch.float64)
# if(len(list(I.shape)) >= 3):
#    frames = I.shape[2]
# else:
#   bt = 1
bt = 1
bx = block_size
by = block_size

hbx = bx/2
hby = by/2
hbt = bt/2

sx = (width + hbx - 1)/hbx
sy = (height + hby - 1)/hby

win = torch.ones(by, bx, bt)
win1x = torch.cos((torch.arange(-hbx + .5, hbx - .5 + 1)/bx) * np.pi)
win1y = torch.cos((torch.arange(-hby + .5, hby - .5 + 1)/by) * np.pi)
win1t = torch.cos((torch.arange(-hbt + .5, hbt - .5 + 1)/bt) * np.pi)

for x in range(bx):
for y in range(by):
for t in range(bt):
win[y, x, t] = win1y[y]*win1x[x]*win1t[t]

if(bt == 1):
win = torch.squeeze(win)

Pvv = torch.mean(torch.pow(win, 2))*torch.numel(win)*(noise_std**2)
Pvv = Pvv.double()
bx0 = torch.range(0, bx-1)
by0 = torch.range(0, by-1)

for x in range(0, int((hbx*sx)), int(hbx)):
for y in range(0, int((hby*sy)), int(hby)):
# print(x,y)
tx = np.arange(x-hbx+1, x+hbx+1)
validx = np.arange(np.maximum(-tx[0], 0), bx - np.maximum((tx[-1]-width+1), 0))
cx = np.minimum(np.maximum(tx, 0), width-1)
validx = validx.astype(int)
rcx = torch.as_tensor(tx[validx], dtype=torch.long)
bcx = torch.as_tensor(bx0[validx], dtype=torch.long)

ty = np.arange(y-hby+1, y+hby+1)
validy = np.arange(np.maximum(-ty[0], 0), by - np.maximum((ty[-1]-width+1), 0))
cy = np.minimum(np.maximum(ty, 0), width-1)
validy = validy.astype(int)
rcy = torch.as_tensor(ty[validy], dtype=torch.long)
bcy = torch.as_tensor(by0[validy], dtype=torch.long)

cy = torch.as_tensor(cy, dtype=torch.long)
cx = torch.as_tensor(cx, dtype=torch.long)
data_block = torch.index_select(I, 0, cy)
data_block = torch.index_select(data_block, 1, cx)

mean_block = torch.mean(data_block)
win_data_block = (data_block - mean_block)*win

freq_block = torch.rfft(win_data_block, win_data_block.ndim, onesided=False)
Pss = torch.abs(freq_block)**2
Pss = torch.sum(Pss, 2)
Pss = Pss.double()

step1 = Pss-(k*Pvv)
step2 = step1*s
step3 = torch.exp(step2)
zeros = torch.zeros_like(step3)
step4 = torch.logsumexp([step3, zeros], 0)
step5 = step4/s

# train_curve = torch.log(1+torch.exp((Pss-k*Pvv)*s))/s
H = torch.max(step5, torch.zeros(Pss.size(), dtype=torch.double))
#H = H / normalised_Pss
# H[H != H] = 0
H = H.unsqueeze(2).repeat(1, 1, 2)

filt_freq_block = H*freq_block
filt_data_block = torch.irfft(filt_freq_block, win_data_block.ndim, onesided=False)
filt_data_block = (filt_data_block + mean_block*win) * win
# hbt = torch.round(hbt)

filt_data_block = torch.index_select(filt_data_block, 0, bcy)
filt_data_block = torch.index_select(filt_data_block, 1, bcx)
IR[rcy[0]:rcy[-1] + 1, rcx[0]:rcx[-1] + 1] = IR[rcy[0]:rcy[-1] + 1, rcx[0]:rcx[-1] + 1] + filt_data_block
return IR

``````

and again the part I’m struggling with

``````
step1 = Pss-(k*Pvv)
step2 = step1*s
step3 = torch.exp(step2)
zeros = torch.zeros_like(step3)
step4 = torch.logsumexp([step3, zeros], 0)
step5 = step4/s

# train_curve = torch.log(1+torch.exp((Pss-k*Pvv)*s))/s
H = torch.max(step5, torch.zeros(Pss.size(), dtype=torch.double))
H = H.unsqueeze(2).repeat(1, 1, 2)

filt_freq_block = H*freq_block
``````

I’m not calling the function properly but I thought I’d show you what I’ve done so far. I don’t 100% understand what you mean when you say add a new dimension of length 2.

Thanks a million,
Clément

Hello Clement!

I’m suggesting that you post a test script just for your log-exp
calculation.

Something like:

``````import torch

sample_tensor =  something_like_torch_randn()

original_result = torch.log (1 + torch.exp (sample_tensor))

new_result = something_with_torch_logsumexp (sample_tensor)
``````

This should be a complete, runnable script (that is simple,
totally self-contained, and doesn’t depend on any of your
other modules or outside data). If your `new_result` doesn’t
run, we can see the errors and make suggestions. If it does
run, but doesn’t give the right answer, we can look for the
cause of the discrepancy.

(Your `original_result` will be our baseline for comparison.
It should be correct for moderate tensor values. It’s okay if it
becomes `inf` for large tensor values because that’s the issue