Why does the .clamp function exist?

I was looking at the example:


and it has this line:

# Forward pass: compute predicted y using operations on Variables; these
# are exactly the same operations we used to compute the forward pass using
# Tensors, but we do not need to keep references to intermediate values since
# we are not implementing the backward pass by hand.
y_pred = x.mm(w1).clamp(min=0).mm(w2)

I read the documentation for clamp:

torch.clamp(input, min, max, out=None) → Tensor
Clamp all elements in input into the range [min, max] and return a resulting Tensor.


but it didn’t make sense to me. Why do we need to do such a weird thing? TensorFlow doesn’t “clamp” anything during matrix multiplication, so why does PyTorch?


clamp(min=0) is exactly ReLU.
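A quick sketch to confirm the equivalence (the tensor values here are made up for illustration):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])

# clamp(min=0) zeroes out negatives and leaves positives untouched,
# which is exactly what ReLU does.
assert torch.equal(x.clamp(min=0), F.relu(x))
```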


Oh, I see. Thanks. So there is no relu function? We just use clamp?

I was very confused because I was just trying to implement linear regression, and most examples I’ve seen had a seemingly random clamp call in them, which confused me.

Of course there is. ReLU in PyTorch is torch.nn.functional.relu. So

import torch.nn.functional as F
y_pred = F.relu(x.mm(w1)).mm(w2)

should work in the same way.


nice… :+1:

Someone would use clamp to clip the rewards in reinforcement learning.
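For example, DQN-style reward clipping into [-1, 1] is just a clamp (a minimal sketch; the reward values are made up):

```python
import torch

rewards = torch.tensor([-5.0, -0.3, 0.0, 2.0, 100.0])

# Clip every reward into [-1, 1], a common trick to keep value
# estimates comparable across environments with very different
# score scales.
clipped = rewards.clamp(min=-1.0, max=1.0)
```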


Or to clip the weight values in a Wasserstein GAN.
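For context, the WGAN weight-clipping step might be sketched like this (c = 0.01 is the value used in the paper; the critic here is a stand-in module, not the actual architecture):

```python
import torch
import torch.nn as nn

# Hypothetical critic network, just for illustration.
critic = nn.Linear(10, 1)

# After each optimizer step, WGAN clips every weight into [-c, c]
# as a crude way to enforce a Lipschitz constraint on the critic.
c = 0.01
with torch.no_grad():
    for p in critic.parameters():
        p.clamp_(-c, c)
```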


There are relu and relu6. But what if you want something like relu7?
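In that case clamp gives it to you directly; a sketch of a hypothetical “relu7” following the same pattern as relu6:

```python
import torch

def relu7(x):
    # relu6 is clamp(min=0, max=6); the same pattern extends
    # to any cap you like.
    return x.clamp(min=0, max=7)

x = torch.tensor([-3.0, 2.0, 7.5, 10.0])
# relu7(x) is tensor([0., 2., 7., 7.])
```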


I think I have a pretty good grasp of the situations that might lead one to use ReLU6, but I’m kinda interested in its history. Got any good links?


That’s exactly the same as numpy.clip.
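A quick check of the correspondence (values are made up):

```python
import numpy as np
import torch

a = np.array([-2.0, 0.5, 9.0])
t = torch.from_numpy(a)

# numpy.clip and torch.clamp do the same elementwise bounding.
assert np.allclose(np.clip(a, 0, 7), torch.clamp(t, 0, 7).numpy())
```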


import torch.nn.functional as F
y_pred = F.threshold(x.mm(w1), 0, 7).mm(w2)

Is it right?

I don’t think F.threshold can accept min and max.



import torch.nn.functional as F
y_pred = F.hardtanh(x.mm(w1), 0, 7).mm(w2)

I found my mistake. Not threshold.

Just use hardtanh!!

In detection, I just found code that uses clamp to ensure that the right point minus the left point is greater than 0 (i.e., box widths stay non-negative).

You could also use clamp() to constrain the range of your update tensors in SGD optimization.
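A toy sketch of that idea (the bound 0.05 and the tiny loss are made up; this caps each parameter change, one crude guard against exploding updates):

```python
import torch

w = torch.randn(3, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()

lr = 0.1
with torch.no_grad():
    # Clamp the raw update into [-0.05, 0.05] before applying it.
    update = (lr * w.grad).clamp(min=-0.05, max=0.05)
    w -= update
```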