Batch normalisation value overflow

nn.BatchNorm1d() throws the following error:

lib\site-packages\torch\nn\functional.py", line 2450, in batch_norm
    return torch.batch_norm(
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

This happens with a batch size of 8 or 16, but everything works fine with a batch size of 2 or 4.

This is strange, since the data is not that large and I have 24 GB of VRAM.

What could be the cause, and what can I do about it?

Could you post a minimal and executable code snippet to reproduce the issue, please?
Also, could you post the output of python -m torch.utils.collect_env?

Hi, thank you for your response.

Code snippet:

import torch
from torch import nn
from torch.nn import Parameter


class BatchNorm(nn.Module):

    def __init__(self, input_dim: int, use_batch_normalization: bool = True, momentum: float = 0.1,
                 track_running_stats: bool = True) -> None:

        super().__init__()
        self.input_dim = input_dim
        self.momentum = momentum
        self.use_batch_normalization = use_batch_normalization
        if self.use_batch_normalization:
            self.batch_norm = nn.BatchNorm1d(input_dim, momentum=momentum, track_running_stats=track_running_stats)
        else:
            self.bias = Parameter(torch.zeros(input_dim, dtype=torch.float32), requires_grad=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass of batch normalization block.

        :param x: Input of shape `(N, D)` or `(N, K, D)` where `N = number of points`, `K = number of neighbors`, and
            `D = number of feature channels`.
        :type x: torch.Tensor
        :return: Normalized output of the same shape as the input.
        :rtype: torch.Tensor
        """

        if self.use_batch_normalization:
            if x.dim() == 2:
                return self.batch_norm(x)
            if x.dim() == 3:
                # (N, K, D) -> (N, D, K)
                x = x.transpose(1, 2).contiguous()
                # (N, D, K)
                output = self.batch_norm(x)
                # (N, D, K) -> (N, K, D)
                return output.transpose(1, 2).contiguous()

            raise ValueError(f"Input dimension of batch normalization block should be 2 or 3, got {x.dim()}.")
        else:
            return x + self.bias

    def __repr__(self) -> str:
        # Same output as before, written as a single f-string.
        return (f'BatchNormBlock(in_feat: {self.input_dim:d}, '
                f'momentum: {self.momentum:.3f}, '
                f'only_bias: {not self.use_batch_normalization})')

The block is used inside the following module:

# requires: from collections import OrderedDict
self.linear_pos_bias = nn.Sequential(OrderedDict([
    ('linear_1', nn.Linear(3, self.feature_dim, bias=False)),
    ('bn', BatchNorm(self.feature_dim)),
    ('relu', nn.ReLU(inplace=True)),
    ('linear_2', nn.Linear(self.feature_dim, self.feature_dim))
]))
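
For reference, a self-contained version of the snippet might look like the sketch below. It is not the original code: `TransposedBatchNorm` and `run_once` are hypothetical names, only the 3-D `(N, K, D)` branch is reproduced, and the sizes (number of points, number of neighbors, feature dimension) are assumptions, since the real values are not given here. It falls back to the CPU when no GPU is available.

```python
from collections import OrderedDict

import torch
from torch import nn


class TransposedBatchNorm(nn.Module):
    """Hypothetical stand-in for the 3-D branch of the BatchNorm block."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.bn = nn.BatchNorm1d(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (N, K, D) -> (N, D, K), normalise over D, then back to (N, K, D)
        x = x.transpose(1, 2).contiguous()
        return self.bn(x).transpose(1, 2).contiguous()


def run_once(num_points, num_neighbors=16, feature_dim=32, device=None):
    """Run one forward pass on random positions and return the output shape."""
    # Assumed sizes; pick the GPU only if one is actually present.
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    model = nn.Sequential(OrderedDict([
        ('linear_1', nn.Linear(3, feature_dim, bias=False)),
        ('bn', TransposedBatchNorm(feature_dim)),
        ('relu', nn.ReLU(inplace=True)),
        ('linear_2', nn.Linear(feature_dim, feature_dim)),
    ])).to(device)
    pos = torch.randn(num_points, num_neighbors, 3, device=device)
    out = model(pos)
    if torch.device(device).type == "cuda":
        # Force pending kernels to finish so errors surface here.
        torch.cuda.synchronize()
    return tuple(out.shape)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):  # assumed point counts
        print(n, run_once(n))
```

Calling `run_once` with increasing `num_points` on the GPU, ideally with the environment variable `CUDA_LAUNCH_BLOCKING=1` set, may help narrow down the input size at which the error first appears.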

Error:

  File "x:\Transformer.py", line 206, in forward
    peb = self.linear_pos_bias(pos)
  File "x:\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "x:\lib\site-packages\torch\nn\modules\container.py", line 204, in forward
    input = module(input)
  File "x:\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "x:\blocks\batch_norm.py", line 65, in forward
    output = self.batch_norm(x)
  File "x:\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "x:\lib\site-packages\torch\nn\modules\batchnorm.py", line 171, in forward
    return F.batch_norm(
  File "x:\lib\site-packages\torch\nn\functional.py", line 2450, in batch_norm
    return torch.batch_norm(
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

Environment information:

Collecting environment information...
PyTorch version: 1.13.0+cu116
Is debug build: False
CUDA used to build PyTorch: 11.6
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Pro
GCC version: Could not collect
Clang version: Could not collect
CMake version: version 3.23.3
Libc version: N/A

Python version: 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22621-SP0
Is CUDA available: True
CUDA runtime version: 11.6.55
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090 Ti
Nvidia driver version: 522.06
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\cudnn_ops_train64_8.dll
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==0.991
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.2
[pip3] torch==1.13.0+cu116
[pip3] torch-cluster==1.6.1+pt113cu116
[pip3] torch-geometric==2.3.1
[pip3] torch-scatter==2.1.1+pt113cu116
[pip3] torch-sparse==0.6.17+pt113cu116
[pip3] torch-spline-conv==1.2.2+pt113cu116
[pip3] torchaudio==0.13.0+cu116
[pip3] torchvision==0.14.0+cu116
[conda] Could not collect

Could you update to the latest PyTorch release and check if you would still see the issue?

Hello,

1.13.1+cu117 is the highest version I can use while keeping all the tests in my code base passing, since the code base has not yet been migrated to the latest PyTorch version. That migration has to happen eventually, but until then, if you can think of another workaround, please let me know!

Thank you for your help.

PyTorch version: 1.13.1+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Pro
GCC version: Could not collect
Clang version: Could not collect
CMake version: version 3.23.3
Libc version: N/A

Python version: 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22621-SP0
Is CUDA available: True
CUDA runtime version: 11.7.64
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090 Ti
Nvidia driver version: 522.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==1.5.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.25.2
[pip3] torch==1.13.1+cu117
[pip3] torch-cluster==1.6.1+pt113cu117
[pip3] torch-geometric==2.3.1
[pip3] torch-scatter==2.1.1+pt113cu117
[pip3] torch-sparse==0.6.17+pt113cu117
[pip3] torch-spline-conv==1.2.2+pt113cu117
[pip3] torchaudio==0.13.1+cu117
[pip3] torchvision==0.14.1+cu117
[conda] Could not collect
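
One workaround that might be worth trying while staying on 1.13.1 is to disable cuDNN only around the batch-norm call, so that PyTorch uses its native implementation instead. This is a sketch under the assumption that the failure is specific to the cuDNN batch-norm path, which has not been confirmed in this thread. `NoCudnnBatchNorm1d` is a hypothetical name; `torch.backends.cudnn.flags` is the existing context manager for temporarily overriding the cuDNN settings.

```python
import torch
from torch import nn


class NoCudnnBatchNorm1d(nn.BatchNorm1d):
    """Hypothetical drop-in for nn.BatchNorm1d that bypasses cuDNN."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Temporarily turn cuDNN off for this call only; the previous
        # settings are restored when the context manager exits.
        with torch.backends.cudnn.flags(enabled=False):
            return super().forward(x)
```

Swapping `nn.BatchNorm1d(...)` for `NoCudnnBatchNorm1d(...)` inside the `BatchNorm` block would leave the rest of the model untouched. If the error then moves to a different operation, the cause is more likely an earlier asynchronous CUDA failure than batch normalisation itself.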