Hi all,
I am new to PyTorch and computer vision. I ran into a problem while trying to use mmaction2 to extract features from video clips. Following the tutorial from here, I tried to run a test on a single video with this command:
python3 tools/misc/clip_feature_extraction.py \
configs/recognition/i3d/i3d_r50_video_32x2x1_100e_kinetics400_rgb.py \
pretrained/i3d_r50_video_32x2x1_100e_kinetics400_rgb_20200826-e31c6f52.pth \
--video-list examples/inputs/video_list_single.txt \
--video-root examples/inputs/video \
--out examples/outputs/examples_feature.pkl
However, I got a RuntimeError: CUDA error: unknown error.
load checkpoint from local path: pretrained/i3d_r50_video_32x2x1_100e_kinetics400_rgb_20200826-e31c6f52.pth
[ ] 0/1, elapsed: 0s, ETA:Traceback (most recent call last):
File "tools/misc/clip_feature_extraction.py", line 229, in <module>
main()
File "tools/misc/clip_feature_extraction.py", line 217, in main
outputs = inference_pytorch(args, cfg, distributed, data_loader)
File "tools/misc/clip_feature_extraction.py", line 118, in inference_pytorch
outputs = single_gpu_test(model, data_loader)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/engine/test.py", line 33, in single_gpu_test
result = model(return_loss=False, **data)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 50, in forward
return super().forward(*inputs, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmaction/models/recognizers/base.py", line 264, in forward
return self.forward_test(imgs, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmaction/models/recognizers/recognizer3d.py", line 99, in forward_test
return self._do_test(imgs).cpu().numpy()
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmaction/models/recognizers/recognizer3d.py", line 63, in _do_test
feat = self.extract_feat(imgs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
return old_func(*args, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmaction/models/recognizers/base.py", line 163, in extract_feat
x = self.backbone(imgs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmaction/models/backbones/resnet3d.py", line 854, in forward
x = res_layer(x)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmaction/models/backbones/resnet3d.py", line 318, in forward
out = _inner_forward(x)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmaction/models/backbones/resnet3d.py", line 305, in _inner_forward
out = self.conv1(x)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/cnn/bricks/conv_module.py", line 201, in forward
x = self.conv(x)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/cnn/bricks/wrappers.py", line 80, in forward
return super().forward(x)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 590, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/xxx/miniconda3/envs/mmlab/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 586, in _conv_forward
input, weight, bias, self.stride, self.padding, self.dilation, self.groups
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
If I set CUDA_LAUNCH_BLOCKING=1, i.e., run CUDA_LAUNCH_BLOCKING=1 python3 ..., nothing more is shown.
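For reference, the same setting can also be applied from Python by passing a modified environment to a child process. This is only a sketch: the helper name `run_with_blocking_launches` is hypothetical, and the child command below is a stand-in that merely echoes the variable, not the real extraction command.

```python
import os
import subprocess
import sys


def run_with_blocking_launches(cmd):
    """Run cmd with CUDA_LAUNCH_BLOCKING=1 set and return the completed process."""
    env = dict(os.environ)             # copy, so the parent environment is untouched
    env["CUDA_LAUNCH_BLOCKING"] = "1"  # must be set before the child initializes CUDA
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child process: it only prints the variable it received.
    probe = [sys.executable, "-c",
             "import os; print(os.environ.get('CUDA_LAUNCH_BLOCKING'))"]
    result = run_with_blocking_launches(probe)
    print(result.stdout.strip())
```

The variable has to be in the environment before the process first touches CUDA, which is why it is set on the child's environment rather than inside an already running script.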
I am not sure what causes the error, but I suspect a CUDA or PyTorch setup problem, since the same code works properly on another machine. FYI, the environments of the two machines are listed below.
| | Device 1 (has error) | Device 2 (no error) |
|---|---|---|
| Platform | WSL2, Ubuntu 20.04.3 | WSL2, Ubuntu 20.04.3 |
| GPU | GeForce GTX 1080 Ti, Driver=510.06, CUDA=11.6 | GeForce RTX 2060, Driver=510.06, CUDA=11.6 |
| PyTorch | pytorch=1.10.1, py=3.7, cuda=11.3.1 | pytorch=1.10.1, py=3.7, cuda=11.3.1 |
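Since the traceback ends inside a plain convolution call, one way to separate an mmaction2 problem from a CUDA or PyTorch setup problem would be a tiny standalone 3D convolution on the GPU, run on both machines. The following is a minimal sketch under assumptions: the function name `conv3d_smoke_test`, the layer sizes, and the input shape are arbitrary choices for illustration and are not taken from the I3D config.

```python
try:
    import torch
except ImportError:  # keeps the script usable on a machine without PyTorch
    torch = None


def conv3d_smoke_test():
    """Run one small Conv3d forward pass on the GPU and report what happened."""
    if torch is None:
        return "torch-missing"
    if not torch.cuda.is_available():
        return "cuda-unavailable"
    try:
        # Arbitrary small layer and input (assumed sizes, not the I3D ones).
        conv = torch.nn.Conv3d(3, 8, kernel_size=3, padding=1).cuda()
        x = torch.randn(1, 3, 8, 32, 32, device="cuda")
        with torch.no_grad():
            y = conv(x)
        torch.cuda.synchronize()  # force any asynchronous kernel error to surface here
    except RuntimeError as err:
        return "failed: {}".format(err)
    return "ok: output shape {}".format(tuple(y.shape))


if __name__ == "__main__":
    if torch is not None:
        print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    print(conv3d_smoke_test())
```

If this fails on Device 1 with the same error, the problem would be below mmaction2 (driver, WSL2 CUDA support, or the PyTorch build); if it passes, the model or data pipeline would be the next thing to look at.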
My question is: what causes the error, and how can I fix it? Thanks very much.