Thank you for your answer. Here are the results:
With num_workers from 0 to 7, it works well:
num_workers=0
2.87158203125MB allocated
2.87158203125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
With num_workers from 8 to 11, it sometimes raises an error while running, for example:
num_workers=11
2.87158203125MB allocated
2.87158203125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
1.14892578125MB allocated
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\giang\Desktop\DACON_landmark\test.py", line 5, in <module>
import torch
File "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\__init__.py", line 117, in <module>
raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\lib\cudnn_adv_infer64_8.dll" or one of its dependencies.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\giang\Desktop\DACON_landmark\test.py", line 5, in <module>
import torch
File "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\__init__.py", line 117, in <module>
raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\giang\Desktop\DACON_landmark\test.py", line 5, in <module>
import torch
File "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\__init__.py", line 117, in <module>
raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\lib\cudnn_adv_infer64_8.dll" or one of its dependencies.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\giang\anaconda3\envs\working\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\giang\Desktop\DACON_landmark\test.py", line 5, in <module>
import torch
File "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\__init__.py", line 117, in <module>
raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.
Traceback (most recent call last):
File "test.py", line 82, in <module>
for i, data in enumerate(train_dataloader):
File "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
return self._get_iterator()
File "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\utils\data\dataloader.py", line 301, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\giang\anaconda3\envs\working\lib\site-packages\torch\utils\data\dataloader.py", line 885, in __init__
w.start()
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\giang\anaconda3\envs\working\lib\multiprocessing\popen_spawn_win32.py", line 72, in __init__
None, None, False, 0, env, None, None)
OSError: [WinError 1455] The paging file is too small for this operation to complete
With num_workers from 12 to 14, the allocated memory is sometimes printed for 2 or 3 epochs before the error is raised.
With num_workers from 14 to 16, it fails from the beginning.
It also reported CUDA out-of-memory several times, but when I checked with nvidia-smi, the GPU memory was still empty.
My CPU: Intel Core i7-10700K
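
Since the failures between 8 and 16 workers are intermittent, a sweep like the one reported above could be scripted so that every worker count is tried the same way and the outcome is recorded. This is only a sketch: `sweep_workers`, `run_once`, and `fake_run` are hypothetical names that do not come from test.py, and `run_once(n)` is assumed to build the DataLoader with num_workers=n and iterate over it once.

```python
from typing import Callable, Dict, Iterable


def sweep_workers(run_once: Callable[[int], None],
                  worker_counts: Iterable[int]) -> Dict[int, str]:
    """Call run_once(n) for each worker count and record what happened."""
    results: Dict[int, str] = {}
    for n in worker_counts:
        try:
            # Assumption: run_once builds the DataLoader with num_workers=n
            # and iterates over it once.
            run_once(n)
            results[n] = "ok"
        except (OSError, RuntimeError) as exc:
            # WinError 1455 is raised as OSError; CUDA out-of-memory
            # is raised as a RuntimeError (or a subclass of it).
            results[n] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Stand-in for the real test: simulates a failure at 8 or more workers.
    def fake_run(n: int) -> None:
        if n >= 8:
            raise OSError("paging file is too small (simulated)")

    for n, outcome in sweep_workers(fake_run, range(0, 17)).items():
        print(f"num_workers={n}: {outcome}")
```

Running each count several times would help separate the "sometimes fails" range from the "always fails" range. The `if __name__ == "__main__":` guard matters on Windows, because each spawned worker re-imports the main script, as the tracebacks through spawn.py show.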