Random training freeze/shutdown

I noticed I can reproduce the error pretty quickly if I work with a lot of other applications open. The problem is that it crashes the whole system, so I’m not able to capture the messages coming from the debugger.
Would it help if I managed to upload a small snippet of code that triggers the error?

If your whole system crashes, it is almost certainly a hardware- or OS-configuration-specific issue, and as such won’t manifest on a different machine. It sounds like your system is still unstable under stress. Check the system logs for hints. For Linux freezes, try https://en.wikipedia.org/wiki/Magic_SysRq_key.
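
As a rough illustration of checking the logs, the sketch below filters syslog-style lines for kernel messages that usually accompany memory pressure. The pattern list and the helper name find_memory_events are assumptions made for this example, not an exhaustive or official catalogue; the sample lines are in the format Ubuntu writes to /var/log/syslog.

```python
import re

# Substrings that commonly mark memory pressure in Linux kernel logs.
# This list is an illustrative assumption, not an exhaustive catalogue.
MEMORY_PATTERNS = re.compile(
    r"page allocation failure|out of memory|oom-killer|Manual OOM execution",
    re.IGNORECASE,
)


def find_memory_events(log_lines):
    """Return the kernel log lines that mention a memory-pressure event."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if " kernel: " in line and MEMORY_PATTERNS.search(line)
    ]


# Sample syslog lines (one kernel warning, one unrelated line, one SysRq event).
sample = [
    "Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570530] gnome-shell: "
    "page allocation failure: order:4, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), "
    "nodemask=(null),cpuset=/,mems_allowed=0",
    "Nov 19 14:58:58 lorenzo-B450M-DS3H systemd[1]: Starting Location Lookup Service...",
    "Nov 19 14:59:01 lorenzo-B450M-DS3H kernel: [18134.399942] sysrq: Manual OOM execution",
]

events = find_memory_events(sample)
for event in events:
    print(event)
```

On a real machine the same function can be fed an open log file, since it only iterates over lines.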

Maybe I’ve found something interesting.
I’m just starting to read through this, but maybe it is related: https://bbs.archlinux.org/viewtopic.php?id=260131
I’ve posted everything up to my manual OOM kill; let me know if the rest of the log may be useful.

Nov 19 14:58:00 lorenzo-B450M-DS3H gnome-shell[10810]: [Child 10810, MediaDecoderStateMachine #1] WARNING: Decoder=7f2ed542cc00 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - RefPtr<MediaSourceTrackDemuxer::SamplesPromise> mozilla::MediaSourceTrackDemuxer::DoGetSamples(int32_t): manager is detached.: file /build/firefox-m6W5sP/firefox-83.0+build2/dom/media/MediaDecoderStateMachine.cpp:3471
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570530] gnome-shell: page allocation failure: order:4, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570538] CPU: 1 PID: 1570 Comm: gnome-shell Tainted: P           OE     5.4.0-53-generic #59-Ubuntu
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570539] Hardware name: Gigabyte Technology Co., Ltd. B450M DS3H/B450M DS3H-CF, BIOS F50 11/27/2019
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570540] Call Trace:
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570550]  dump_stack+0x6d/0x9a
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570555]  warn_alloc.cold+0x7b/0xdf
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570558]  __alloc_pages_slowpath+0xe07/0xe50
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570561]  ? __switch_to_asm+0x40/0x70
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570563]  ? __switch_to_asm+0x34/0x70
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570564]  ? __switch_to_asm+0x34/0x70
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570566]  ? __switch_to_asm+0x34/0x70
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570568]  ? __switch_to_asm+0x40/0x70
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570569]  ? __switch_to_asm+0x34/0x70
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570571]  ? get_page_from_freelist+0x6b/0x390
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570574]  __alloc_pages_nodemask+0x2d0/0x320
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570576]  alloc_pages_current+0x87/0xe0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570579]  kmalloc_order+0x1f/0x80
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570581]  kmalloc_order_trace+0x24/0xa0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570583]  __kmalloc+0x220/0x280
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570602]  nvkms_alloc+0x25/0x60 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570620]  _nv002715kms+0x16/0x30 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570636]  ? _nv002590kms+0x4e/0x1610 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570638]  ? prep_new_page+0x128/0x160
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570641]  ? __alloc_pages_nodemask+0x173/0x320
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570643]  ? alloc_pages_current+0x87/0xe0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570644]  ? kmalloc_order+0x63/0x80
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570646]  ? kmalloc_order_trace+0x24/0xa0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570659]  ? _nv000746kms+0x40/0x40 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570661]  ? __kmalloc+0x220/0x280
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570673]  ? _nv000746kms+0x40/0x40 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570694]  ? _nv002688kms+0x906/0xae0 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570707]  ? _nv000746kms+0x40/0x40 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570708]  ? __kmalloc+0x220/0x280
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570711]  ? _copy_from_user+0x3e/0x60
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570724]  ? _nv000746kms+0x40/0x40 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570737]  ? nvKmsIoctl+0x96/0x1d0 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570750]  ? nvkms_ioctl_common+0x42/0x80 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570763]  ? nvkms_ioctl+0xc7/0x100 [nvidia_modeset]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570928]  ? nvidia_frontend_unlocked_ioctl+0x42/0x50 [nvidia]
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570931]  ? do_vfs_ioctl+0x407/0x670
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570933]  ? __schedule+0x167/0x740
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570935]  ? ksys_ioctl+0x67/0x90
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570936]  ? schedule+0x42/0xb0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570938]  ? __x64_sys_ioctl+0x1a/0x20
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570941]  ? do_syscall_64+0x57/0x190
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570943]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570945] Mem-Info:
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570950] active_anon:2671662 inactive_anon:313824 isolated_anon:0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570950]  active_file:358546 inactive_file:464944 isolated_file:0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570950]  unevictable:12 dirty:12152 writeback:0 unstable:0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570950]  slab_reclaimable:64888 slab_unreclaimable:73117
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570950]  mapped:457864 shmem:387540 pagetables:21960 bounce:0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570950]  free:37952 free_pcp:498 free_cma:0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570953] Node 0 active_anon:10686648kB inactive_anon:1255296kB active_file:1434184kB inactive_file:1859776kB unevictable:48kB isolated(anon):0kB isolated(file):0kB mapped:1831456kB dirty:48608kB writeback:0kB shmem:1550160kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570954] Node 0 DMA free:15888kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15888kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570957] lowmem_reserve[]: 0 3418 15886 15886 15886
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570959] Node 0 DMA32 free:70812kB min:14528kB low:18160kB high:21792kB active_anon:1795464kB inactive_anon:279828kB active_file:721508kB inactive_file:518956kB unevictable:16kB writepending:14176kB present:3615412kB managed:3549388kB mlocked:16kB kernel_stack:3460kB pagetables:10800kB bounce:0kB free_pcp:860kB local_pcp:0kB free_cma:0kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570963] lowmem_reserve[]: 0 0 12468 12468 12468
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570965] Node 0 Normal free:65108kB min:52988kB low:66232kB high:79476kB active_anon:8890924kB inactive_anon:974408kB active_file:712376kB inactive_file:1340400kB unevictable:32kB writepending:34432kB present:13094400kB managed:12775296kB mlocked:32kB kernel_stack:16396kB pagetables:77040kB bounce:0kB free_pcp:1248kB local_pcp:0kB free_cma:0kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570968] lowmem_reserve[]: 0 0 0 0 0
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570970] Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15888kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570977] Node 0 DMA32: 16*4kB (U) 5842*8kB (U) 1271*16kB (UE) 132*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 71360kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570983] Node 0 Normal: 1542*4kB (UMEH) 3047*8kB (UMEH) 1955*16kB (UMEH) 126*32kB (UMH) 3*64kB (H) 2*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 66560kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570991] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570993] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570993] 1275749 total pagecache pages
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570995] 64601 pages in swap cache
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570996] Swap cache stats: add 1430974, delete 1366350, find 344947/510435
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570996] Free swap  = 0kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570997] Total swap = 2097148kB
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570997] 4181452 pages RAM
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570998] 0 pages HighMem/MovableOnly
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570998] 96309 pages reserved
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570998] 0 pages cma reserved
Nov 19 14:58:27 lorenzo-B450M-DS3H kernel: [18099.570999] 0 pages hwpoisoned
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: META_CURRENT_TIME used to choose focus window; focus window may not be correct.
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: Object Meta.BackgroundActor (0x5618f339c420), has been already deallocated - impossible to access it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs.
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: == Stack trace for context 0x5618f1b596e0 ==
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: #0   7ffee30ed500 b   resource:///org/gnome/gjs/modules/core/overrides/GObject.js:571 (330f04b6c40 @ 25)
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: #1   5618f4e43d28 i   /usr/share/gnome-shell/extensions/desktop-icons@csoriano/desktopGrid.js:209 (16f87789c088 @ 85)
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: #2   5618f4e43ca0 i   /usr/share/gnome-shell/extensions/desktop-icons@csoriano/desktopGrid.js:148 (16f877897e20 @ 12)
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: #3   5618f4e43c18 i   resource:///org/gnome/shell/ui/main.js:236 (330f04d5e98 @ 12)
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: JS ERROR: Error: Argument 'instance' (type interface) may not be null#012_init/GObject.Object.prototype.disconnect@resource:///org/gnome/gjs/modules/core/overrides/GObject.js:571:24#012_onDestroy@/usr/share/gnome-shell/extensions/desktop-icons@csoriano/desktopGrid.js:209:45#012_init/<@/usr/share/gnome-shell/extensions/desktop-icons@csoriano/desktopGrid.js:148:44#012_initializeUI/<@resource:///org/gnome/shell/ui/main.js:236:16
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: JS ERROR: TypeError: actor.get_meta_window(...) is null#012_destroyWindowDone@resource:///org/gnome/shell/ui/windowManager.js:1590:32#012onStopped@resource:///org/gnome/shell/ui/windowManager.js:1560:39#012_makeEaseCallback/<@resource:///org/gnome/shell/ui/environment.js:73:13#012_easeActor/<@resource:///org/gnome/shell/ui/environment.js:149:56#012_initializeUI/<@resource:///org/gnome/shell/ui/main.js:236:16
Nov 19 14:58:48 lorenzo-B450M-DS3H gnome-shell[1570]: message repeated 5 times: [ JS ERROR: TypeError: actor.get_meta_window(...) is null#012_destroyWindowDone@resource:///org/gnome/shell/ui/windowManager.js:1590:32#012onStopped@resource:///org/gnome/shell/ui/windowManager.js:1560:39#012_makeEaseCallback/<@resource:///org/gnome/shell/ui/environment.js:73:13#012_easeActor/<@resource:///org/gnome/shell/ui/environment.js:149:56#012_initializeUI/<@resource:///org/gnome/shell/ui/main.js:236:16]
Nov 19 14:58:48 lorenzo-B450M-DS3H kernel: [18120.969471] audit: type=1107 audit(1605794328.542:3215): pid=868 uid=103 auid=4294967295 ses=4294967295 msg='apparmor="DENIED" operation="dbus_signal"  bus="system" path="/org/freedesktop/NetworkManager" interface="org.freedesktop.NetworkManager" member="CheckPermissions" name=":1.11" mask="receive" pid=4784 label="snap.spotify.spotify" peer_pid=869 peer_label="unconfined"
Nov 19 14:58:48 lorenzo-B450M-DS3H kernel: [18120.969471]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Nov 19 14:58:57 lorenzo-B450M-DS3H gsd-media-keys[1707]: Failed to grab accelerators: GDBus.Error:org.freedesktop.DBus.Error.UnknownMethod: No such interface “org.gnome.Shell” on object at path /org/gnome/Shell
Nov 19 14:58:57 lorenzo-B450M-DS3H gnome-shell[10173]: IPDL protocol error: Handler returned error code!
Nov 19 14:58:57 lorenzo-B450M-DS3H gnome-shell[10173]: ###!!! [Parent][DispatchAsyncMessage] Error: PLayerTransaction::Msg_ReleaseLayer Processing error: message was deserialized, but the handler returned false (indicating failure)
Nov 19 14:58:57 lorenzo-B450M-DS3H gnome-shell[10173]: IPDL protocol error: Handler returned error code!
Nov 19 14:58:57 lorenzo-B450M-DS3H gnome-shell[10173]: ###!!! [Parent][DispatchAsyncMessage] Error: PLayerTransaction::Msg_ReleaseLayer Processing error: message was deserialized, but the handler returned false (indicating failure)
Nov 19 14:58:57 lorenzo-B450M-DS3H gnome-shell[15759]: current session already has an ibus-daemon.
Nov 19 14:58:58 lorenzo-B450M-DS3H gsd-media-keys[1707]: Failed to grab accelerators: GDBus.Error:org.freedesktop.DBus.Error.UnknownMethod: No such interface “org.gnome.Shell” on object at path /org/gnome/Shell
Nov 19 14:58:58 lorenzo-B450M-DS3H dbus-daemon[868]: [system] Activating via systemd: service name='org.freedesktop.GeoClue2' unit='geoclue.service' requested by ':1.258' (uid=1000 pid=1570 comm="/usr/bin/gnome-shell " label="unconfined")
Nov 19 14:58:58 lorenzo-B450M-DS3H systemd[1]: Starting Location Lookup Service...
Nov 19 14:58:58 lorenzo-B450M-DS3H kernel: [18130.680111] audit: type=1107 audit(1605794338.254:3216): pid=868 uid=103 auid=4294967295 ses=4294967295 msg='apparmor="DENIED" operation="dbus_signal"  bus="system" path="/org/freedesktop/NetworkManager" interface="org.freedesktop.NetworkManager" member="CheckPermissions" name=":1.11" mask="receive" pid=4784 label="snap.spotify.spotify" peer_pid=869 peer_label="unconfined"
Nov 19 14:58:58 lorenzo-B450M-DS3H kernel: [18130.680111]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[1570]: Telepathy is not available, chat integration will be disabled.
Nov 19 14:58:58 lorenzo-B450M-DS3H dbus-daemon[868]: [system] Successfully activated service 'org.freedesktop.GeoClue2'
Nov 19 14:58:58 lorenzo-B450M-DS3H systemd[1]: Started Location Lookup Service.
Nov 19 14:58:58 lorenzo-B450M-DS3H dbus-daemon[868]: [system] Activating via systemd: service name='org.freedesktop.PackageKit' unit='packagekit.service' requested by ':1.258' (uid=1000 pid=1570 comm="/usr/bin/gnome-shell " label="unconfined")
Nov 19 14:58:58 lorenzo-B450M-DS3H systemd[1]: Starting PackageKit Daemon...
Nov 19 14:58:58 lorenzo-B450M-DS3H PackageKit: daemon start
Nov 19 14:58:58 lorenzo-B450M-DS3H dbus-daemon[868]: [system] Successfully activated service 'org.freedesktop.PackageKit'
Nov 19 14:58:58 lorenzo-B450M-DS3H systemd[1]: Started PackageKit Daemon.
Nov 19 14:58:58 lorenzo-B450M-DS3H dbus-daemon[1375]: [session uid=1000 pid=1375] Activating service name='org.gnome.Shell.Notifications' requested by ':1.206' (uid=1000 pid=1570 comm="/usr/bin/gnome-shell " label="unconfined")
Nov 19 14:58:58 lorenzo-B450M-DS3H dbus-daemon[1375]: [session uid=1000 pid=1375] Successfully activated service 'org.gnome.Shell.Notifications'
Nov 19 14:58:58 lorenzo-B450M-DS3H NetworkManager[869]: <info>  [1605794338.6918] agent-manager: agent[7908de332b92b7d8,:1.258/org.gnome.Shell.NetworkAgent/1000]: agent registered
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: IPDL protocol error: Handler returned error code!
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: ###!!! [Parent][DispatchAsyncMessage] Error: PLayerTransaction::Msg_ReleaseLayer Processing error: message was deserialized, but the handler returned false (indicating failure)
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: IPDL protocol error: Handler returned error code!
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: ###!!! [Parent][DispatchAsyncMessage] Error: PLayerTransaction::Msg_ReleaseLayer Processing error: message was deserialized, but the handler returned false (indicating failure)
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: IPDL protocol error: Handler returned error code!
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: ###!!! [Parent][DispatchAsyncMessage] Error: PLayerTransaction::Msg_ReleaseLayer Processing error: message was deserialized, but the handler returned false (indicating failure)
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: IPDL protocol error: Handler returned error code!
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: ###!!! [Parent][DispatchAsyncMessage] Error: PLayerTransaction::Msg_ReleaseLayer Processing error: message was deserialized, but the handler returned false (indicating failure)
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: IPDL protocol error: Handler returned error code!
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[10173]: ###!!! [Parent][DispatchAsyncMessage] Error: PLayerTransaction::Msg_ReleaseLayer Processing error: message was deserialized, but the handler returned false (indicating failure)
Nov 19 14:58:58 lorenzo-B450M-DS3H gnome-shell[1570]: Error looking up permission: GDBus.Error:org.freedesktop.portal.Error.NotFound: No entry for geolocation
Nov 19 14:58:59 lorenzo-B450M-DS3H gsd-media-keys[1707]: Failed to grab accelerator for keybinding settings:rfkill
Nov 19 14:58:59 lorenzo-B450M-DS3H gsd-media-keys[1707]: Failed to grab accelerator for keybinding settings:playback-random
Nov 19 14:58:59 lorenzo-B450M-DS3H gsd-media-keys[1707]: Failed to grab accelerator for keybinding settings:hibernate
Nov 19 14:58:59 lorenzo-B450M-DS3H gsd-media-keys[1707]: Failed to grab accelerator for keybinding settings:playback-repeat
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 31 with keysym 31 (keycode a).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 32 with keysym 32 (keycode b).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 33 with keysym 33 (keycode c).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 34 with keysym 34 (keycode d).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 38 with keysym 38 (keycode 11).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 39 with keysym 39 (keycode 12).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 35 with keysym 35 (keycode e).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 36 with keysym 36 (keycode f).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Window manager warning: Overwriting existing binding of keysym 37 with keysym 37 (keycode 10).
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: GNOME Shell started at Thu Nov 19 2020 14:58:58 GMT+0100 (CET)
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Registering session with GDM
Nov 19 14:58:59 lorenzo-B450M-DS3H gnome-shell[1570]: Error registering session with GDM: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.gnome.DisplayManager was not provided by any .service files
Nov 19 14:59:01 lorenzo-B450M-DS3H kernel: [18134.399942] sysrq: Manual OOM execution

This basically says you’re out of memory, and perhaps that is not being handled gracefully. It probably means GPU memory, though GPU and system memory can be conflated here. You can use nvidia-smi to find the offending apps (Firefox is a pretty bad GPU memory hog for sure).

That’s not the case though. The weird thing is that I’m just loading data into RAM. I’ve now disabled pinned memory (pin_memory); I don’t know if that may have been the cause of this error.

The only thing I have in GPU memory is the model, so it looks like this all the time:

Thu Nov 19 17:29:38 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    Off  | 00000000:06:00.0  On |                  N/A |
| 38%   37C    P8     9W / 160W |   1440MiB /  5931MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     15862      G   /usr/lib/xorg/Xorg                158MiB |
|    0   N/A  N/A     16221      G   /usr/bin/gnome-shell              107MiB |
|    0   N/A  N/A     17855      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     19013      C   python3                          1127MiB |
|    0   N/A  N/A     19078      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     19796      G   .../debug.log --shared-files       35MiB |
+-----------------------------------------------------------------------------+
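
The process table above can be ranked by memory use with a short script. This is only a sketch: the regular expression assumes the column layout printed by the driver version shown above (other versions may format the table differently), and gpu_memory_by_process is a hypothetical helper name.

```python
import re

# Matches one row of the "Processes" table printed by plain `nvidia-smi`.
# The column layout is assumed to match driver 450.80.02 and may differ elsewhere.
ROW = re.compile(
    r"^\|\s+(?P<gpu>\d+)\s+\S+\s+\S+\s+(?P<pid>\d+)\s+(?P<type>\S+)\s+"
    r"(?P<name>.+?)\s+(?P<mib>\d+)MiB\s+\|\s*$"
)


def gpu_memory_by_process(smi_output):
    """Return (name, pid, MiB) tuples sorted by GPU memory, largest first."""
    rows = []
    for line in smi_output.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match["name"], int(match["pid"]), int(match["mib"])))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Process rows copied from the nvidia-smi output above.
table = """\
|    0   N/A  N/A     15862      G   /usr/lib/xorg/Xorg                158MiB |
|    0   N/A  N/A     16221      G   /usr/bin/gnome-shell              107MiB |
|    0   N/A  N/A     17855      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     19013      C   python3                          1127MiB |
|    0   N/A  N/A     19078      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     19796      G   .../debug.log --shared-files       35MiB |
"""

usage = gpu_memory_by_process(table)
total_mib = sum(mib for _, _, mib in usage)
for name, pid, mib in usage:
    print(f"{mib:>6} MiB  pid {pid:<6} {name}")
print(f"{total_mib:>6} MiB  total across listed processes")
```

On a live system the text could come from running nvidia-smi through subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout.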

Then it is about ram:

If I read it right, this means 16 GB of RAM plus 2 GB of swap. With so little swap, your system can’t deal with RAM usage spikes.
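
For reference, the figures can be recomputed from the kernel Mem-Info dump posted earlier in the thread. The sketch below assumes 4 kB pages, which is consistent with the dump itself (2671662 active_anon pages are reported as 10686648 kB); the variable names are only illustrative.

```python
PAGE_SIZE_KB = 4  # assumption: 4 kB pages, consistent with the dump

# Values copied from the kernel "Mem-Info" dump in the log.
ram_pages = 4181452       # "4181452 pages RAM"
reserved_pages = 96309    # "96309 pages reserved"
free_pages = 37952        # "free:37952"
total_swap_kb = 2097148   # "Total swap = 2097148kB"
free_swap_kb = 0          # "Free swap  = 0kB"

KB_PER_GIB = 1024 * 1024

ram_gib = ram_pages * PAGE_SIZE_KB / KB_PER_GIB
usable_gib = (ram_pages - reserved_pages) * PAGE_SIZE_KB / KB_PER_GIB
free_mib = free_pages * PAGE_SIZE_KB / 1024
swap_gib = total_swap_kb / KB_PER_GIB

print(f"RAM:        {ram_gib:.2f} GiB ({usable_gib:.2f} GiB after reserved pages)")
print(f"Free RAM:   {free_mib:.0f} MiB at the time of the failure")
print(f"Swap:       {swap_gib:.2f} GiB total, {free_swap_kb} kB free")
```

So at the moment of the allocation failure, the machine had roughly 16 GiB of RAM with only about 148 MiB free, and its 2 GiB of swap was completely used.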

Still, I feel like this is just a symptom of something else being wrong. On average I have 5 GB of free RAM and 1.8 GB of free swap while training with 3 workers.
It doesn’t really make sense to me that with 10 GB used I get RAM spikes of 7+ GB while loading the data, especially since the data items are all more or less the same size. I’m also noticing that every now and then Python seems to stop processing (very low CPU usage).
I suppose that if it stays in this state for too long, my workers time out.
I’m ordering new RAM; I’ll update in a few days on whether this solves the problem.

Nah, a 7 GB reserve is not a lot nowadays if you also consider (1) fragmentation and overallocation, (2) apps with garbage collection, and (3) the modern web and other unreasonably bulky apps. I’d create a 16 GB swap file, as there is a lot of discardable stuff in memory.
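
To watch how much headroom is actually left during training, one option is to sample /proc/meminfo periodically. The sketch below parses text in that format; parse_meminfo and headroom_gib are hypothetical helper names, and the sample values are illustrative numbers chosen to match the roughly 5 GB of free RAM and 1.8 GB of free swap mentioned above, not measurements.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def headroom_gib(meminfo):
    """Memory the kernel estimates is available, plus free swap, in GiB."""
    return (meminfo["MemAvailable"] + meminfo["SwapFree"]) / (1024 * 1024)


# Illustrative values only (assumed, not taken from the machine in this thread).
sample = """\
MemTotal:       16340572 kB
MemAvailable:    5242880 kB
SwapTotal:       2097148 kB
SwapFree:        1887436 kB
"""

info = parse_meminfo(sample)
print(f"Headroom: {headroom_gib(info):.1f} GiB")
```

On a Linux machine the input would be the contents of /proc/meminfo, read once per logging interval.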

OK, got it: buy more RAM and create a bigger swap file.
I hope this will be all for this thread. Thanks a lot for your help.

To troubleshoot the freeze issues, check the current status of your computer, and follow one of the following methods.

For the computer that’s still running in a frozen state
If the physical computer or the virtual machine is still freezing, use one or more of the following methods for troubleshooting:

Try to access the computer through a remote desktop connection.
Use a domain account or local administrator account to sign in to the computer with the hardware manufacturer’s remote access solution. For example, Dell Remote Access Card (DRAC), HP Integrated Lights-Out (iLO), or IBM Remote Supervisor Adapter (RSA).
Test ping to the computer. Look for dropped packets and high network latency.
Access administrative shares, for example \\ServerName\c$.
Press Ctrl+Alt+Delete and check the response.
Try to use Windows remote administration tools. For example, Computer Management, Server Manager, and Wmimgmt.msc.

Regards,
Rachel Gomez