The error is:
[ 1361.908162] NVRM: GPU at PCI:0000:d8:00: GPU-d1a5f877-65cb-a62e-4192-ae05bb68fc48
[ 1361.908175] NVRM: GPU Board Serial Number: 1560121001476
[ 1361.908178] NVRM: Xid (PCI:0000:d8:00): 79, pid=0, GPU has fallen off the bus.
[ 1361.908186] NVRM: GPU 0000:d8:00.0: GPU has fallen off the bus.
[ 1361.908191] NVRM: GPU 0000:d8:00.0: GPU serial number is 1560121001476.
[ 1361.908210] NVRM: A GPU crash dump has been created. If possible, please run
NVRM: nvidia-bug-report.sh as root to collect this data before
NVRM: the NVIDIA kernel module is unloaded.
Based on this table, Xid 79 could be caused by any of the following:
- HW error
- Driver issue
- System Memory Corruption
- Bus Error
- Thermal Issue
A while ago, another user hit the same Xid and found that the power cable wasn’t properly plugged into the GPU, so checking the GPU's power connection is a good place to start.
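
If you want to confirm whether the Xid comes back after reseating the cable, a small script can pull Xid events out of the kernel log. This is only a minimal sketch: the names `XID_PATTERN` and `find_xid_events` are made up for illustration, and the regex assumes the `NVRM: Xid (...)` line format shown in the log above, which may differ on other driver versions.

```python
import re

# Assumed line format, taken from the log in this post:
#   NVRM: Xid (PCI:0000:d8:00): 79, pid=0, GPU has fallen off the bus.
XID_PATTERN = re.compile(
    r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+),\s*(?P<detail>.*)"
)


def find_xid_events(log_text):
    """Return a list of (pci_address, xid_code, detail) tuples found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            events.append(
                (match.group("pci"), int(match.group("xid")), match.group("detail").strip())
            )
    return events


if __name__ == "__main__":
    # Sample line copied from the log above; in practice you could pass in
    # the saved output of `dmesg` instead.
    sample = (
        "[ 1361.908178] NVRM: Xid (PCI:0000:d8:00): 79, pid=0, "
        "GPU has fallen off the bus."
    )
    for pci, xid, detail in find_xid_events(sample):
        print(f"{pci}: Xid {xid} ({detail})")
```

Save the output of `dmesg` to a file, read it in, and pass the text to `find_xid_events`. If Xid 79 stops appearing after you fix the power connection, that points to the cable rather than the driver.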