How to Overcome EEG Data Overfitting in Deep Learning Models?

Hi everyone,

I’m working on a deep learning model to analyze 4D EEG data (Batch × Frequency × Channels × Time), where each input is a trial's power values normalized to baseline and expressed as percentages. The trials come from multiple subjects, and I treat them as independent samples. I use stratified shuffling to split the trials into training, validation, and test sets.
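
For concreteness, this is roughly how the split looks right now (the array names, shapes, and label counts below are placeholders, not my actual pipeline):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: trials x frequency bands x channels x time samples
n_trials, n_freqs, n_channels, n_times = 2000, 5, 32, 256
X = np.random.randn(n_trials, n_freqs, n_channels, n_times).astype(np.float32)
y = np.random.randint(0, 2, size=n_trials)  # class labels per trial

# Stratified shuffling over *trials*, ignoring which subject each trial came from
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, stratify=y_trainval, random_state=0)
```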

I’ve experimented with several architectures. The best-performing one so far is based on a Temporal Convolutional Network (using a third-party PyTorch implementation), which outperformed Conv1D+LSTM stacks in capturing single-trial EEG dynamics. However, I’m consistently running into overfitting issues.
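
To show how the 4D tensors enter the temporal model: the real model is the third-party TCN, so the `TinyTCN` class below is only a simplified stand-in that sketches the reshape-and-convolve idea, not my actual architecture:

```python
import torch
import torch.nn as nn

class TinyTCN(nn.Module):
    def __init__(self, in_ch, n_classes, hidden=64):
        super().__init__()
        # Two dilated Conv1d layers as a simplified stand-in for a full TCN residual stack
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, hidden, kernel_size=3, padding=1, dilation=1),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):              # x: (batch, freq, channels, time)
        b, f, c, t = x.shape
        x = x.reshape(b, f * c, t)     # fold freq x channel into the Conv1d channel dimension
        x = self.net(x).squeeze(-1)    # (batch, hidden)
        return self.head(x)

model = TinyTCN(in_ch=5 * 32, n_classes=2)
logits = model(torch.randn(8, 5, 32, 256))   # -> (8, 2)
```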

I’ve tried a range of regularization and augmentation techniques:

  • Dropout, weight decay, and weight perturbations
  • Data augmentations such as Gaussian noise, time masking, and other published methods (see the sketch after this list)
  • Synthetic data generation via GANs, though the generator's output quality was poor and didn’t help much
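
For reference, the noise and time-masking augmentations look roughly like this (the function name and parameter values are illustrative, not tuned):

```python
import torch

def augment(x, noise_std=0.1, max_mask=20):
    """x: (batch, freq, channels, time) tensor of trial-wise power values."""
    # Additive Gaussian noise on the whole tensor
    x = x + noise_std * torch.randn_like(x)
    # Time masking: zero out a random contiguous window along the time axis
    t = x.shape[-1]
    width = int(torch.randint(1, max_mask + 1, (1,)))
    start = int(torch.randint(0, t - width + 1, (1,)))
    x[..., start:start + width] = 0.0
    return x

augmented = augment(torch.randn(8, 5, 32, 256))
```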

Despite these efforts, my model still overfits. Has anyone faced similar challenges when working with trial-wise EEG power data? I’d really appreciate any advice on:

  • Architectures that worked well for you
  • Data handling strategies, e.g., subject-wise splits (see the sketch after this list) or alternative normalization approaches
  • Any augmentation techniques you found particularly helpful
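
On the data-handling point, this is the kind of subject-wise split I have in mind but haven't tried yet, sketched with sklearn's GroupShuffleSplit (`subject_ids` is a hypothetical per-trial array):

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

X = np.random.randn(2000, 5, 32, 256)               # trials x freq x channels x time
y = np.random.randint(0, 2, size=2000)              # class labels per trial
subject_ids = np.random.randint(0, 20, size=2000)   # which subject each trial came from

# Split by subject so no subject appears in both train and test
gss = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(gss.split(X, y, groups=subject_ids))
assert set(subject_ids[train_idx]).isdisjoint(subject_ids[test_idx])
```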

I’m open to any suggestions. Honestly, I’m a bit stuck and would really value the community’s input.

Thanks in advance!