Speeding up epochs for test runs

Say I added some code that runs every epoch and I want to test/debug it. An epoch takes a long time to run, so I want to shorten it. I’m willing to give up “actual” learning since this is only a test run.
I thought of taking a subset of my data, but that breaks a couple of places in my code, so I’d rather not.
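To make the subset idea concrete: a less invasive variant might be to leave the dataset alone and just stop each epoch after a few batches. This is only a minimal sketch, assuming the loader is any iterable of batches; `run_epoch`, `train_step`, `max_batches`, and `fake_loader` are hypothetical names, not from my actual code, and I haven't checked whether this avoids the places that break.

```python
from itertools import islice


def run_epoch(loader, train_step, max_batches=None):
    """Call train_step on each batch; stop early if max_batches is set."""
    seen = 0
    # islice(loader, None) iterates the whole loader, so None means "no cap".
    for batch in islice(loader, max_batches):
        train_step(batch)
        seen += 1
    return seen


# Stand-in for a real data loader: 1000 tiny "batches" (hypothetical data).
fake_loader = [[i, i + 1] for i in range(1000)]

processed = []
n = run_epoch(fake_loader, processed.append, max_batches=3)
print(n, processed)
```

The dataset and loader keep their full size here, so anything that depends on `len(dataset)` sees the usual value; only the loop ends early.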
I thought of skipping the backprop phase (loss.backward()), but somehow that takes up extra memory and exceeds my GPU’s limit.
What other options do I have?