Hi everyone,
A few questions here.
At what epoch do you typically see masked autoencoders start producing reconstructions that resemble the input during pretraining? I've experimented with pretraining on the BTCV dataset and noticed that after ~800 epochs the reconstructions look really no different from the first 20 or so.
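For reference, this is roughly how I'm inspecting the reconstructions (a minimal sketch assuming the interface of the official MAE code, where the forward pass returns `loss, pred, mask` and `unpatchify` maps patch predictions back to image space; the function name is my own):

```python
import torch
import matplotlib.pyplot as plt

@torch.no_grad()
def show_reconstruction(model, imgs, mask_ratio=0.75):
    # Assumes the official MAE interface: forward returns (loss, pred, mask),
    # with pred of shape (N, L, p*p*C) and mask of shape (N, L), 1 = masked.
    model.eval()
    loss, pred, mask = model(imgs, mask_ratio=mask_ratio)
    recon = model.unpatchify(pred)                        # (N, C, H, W)
    # Expand the patch mask to pixel space so we can paste the visible
    # patches back in and only show predictions where input was masked.
    mask = mask.unsqueeze(-1).repeat(1, 1, pred.shape[-1])
    mask = model.unpatchify(mask)                         # 1 = masked, 0 = visible
    pasted = imgs * (1 - mask) + recon * mask
    for title, im in [("input", imgs), ("reconstruction", pasted)]:
        plt.figure()
        plt.title(title)
        plt.imshow(im[0, 0].cpu(), cmap="gray")           # show one 2D slice/channel
        plt.axis("off")
    plt.show()
```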
My loss never really drops below 0.9 either. I'm using a low learning rate and weight decay, following the hyperparameters of the original paper. Is this common?
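For context, this is approximately the optimizer setup I'm following (a sketch of the MAE paper's recipe of AdamW with betas=(0.9, 0.95), weight decay 0.05, and a base lr of 1.5e-4 scaled by batch size / 256; the `batch_size` value here is illustrative, mine is much smaller than the paper's):

```python
import torch

def build_optimizer(model, batch_size):
    # MAE-paper-style recipe (He et al., 2022): linear lr scaling with
    # batch size, AdamW with weight decay 0.05. Note that with a small
    # batch size the scaled lr comes out very small.
    base_lr = 1.5e-4
    lr = base_lr * batch_size / 256
    return torch.optim.AdamW(model.parameters(), lr=lr,
                             betas=(0.9, 0.95), weight_decay=0.05)
```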
Also, is it best to move on to the downstream task (in my case, segmentation) only once the reconstructions look decent?