BERT set to eval mode when requires_grad=False


I was wondering if there is a way to switch the dropout and layer-norm layers in BERT to eval mode during training when requires_grad is set to False for PretrainedBert.

The problem I see is that the allennlp trainer loop calls model.train() on the whole model, which puts BERT back into train mode even if I modify the PretrainedBert initialization code to set it to eval mode.

My current workaround is to set those layers to eval mode inside the forward call of the model.
Is there a better way?
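
For reference, here is roughly what that workaround looks like. This is only a minimal sketch in plain PyTorch: FrozenEncoderModel, hidden_dim, num_labels and the small nn.Sequential stand-in for BERT are made-up names for illustration, not anything from allennlp.

```python
import torch
from torch import nn


class FrozenEncoderModel(nn.Module):
    """Hypothetical model; `encoder` stands in for the pretrained BERT module."""

    def __init__(self, encoder: nn.Module, hidden_dim: int, num_labels: int):
        super().__init__()
        self.encoder = encoder
        # Freeze the encoder weights (the requires_grad=False case).
        for param in self.encoder.parameters():
            param.requires_grad = False
        self.classifier = nn.Linear(hidden_dim, num_labels)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # The trainer has already called model.train(), so the encoder is
        # back in train mode here; force it to eval mode on every call.
        self.encoder.eval()
        encoded = self.encoder(inputs)
        return self.classifier(encoded)


# Stand-in encoder with dropout and layer norm (assumption: not real BERT).
encoder = nn.Sequential(nn.Linear(8, 8), nn.Dropout(0.5), nn.LayerNorm(8))
model = FrozenEncoderModel(encoder, hidden_dim=8, num_labels=2)

model.train()  # what the trainer loop does
logits = model(torch.randn(4, 8))
```

It works, but the mode switch is hidden inside forward and has to be repeated in every model that wraps a frozen encoder.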

Hmm, sounds like we’d want to override model.train() to handle this properly. It also sounds a bit messy to get correct for everything, but if you can think of a clean solution, I think this is definitely a problem that we’d want to fix in the library. Feel free to open an issue about this in the repo, and I’ll mark it as “contributions welcome”.

You should be able to override train() on your own model class, also. That would be a good way to test this to see if it’s possible to do it in a clean way that will generalize to other models. If you can, then a PR to add it to the base model class would be lovely.
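
A rough sketch of what that override could look like, assuming plain PyTorch. FrozenAwareModel and the stand-in encoder are hypothetical names for illustration, not existing allennlp classes, and the "all parameters frozen" check is just one possible rule for deciding which submodules stay in eval mode.

```python
from torch import nn


class FrozenAwareModel(nn.Module):
    """Hypothetical model whose train() keeps a fully frozen encoder in eval mode."""

    def __init__(self, encoder: nn.Module, hidden_dim: int, num_labels: int):
        super().__init__()
        self.encoder = encoder
        self.classifier = nn.Linear(hidden_dim, num_labels)

    def train(self, mode: bool = True):
        # Let nn.Module set the mode on this module and all children first.
        super().train(mode)
        # Then undo it for the encoder if none of its parameters are trainable.
        if not any(p.requires_grad for p in self.encoder.parameters()):
            self.encoder.eval()
        return self


# Stand-in encoder with dropout and layer norm (assumption: not real BERT).
encoder = nn.Sequential(nn.Linear(8, 8), nn.Dropout(0.5), nn.LayerNorm(8))
for param in encoder.parameters():
    param.requires_grad = False

model = FrozenAwareModel(encoder, hidden_dim=8, num_labels=2)
model.train()  # the trainer loop's call now leaves the encoder in eval mode
```

Since nn.Module.eval() is implemented as train(False), overriding train() alone covers both directions, and the trainer loop does not need to change.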