Captum with BERT on AWS Inferentia

Hi All,

I need to compute gradients for a BERT model running on AWS Inferentia, and I think I also need access to the hidden layers to do that. I'm currently stuck because I can't find any documentation or examples that cover Inferentia. There are some guides for non-Inferentia setups, but they do not work in this case.

One of the things I'm missing is how to get the positional embeddings.
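
To make the question concrete, this is roughly the computation I mean, sketched on plain PyTorch on CPU. The `TinyEncoder` class and its method names are hypothetical stand-ins, not BERT and not anything Inferentia-specific; in a Hugging Face `BertModel` the corresponding lookup table would be `model.embeddings.position_embeddings`. I don't know whether any of this carries over to a model compiled for Inferentia.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyEncoder(nn.Module):
    """Hypothetical toy stand-in for BERT: word + position embeddings, then a classifier."""

    def __init__(self, vocab_size=50, max_len=16, hidden=8, num_labels=2):
        super().__init__()
        self.word_embeddings = nn.Embedding(vocab_size, hidden)
        self.position_embeddings = nn.Embedding(max_len, hidden)
        self.classifier = nn.Linear(hidden, num_labels)

    def embed(self, input_ids):
        # Position ids are simply 0..seq_len-1, looked up in their own table.
        positions = torch.arange(input_ids.size(1)).unsqueeze(0)
        pos = self.position_embeddings(positions)
        return self.word_embeddings(input_ids) + pos, pos

    def forward_from_embeddings(self, embeddings):
        # Taking embeddings as input lets autograd differentiate with respect to them.
        return self.classifier(torch.tanh(embeddings).mean(dim=1))


model = TinyEncoder().eval()
input_ids = torch.tensor([[3, 7, 11, 2]])

# Positional embeddings for this sequence, and the summed input embeddings.
embeddings, pos = model.embed(input_ids)

# Detach so the embeddings become a leaf tensor that accumulates its own gradient.
embeddings = embeddings.detach().requires_grad_(True)

logits = model.forward_from_embeddings(embeddings)
target = 1  # class index to explain (arbitrary choice for this sketch)
logits[0, target].backward()

grads = embeddings.grad                        # shape: (batch, seq_len, hidden)
token_scores = grads.norm(dim=-1).squeeze(0)   # one saliency score per token
```

What I can't work out is the Inferentia equivalent of the `backward()` call and of reading out the positional embeddings.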

Or maybe there is another way to do this using Captum.
Please share some code snippets to help me out.

Thanks in advance,
Ajay