Creating a Predictor from a PyTorch model

Hi, I’m new to AllenNLP, so I’m asking for help with creating a Predictor so that I can use Interpret.
I have a pretrained PyTorch model based on Hugging Face’s Transformers (I use a pretrained encoder). I also have a torch Dataset object that adds special tokens in a specific way (I have 3 sentences instead of 2). As I understand from some tutorials, I need to create an AllenNLP DatasetReader and specify its token_indexers (which have to be created from PretrainedTransformerIndexer). Is that correct?

But since I add special tokens in a specific way, do I need to override the encode_plus method of the transformer’s tokenizer, given that it is used in the tokenize method of PretrainedTransformerTokenizer?

It sounds like the answer to your question is probably yes. The safest way to check is to write a simple test for your dataset reader that confirms it tokenizes the text and adds the special tokens the way you expect. If that turns out to be impossible because of assumptions we make in PretrainedTransformerTokenizer, then open an issue on GitHub.
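
To make the "simple test" idea concrete, here is a minimal sketch in plain Python of what such a check might look like. It does not call AllenNLP or Transformers at all. The helper names (join_three_segments, split_on_separator), the BERT-style [CLS]/[SEP] strings, and the layout "[CLS] first [SEP] second [SEP] third [SEP]" are all assumptions for illustration, since the thread does not say how the three sentences are actually joined. Swap in your tokenizer's real special tokens and your real layout.

```python
from typing import List, Sequence

# Assumed BERT-style special tokens; replace with your tokenizer's own.
CLS, SEP = "[CLS]", "[SEP]"


def join_three_segments(
    first: Sequence[str], second: Sequence[str], third: Sequence[str]
) -> List[str]:
    """Hypothetical layout: [CLS] first [SEP] second [SEP] third [SEP]."""
    return [CLS, *first, SEP, *second, SEP, *third, SEP]


def split_on_separator(tokens: Sequence[str]) -> List[List[str]]:
    """Recover the segments from a token sequence so a test can compare them."""
    if not tokens or tokens[0] != CLS:
        raise ValueError("sequence must start with the CLS token")
    if tokens[-1] != SEP:
        raise ValueError("sequence must end with the SEP token")
    segments: List[List[str]] = [[]]
    for token in tokens[1:-1]:
        if token == SEP:
            segments.append([])  # a separator closes the current segment
        else:
            segments[-1].append(token)
    return segments


def test_three_segment_layout() -> None:
    # In a real test, `tokens` would be the token texts taken from an
    # instance produced by your dataset reader, not from the helper above.
    tokens = join_three_segments(["a", "b"], ["c"], ["d", "e"])
    assert tokens == ["[CLS]", "a", "b", "[SEP]", "c", "[SEP]", "d", "e", "[SEP]"]
    assert split_on_separator(tokens) == [["a", "b"], ["c"], ["d", "e"]]


test_three_segment_layout()
```

The idea is to compare the token texts your dataset reader emits against what your existing torch Dataset produces for the same input. If the two disagree, that tells you the tokenizer's default handling of special tokens does not match your three-sentence format.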