Allennlp.org website has broken links to tutorials

This tutorials page on the allennlp site has several broken links. Near the bottom of the page, it links to:

Where are the best alternative locations for that information?

Those tutorials have all been replaced by guide.allennlp.org. If you look closely, you'll notice we don't actually link to the old tutorials anymore. They'll be removed from the website soon, possibly with a redirect to the guide.

Thank you, @mattg

Where can we find the tutorial on training ELMo embeddings with our own corpus?

Is this what you’re looking for? Whatever code is there might need to be updated to work with version 1.0.

Yes, thank you.

(I hope that tutorial will soon be updated and migrated to the guide.allennlp.org site.)

Hi @mattg, can you please clarify something:

This guide that you originally linked to covers only using a pre-trained ELMo model, correct?

But if we want to retrain from scratch on our own corpus/dataset, then we first need to follow this guide instead. Is that correct?

Yes, that’s correct. There’s also this repository, which was used to train the initial version of ELMo (not the transformer version).
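For concreteness, here is a minimal sketch of what "using a pre-trained ELMo model" looks like in AllenNLP. I'm assuming the `allennlp.modules.elmo` API; the options/weights URLs below are the ones commonly used for the standard pre-trained model and may have moved, and you would swap in your own options/weights files if you train from scratch with bilm-tf:

```python
from allennlp.modules.elmo import Elmo, batch_to_ids

# Standard pre-trained ELMo files (assumed URLs); replace with your own
# options.json / weights.hdf5 if you train a biLM on a custom corpus.
options_file = ("https://allennlp.s3.amazonaws.com/elmo/2x4096_512_2048cnn_2xhighway/"
                "elmo_2x4096_512_2048cnn_2xhighway_options.json")
weight_file = ("https://allennlp.s3.amazonaws.com/elmo/2x4096_512_2048cnn_2xhighway/"
               "elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5")

# num_output_representations controls how many learned weighted averages
# of the biLM layers are returned (e.g. one per downstream task head).
elmo = Elmo(options_file, weight_file, num_output_representations=1, dropout=0.0)

sentences = [["The", "quick", "brown", "fox", "."], ["Another", "sentence"]]
character_ids = batch_to_ids(sentences)  # shape: (batch, max_len, 50) character ids

output = elmo(character_ids)
# output["elmo_representations"] is a list of tensors, each of shape
# (batch, max_len, 1024); output["mask"] marks the real (non-padding) tokens.
embeddings = output["elmo_representations"][0]
print(embeddings.shape)
```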

Thanks. I have seen that bilm-tf repo, but that is the TensorFlow implementation rather than an AllenNLP/PyTorch implementation, correct?

There is no PyTorch implementation of the original ELMo bi-LM. It was done in TensorFlow, before allennlp existed.

Follow-up question regarding this passage from this tutorial:

This document describes how to train and use a transformer-based version of ELMo with allennlp. The model is a port of the one described in Dissecting Contextual Word Embeddings: Architecture and Representation by Peters et al.

Is there any paper or detailed explanation of this port to a transformer-based ELMo? There are good resources explaining the Transformer architecture in general (the example use case in the Attention Is All You Need paper is machine translation), but the original ELMo paper by Peters et al. uses a bidirectional LSTM.

Is there an in-depth explanation of the transformer-based version of ELMo somewhere?

The only explanation I know of is the one in the paper you linked to.
