Documentation and examples for using allennlp-models

I am finding the documentation for actually using (not just installing) the “officially supported” allennlp-models to be quite sparse.

The README on GitHub covers only installation.

The “docs” page also seems to focus mainly on installation. (Also, the link to the stable version leads to a 404 page.)

Clicking through to some of the individual models, I find that these pages are simply web-formatted versions of the docstrings for each class. What I am really looking for are examples, guidance, and best practices for using them.

Specifically, I am curious to learn:

  • Training and inference with a bidirectional LSTM plus CRF for an NER task, ideally with examples that use a custom dataset.
  • What is the difference between tagging/models/ (in the “models” repo) and predictors/ (in the main repo)? When should I use one over the other? How do they integrate, if at all?

I fell into this rabbit hole of confusion by going through the Common Architectures section of the “main guide”. It includes this paragraph:

A CRF sequential tagger is implemented as the CrfTagger module. It takes a sequence of indexed tokens, embeds them using a TextFieldEmbedder, encodes them with a Seq2SeqEncoder, applies a TimeDistributed linear layer, and finally applies a ConditionalRandomField module.

The link to the CrfTagger module is broken. Also, should it not say “CrfTagger model” rather than “module”?

I like AllenNLP overall and think it has tremendous potential. However, the documentation feels disjointed, which leads to confusion and makes the learning curve unnecessarily steep. I’d like to help contribute on that front, but I feel as if I’m in a catch-22: I need a clean and complete set of docs to understand this project well enough to do so.

Thanks for the feedback! Trying to address your specific questions (let me know if I missed something):

  • To get usage information, the best place right now is probably the “Usage” tab for particular models in our demo. There isn’t a training command for that particular model there, because we can’t release the data, but if you have the data, the command is similar to the training commands for other models that you can find in the demo. The quick start section of the guide should also give a decent walkthrough of how to train a model. It uses a simple classifier rather than an NER model, but the process is the same for NER.
  • For the difference between a model and a predictor, the best place to look is currently here.
  • Thanks for pointing out the bugs in the guide; I have fixed them.
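
On the first point, here is a rough sketch of what a BiLSTM-CRF setup on your own data might look like. Treat it as a starting point under assumptions, not as the official recipe. It assumes the "sequence_tagging" dataset reader from the main repo (one sentence per line, written as TOKEN###TAG pairs) and the "crf_tagger" model from allennlp-models. The config keys are written from memory of the 1.x/2.x config format, so please check them against your installed version. The toy sentences, file names, and the two helper functions are made up for illustration.

```python
import json
from pathlib import Path

# Hypothetical toy data: one list of (token, BIO tag) pairs per sentence.
SENTENCES = [
    [("Ada", "B-PER"), ("Lovelace", "I-PER"), ("visited", "O"), ("London", "B-LOC")],
    [("AllenNLP", "B-ORG"), ("is", "O"), ("open", "O"), ("source", "O")],
]


def write_tagged_file(sentences, path, delimiter="###"):
    """Write one sentence per line as space-separated TOKEN###TAG pairs."""
    lines = [
        " ".join(f"{token}{delimiter}{tag}" for token, tag in sentence)
        for sentence in sentences
    ]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def build_config(train_path, dev_path, embedding_dim=50, hidden_size=100):
    """Build a training config as a plain dict (assumed key names, see above)."""
    return {
        "dataset_reader": {"type": "sequence_tagging", "word_tag_delimiter": "###"},
        "train_data_path": str(train_path),
        "validation_data_path": str(dev_path),
        "model": {
            "type": "crf_tagger",
            "label_encoding": "BIO",
            "calculate_span_f1": True,
            # Embeds the indexed tokens (the TextFieldEmbedder step).
            "text_field_embedder": {
                "token_embedders": {
                    "tokens": {"type": "embedding", "embedding_dim": embedding_dim}
                }
            },
            # Bidirectional LSTM as the Seq2SeqEncoder; its input size must
            # match the embedding dimension above.
            "encoder": {
                "type": "lstm",
                "input_size": embedding_dim,
                "hidden_size": hidden_size,
                "bidirectional": True,
            },
        },
        "data_loader": {"batch_size": 32, "shuffle": True},
        "trainer": {"optimizer": {"type": "adam"}, "num_epochs": 10},
    }


if __name__ == "__main__":
    # Writes the toy data and the config into the current directory.
    write_tagged_file(SENTENCES, "train.txt")
    write_tagged_file(SENTENCES, "dev.txt")
    config = build_config("train.txt", "dev.txt")
    Path("ner_config.json").write_text(json.dumps(config, indent=2), encoding="utf-8")
```

With allennlp and allennlp-models installed, something like `allennlp train ner_config.json -s output_dir` should then train the model, and `allennlp predict` can be pointed at the resulting model archive plus an input file for inference. A plain JSON file is enough here because the config format is a superset of JSON.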

We are always happy to take PRs to make things better, especially for our documentation.