I am finding the documentation for actually using (not just installing) the “officially supported” allennlp-models to be quite sparse.
The README on GitHub covers only installation.
When I click through to some of the individual models, the pages are simply web-formatted versions of the docstrings for each class. What I am really looking for are examples, guidance, and best practices for using them.
Specifically, I am curious to learn:
- How to train and run inference with a bidirectional LSTM with a CRF for an NER task, ideally with examples that use a custom dataset.
- What is the difference between tagging/models/crf_tagger.py (in the “models” repo) and predictors/sentence_tagger.py (in the main repo)? When should I use one over the other? How do they integrate, if at all?
I fell into this rabbit hole of confusion by going through the Common Architectures section of the “main guide”. It includes this paragraph:
> A CRF sequential tagger is implemented as the `CrfTagger` module. It takes a sequence of indexed tokens, embeds them using a `TextFieldEmbedder`, encodes them with a `Seq2SeqEncoder`, applies a `TimeDistributed` linear layer, and finally applies a […]

The link to `CrfTagger` module is broken. Also, should it not be `CrfTagger` model, rather than module?
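For reference, here is my current mental model of the final decoding step in a CRF tagger, written as a plain-Python sketch rather than AllenNLP code. It assumes the linear layer yields one score per tag per token (the "emissions") and that the CRF contributes a score for each tag-to-tag transition. The function name `decode_best_tags` and all the toy scores are made up for illustration and are not taken from the library.

```python
from typing import Dict, List, Tuple


def decode_best_tags(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Viterbi search for the highest-scoring tag sequence.

    emissions[i][tag] is the score of `tag` at token i.
    transitions[(prev, cur)] is the score of moving from `prev` to `cur`;
    pairs that are not listed score 0.0.
    """
    if not emissions:
        return []

    # best[i][tag] = best total score of any path that ends in `tag` at token i
    best: List[Dict[str, float]] = [{tag: emissions[0][tag] for tag in tags}]
    backpointers: List[Dict[str, str]] = []

    for scores in emissions[1:]:
        current: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for tag in tags:
            # Pick the previous tag that gives the best score when followed by `tag`.
            prev_tag = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, tag), 0.0),
            )
            current[tag] = (
                best[-1][prev_tag]
                + transitions.get((prev_tag, tag), 0.0)
                + scores[tag]
            )
            pointers[tag] = prev_tag
        best.append(current)
        backpointers.append(pointers)

    # Start from the best final tag and follow the backpointers to the first token.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example (hypothetical scores): heavily penalise "O" followed by "I-PER".
tags = ["O", "B-PER", "I-PER"]
transitions = {("O", "I-PER"): -100.0}
emissions = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.5},  # "John"
    {"O": 0.2, "B-PER": 0.3, "I-PER": 1.5},  # "Smith"
    {"O": 2.0, "B-PER": 0.1, "I-PER": 0.4},  # "sleeps"
]
print(decode_best_tags(emissions, transitions, tags))  # ['B-PER', 'I-PER', 'O']
```

If this picture is right, the transition scores are what separate a CRF tagger from choosing the best tag for each token independently: in the toy setup, a sequence that goes straight from "O" to "I-PER" is scored so low that the decoder avoids it.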
I like AllenNLP overall and think it has tremendous potential. However, the documentation feels disjointed, which causes confusion and makes the learning curve unnecessarily steep. I’d like to help contribute on that front, but I feel as if I’m in a catch-22: I need a clean and complete set of docs to understand this project well enough to contribute.