Question about calculating n-gram probabilities

Hi!

I am new to AllenNLP. Is there a class that would let me calculate the log probability of an n-gram given a particular corpus?

Cheers,
Andrew

Hi @andrewfr, I don’t believe we have that functionality. We’re mostly focused on deep models. Perhaps http://www.nltk.org/_modules/nltk/model/ngram.html could be of use?
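For illustration, here is a rough sketch in plain Python of the quantity being asked about: the log probability of an n-gram's last token given its preceding tokens, estimated from counts over a corpus. This is not AllenNLP or NLTK code. The function names (`train_ngram_counts`, `ngram_logprob`), the `<s>`/`</s>` padding symbols, the toy corpus, and the add-k smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


def train_ngram_counts(sentences, n):
    """Count n-grams and their (n-1)-token contexts over tokenized sentences."""
    ngram_counts = Counter()
    context_counts = Counter()
    vocab = set()
    for tokens in sentences:
        # Pad so the first real token has a full context and the end is modeled.
        padded = ["<s>"] * (n - 1) + list(tokens) + ["</s>"]
        # "<s>" is never predicted, so it is left out of the vocabulary.
        vocab.update(tokens)
        vocab.add("</s>")
        for i in range(len(padded) - n + 1):
            ngram = tuple(padded[i:i + n])
            ngram_counts[ngram] += 1
            context_counts[ngram[:-1]] += 1
    return ngram_counts, context_counts, vocab


def ngram_logprob(ngram, ngram_counts, context_counts, vocab, k=1.0):
    """Natural-log P(last token | preceding tokens) with add-k smoothing (assumed)."""
    ngram = tuple(ngram)
    numerator = ngram_counts[ngram] + k
    denominator = context_counts[ngram[:-1]] + k * len(vocab)
    return math.log(numerator / denominator)


# Hypothetical toy corpus, already tokenized.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
counts, contexts, vocab = train_ngram_counts(corpus, n=2)
print(ngram_logprob(("the", "cat"), counts, contexts, vocab))
```

Smoothing keeps unseen n-grams from getting a probability of zero (and a log probability of negative infinity). For real corpora, an established toolkit with better smoothing is likely the safer choice.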

Hi Brendan:

Thanks for the response. I found the following answer on Stack Overflow that addresses what I want:
https://stackoverflow.com/questions/54941966/how-can-i-calculate-perplexity-using-nltk

The Moses statistical translation package also has the tools I need. It is a little inconvenient since I am programming in Python (I am using the constituency parser), but it should do what I want. Here is a link: https://masatohagiwara.net/training-an-n-gram-language-model-and-estimating-sentence-probability.html

Maybe it would be nice to add this to AllenNLP?

Again, thanks!

Cheers,
Andrew

Hi Andrew,

Glad you have something working! I think we’re unlikely to implement this in AllenNLP, given that we’re focused on deep models and there’s already Python support from NLTK. That said, if you’re interested in making a feature request for this or anything else, you can file a GitHub issue here: https://github.com/allenai/allennlp/issues/new/choose.

Best,
Brendan