
COMBO

A language-independent NLP system for dependency parsing, part-of-speech tagging, lemmatisation, and more, built on top of PyTorch and AllenNLP.



Quick start

Install COMBO from the CLARIN-PL package index (we suggest creating a virtualenv or conda environment with Python 3.6+, since a bundle of required packages will be installed):

pip install -U pip setuptools wheel
pip install --index-url https://pypi.clarin-pl.eu/simple combo

Run the following commands in your Python console to make predictions with a pre-trained model:

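from combo.predict import COMBO

# Load a pre-trained model by name.
nlp = COMBO.from_pretrained("polish-herbert-base")
# Parse one Polish sentence ("COVID-19 is an acute infectious respiratory disease caused by infection with the SARS-CoV-2 virus.").
sentence = nlp("COVID-19 to ostra choroba zakaźna układu oddechowego wywołana zakażeniem wirusem SARS-CoV-2.")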

Predictions are accessible as a list of token attributes:

# Print a header row, then one aligned row per predicted token.
row = "{:5} {:15} {:15} {:10} {:10} {:10}"
print(row.format('ID', 'TOKEN', 'LEMMA', 'UPOS', 'HEAD', 'DEPREL'))
for token in sentence.tokens:
    print(row.format(str(token.id), token.token, token.lemma, token.upostag, str(token.head), token.deprel))
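
If the predictions need to be passed to another tool rather than printed, the same token attributes can be collected into tab-separated rows. The following is a minimal sketch, not part of COMBO itself: the helper name `tokens_to_tsv` is hypothetical, and it assumes only the attributes used above (`id`, `token`, `lemma`, `upostag`, `head`, `deprel`). The demo tokens are made-up stand-ins so that the snippet runs without downloading a model.

```python
from types import SimpleNamespace


def tokens_to_tsv(tokens):
    """Render tokens as tab-separated rows: ID, TOKEN, LEMMA, UPOS, HEAD, DEPREL.

    Hypothetical helper; works on any objects exposing the six attributes below.
    """
    rows = []
    for t in tokens:
        fields = (t.id, t.token, t.lemma, t.upostag, t.head, t.deprel)
        rows.append("\t".join(str(f) for f in fields))
    return "\n".join(rows)


# Made-up stand-in tokens (not real model output); with COMBO installed,
# pass `sentence.tokens` from the prediction step instead.
demo_tokens = [
    SimpleNamespace(id=1, token="To", lemma="to", upostag="PRON", head=2, deprel="nsubj"),
    SimpleNamespace(id=2, token="działa", lemma="działać", upostag="VERB", head=0, deprel="root"),
]

print(tokens_to_tsv(demo_tokens))
```

Note that this produces only six columns, so it is not a complete CoNLL-U file; it is meant as a lightweight export of the fields shown in the table above.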

Details