    COMBO

    A language-independent NLP system for dependency parsing, part-of-speech tagging, lemmatisation, and more, built on top of PyTorch and AllenNLP.


    License

    Quick start

    Clone this repository and install COMBO. We suggest creating a virtualenv or conda environment with Python 3.6+ first, because a bundle of required packages will be installed:

    pip install -U pip setuptools wheel
    pip install --index-url https://pypi.clarin-pl.eu/simple combo==1.0.3

    Run the following commands in your Python console to make predictions with a pre-trained model:

    from combo.predict import COMBO
    
    nlp = COMBO.from_pretrained("polish-herbert-base")
    sentence = nlp("COVID-19 to ostra choroba zakaźna układu oddechowego wywołana zakażeniem wirusem SARS-CoV-2.")

    Predictions are accessible as a list of token attributes:

    print("{:5} {:15} {:15} {:10} {:10} {:10}".format('ID', 'TOKEN', 'LEMMA', 'UPOS', 'HEAD', 'DEPREL'))
    for token in sentence.tokens:
        print("{:5} {:15} {:15} {:10} {:10} {:10}".format(str(token.id), token.token, token.lemma, token.upostag, str(token.head), token.deprel))
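
The per-token attributes can also be used to navigate the dependency tree, not only to print it. The sketch below is illustrative and does not call COMBO: `Token` is a hypothetical stand-in that exposes the same attribute names the snippet above reads (`id`, `token`, `lemma`, `upostag`, `head`, `deprel`), the helpers `format_tokens` and `children_of` are not part of the COMBO API, and the three example rows are hand-written, not model output. It also assumes the CoNLL-U convention that a `head` of 0 marks the root.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    # Hypothetical stand-in with the attribute names used in the README snippet.
    id: int
    token: str
    lemma: str
    upostag: str
    head: int
    deprel: str


# Same column layout as the README's print statement.
ROW = "{:5} {:15} {:15} {:10} {:10} {:10}"


def format_tokens(tokens: List[Token]) -> str:
    """Render a header plus one aligned row per token."""
    lines = [ROW.format("ID", "TOKEN", "LEMMA", "UPOS", "HEAD", "DEPREL")]
    for t in tokens:
        lines.append(
            ROW.format(str(t.id), t.token, t.lemma, t.upostag, str(t.head), t.deprel)
        )
    return "\n".join(lines)


def children_of(tokens: List[Token], head_id: int) -> List[Token]:
    """Return the tokens whose head is `head_id` (0 is assumed to be the root)."""
    return [t for t in tokens if t.head == head_id]


# Hand-written illustrative annotation, not real model output.
example = [
    Token(1, "Ala", "Ala", "PROPN", 2, "nsubj"),
    Token(2, "ma", "mieć", "VERB", 0, "root"),
    Token(3, "kota", "kot", "NOUN", 2, "obj"),
]

print(format_tokens(example))
print([t.token for t in children_of(example, 2)])
```

With a real prediction, the same helpers should work on `sentence.tokens` as long as the token objects expose these attributes, which is what the snippet above suggests.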

    COMBO tutorial

    We encourage you to work through the beginner's tutorial (a Colab notebook).

    Details