@@ -10,7 +10,7 @@ LAMBO is a machine learning model, which means it was trained to recognise bound
...
@@ -10,7 +10,7 @@ LAMBO is a machine learning model, which means it was trained to recognise bound
LAMBO was developed in context of dependency parsing. Thus, it includes models trained on [Universal Dependencies treebanks](https://universaldependencies.org/#language-), uses `.conllu` as the training [data format](https://universaldependencies.org/conll18/evaluation.html) and supports integration with [COMBO](https://gitlab.clarin-pl.eu/syntactic-tools/combo), a state-of-the-art system for dependency parsing and more. However, you can use LAMBO as the first stage of any NLP process.
LAMBO was developed in context of dependency parsing. Thus, it includes models trained on [Universal Dependencies treebanks](https://universaldependencies.org/#language-), uses `.conllu` as the training [data format](https://universaldependencies.org/conll18/evaluation.html) and supports integration with [COMBO](https://gitlab.clarin-pl.eu/syntactic-tools/combo), a state-of-the-art system for dependency parsing and more. However, you can use LAMBO as the first stage of any NLP process.
LAMBO currently includes models trained on 98 corpora in 53 languages. The full list is available in `[languages.txt](src/lambo/resources/languages.txt)`. For each of these, two model variants are available:
LAMBO currently includes models trained on 98 corpora in 53 languages. The full list is available in [`languages.txt`](src/lambo/resources/languages.txt). For each of these, two model variants are available:
- simple LAMBO, trained on the UD corpus
- simple LAMBO, trained on the UD corpus
- pretrained LAMBO, same as above, but starting from weights pre-trained on unsupervised masked character prediction using multilingual corpora from [OSCAR](https://oscar-corpus.com/).
- pretrained LAMBO, same as above, but starting from weights pre-trained on unsupervised masked character prediction using multilingual corpora from [OSCAR](https://oscar-corpus.com/).