diff --git a/README.md b/README.md
index 0c62b795fa29e69016f7e728c9e8d8ec023de8ad..f11a6638ce646ead39a2af97a096ddd2aa60c112 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ LAMBO is a machine learning model, which means it was trained to recognise bound
 
 LAMBO was developed in context of dependency parsing. Thus, it includes models trained on [Universal Dependencies treebanks](https://universaldependencies.org/#language-), uses `.conllu` as the training [data format](https://universaldependencies.org/conll18/evaluation.html) and supports integration with [COMBO](https://gitlab.clarin-pl.eu/syntactic-tools/combo), a state-of-the-art system for dependency parsing and more. However, you can use LAMBO as the first stage of any NLP process.
 
-LAMBO currently includes models trained on 98 corpora in 53 languages. The full list is available in [languages.txt](src/lambo/data/languages.txt). For each of these, two model variants are available:
+LAMBO currently includes models trained on 98 corpora in 53 languages. The full list is available in [languages.txt](src/lambo/resources/languages.txt). For each of these, two model variants are available:
 - simple LAMBO, trained on the UD corpus
 - pretrained LAMBO, same as above, but starting from weights pre-trained on unsupervised masked character prediction using multilingual corpora from [OSCAR](https://oscar-corpus.com/).
 
@@ -43,9 +43,9 @@ Now you need to create a segmenter by providing the language your text is in, e.
 ```
 lambo = Lambo.get('English')
 ```
-This will (if necessary) download the appropriate model from the online repository and load it. Note that you can use any language name (e.g. `Ancient_Greek`) or ISO 639-1 code (e.g. `fi`) from [languages.txt](src/lambo/data/languages.txt).
+This will (if necessary) download the appropriate model from the online repository and load it. Note that you can use any language name (e.g. `Ancient_Greek`) or ISO 639-1 code (e.g. `fi`) from [languages.txt](src/lambo/resources/languages.txt).
 
-Alternatively, you can select a specific model by defining LAMBO variant (`LAMBO` or `LAMBO_no_pretraining`) and training dataset from [languages.txt](src/lambo/data/languages.txt):
+Alternatively, you can select a specific model by defining LAMBO variant (`LAMBO` or `LAMBO_no_pretraining`) and training dataset from [languages.txt](src/lambo/resources/languages.txt):
 ```
 lambo = Lambo.get('LAMBO-UD_Polish-PDB')
 ```
@@ -136,4 +136,4 @@ If you use LAMBO in your research, please cite it as software:
 
 ## License
 
-This project is licensed under the GNU General Public License v3.0.
\ No newline at end of file
+This project is licensed under the GNU General Public License v3.0.