diff --git a/docs/models.md b/docs/models.md
index e7a62cff3089dd9cc69bffe17b31e91fc86853f8..1c2eff8fbc34789baa68a24917e0ee48c65dacf1 100644
--- a/docs/models.md
+++ b/docs/models.md
@@ -5,7 +5,16 @@ COMBO provides pre-trained models for:
 - enhanced dependency parsing trained on IWPT 2020 shared task [data](https://universaldependencies.org/iwpt20/data.html) ([Bouma et al. 2020](https://www.aclweb.org/anthology/2020.iwpt-1.16.pdf)).
 
 ## Pre-trained models
+<!---
 All **pre-trained models** for different languages and their **evaluation results** are listed in the spreadsheets: [UD-trained COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit?usp=sharing) and [enhanced COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=1757180324).
+-->
+**Morphosyntactic prediction models** trained on the selected UD treebanks version 2.7 and their **evaluation results** are listed in [Model performance (UD2.7)](https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/blob/master/docs/performance.md) table.
+
+**Morphosyntactic prediction models** trained on the selected UD treebanks version 2.5 and **enhanced parsing models** are listed in the spreadsheets: [UD2.5-trained COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=0) and [enhanced COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=1757180324).
+
+<!---
+Please notice that the name in the brackets matches the name used in [Automatic Download](models.md#Automatic download).)
+-->
 
 ### License
 Models are distributed under the same license as datasets used for their training.
@@ -14,8 +23,11 @@ See [Universal Dependencies v2.7 License Agreement](https://lindat.mff.cuni.cz/r
 
 
 ## Automatic download
-The pre-trained models can be automatically downloaded with the `from_pretrained` method in the Python mode. Select a model name from the pre-trained model lists (see the column **Model name** in [UD-trained COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit?usp=sharing) and [enhanced COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=1757180324)) and pass the name as an attribute of the `from_pretrained` method:
+The pre-trained models can be automatically downloaded with the `from_pretrained` method in the Python mode. Select the model name of a pre-trained model (see the column **Model name** in [Model performance (UD2.7)](https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/blob/master/docs/performance.md), [UD2.5-trained COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=0) and [enhanced COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=1757180324)) and pass the name as an attribute of the `from_pretrained` method:
 
+<!---
+[pre-trained models](http://mozart.ipipan.waw.pl/~mklimaszewski/models/) and pass the name as the attribute to `from_pretrained` method:
+--->
 ```python
 from combo.predict import COMBO
 
@@ -23,9 +35,20 @@ nlp = COMBO.from_pretrained("polish-herbert-base")
 ```
 If the model name doesn't match any model on the pre-trained model lists, COMBO looks for a model in local env.
 
+<!---
+ of [pre-trained models](c), COMBO looks for a model in the local env.
+--->
+
 ## Manual download
-The pre-trained models can be manually downloaded to a local disk with the `wget` package. You need to manually download a pre-trained model, if you want to use COMBO in the command-line mode. The links to the pre-trained models are listed in the column **Model link** in [UD-trained COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit?usp=sharing) and [enhanced COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=1757180324).
+If you want to use COMBO in the command-line mode, you need to manually download a pre-trained model. The pre-trained models can be manually downloaded to a local disk with the `wget` package. The links to the pre-trained models are listed in the column **Model name** in [Model performance (UD2.7)](https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/blob/master/docs/performance.md), or **Model link** in [UD2.5-trained COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=0) and [enhanced COMBO models](https://docs.google.com/spreadsheets/d/1WFYc2aLRa1jw7le030HOacv9fc4zmtqiZtRQY6gl5mc/edit#gid=1757180324).
 
+
+<!---
+from [here](http://mozart.ipipan.waw.pl/~mklimaszewski/models/).
+
+
+If you want to use the console version of COMBO, you need to download a pre-trained model manually:
+--->
 ```bash
 wget http://mozart.ipipan.waw.pl/~mklimaszewski/models/polish-herbert-base.tar.gz