Training
Basic command:
combo --mode train \
--training_data_path your_training_path \
--validation_data_path your_validation_path
Options:
combo --helpfull
Examples (for clarity without training/validation data paths):
-
train on GPU 0:
combo --mode train --cuda_device 0
-
use pretrained embeddings:
combo --mode train --pretrained_tokens your_pretrained_embeddings_path --embedding_dim your_embeddings_dim
-
use pretrained transformer embeddings:
combo --mode train --pretrained_transformer_name your_chosen_pretrained_transformer
-
train only a dependency parser:
combo --mode train --targets head,deprel
-
use additional features (e.g. part-of-speech tags) for training a dependency parser (token and char are default features):
combo --mode train --targets head,deprel --features token,char,upostag
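The options above can presumably be combined in a single call together with the data paths, since they are ordinary command-line flags. A minimal sketch, assuming the flags do not conflict with each other; the function name train_parser is hypothetical, while every flag is taken from the examples above:

```shell
# Hypothetical wrapper: trains only a dependency parser on GPU 0,
# adding part-of-speech tags to the default token and char features.
# $1 = training data path, $2 = validation data path
train_parser() {
  combo --mode train \
    --training_data_path "$1" \
    --validation_data_path "$2" \
    --cuda_device 0 \
    --targets head,deprel \
    --features token,char,upostag
}

# Usage (placeholder paths):
# train_parser your_training_path your_validation_path
```

Wrapping the call in a function keeps the flag set in one place, so the same configuration can be reused for several data sets.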
Enhanced Dependencies
Enhanced Dependencies are described here. Training an enhanced graph prediction model requires data pre-processing.
Data pre-processing
The organisers of the IWPT20 shared task distributed the data sets together with a data pre-processing script, enhanced_collapse_empty_nodes.pl. If you wish to train a model on IWPT20 data, apply this script to both the training and the validation data sets before training the COMBO EUD model.
perl enhanced_collapse_empty_nodes.pl training.conllu > training.fixed.conllu
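The command above handles a single file. Since the training and validation sets need the same treatment, a small loop can cover both. A sketch, assuming the input files end in .conllu and that the script sits in the current directory; the function name collapse_empty_nodes and the output naming scheme are hypothetical:

```shell
# Apply the IWPT20 pre-processing script to every CoNLL-U file passed as an argument.
# Each input X.conllu produces X.fixed.conllu in the same directory.
collapse_empty_nodes() {
  for input in "$@"; do
    output="${input%.conllu}.fixed.conllu"
    # Stop at the first file the script fails on.
    perl enhanced_collapse_empty_nodes.pl "$input" > "$output" || return 1
  done
}

# Usage (placeholder file names):
# collapse_empty_nodes training.conllu validation.conllu
```

The resulting *.fixed.conllu files are the ones to pass as the pre-processed training and validation paths.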
Training EUD model
combo --mode train \
--training_data_path your_preprocessed_training_path \
--validation_data_path your_preprocessed_validation_path \
--targets feats,upostag,xpostag,head,deprel,lemma,deps \
--config_path config.graph.template.jsonnet
Configuration
Advanced
The config template config.template.jsonnet follows the allennlp format, so you can freely modify it.
It contains the configuration for all training/model parameters (learning rates, number of epochs, etc.).
Some of them use jsonnet syntax to read their values from configuration flags; most, however, can be modified directly in the file.
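One possible workflow is to leave the template untouched and edit a copy, then pass the copy with --config_path (the flag used in the EUD training command above). A sketch under that assumption; the function names and the file name my_config.jsonnet are hypothetical:

```shell
# Copy the template once; an existing copy is left alone so edits are not overwritten.
# $1 = template path, $2 = path of the editable copy
prepare_config() {
  [ -f "$2" ] || cp "$1" "$2"
}

# Train with the edited copy of the config.
# $1 = training data path, $2 = validation data path, $3 = config path
train_with_config() {
  combo --mode train \
    --training_data_path "$1" \
    --validation_data_path "$2" \
    --config_path "$3"
}

# Usage (placeholder paths):
# prepare_config config.template.jsonnet my_config.jsonnet
# ... edit my_config.jsonnet (learning rates, number of epochs, etc.) ...
# train_with_config your_training_path your_validation_path my_config.jsonnet
```

Keeping the original template intact makes it easy to compare a modified configuration against the defaults later.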