From 82ba2f40f0366cf631ebdfc621d37b4f344232a5 Mon Sep 17 00:00:00 2001
From: Maja Jablonska <majajjablonska@gmail.com>
Date: Mon, 20 Nov 2023 22:39:49 +1100
Subject: [PATCH] Prediction.md and Troubleshooting.md

---
 docs/Prediction.md      | 66 +++++++++++++++++++++++++++++++++++++++++
 docs/Troubleshooting.md | 13 ++++++++
 2 files changed, 79 insertions(+)
 create mode 100644 docs/Prediction.md
 create mode 100644 docs/Troubleshooting.md

diff --git a/docs/Prediction.md b/docs/Prediction.md
new file mode 100644
index 0000000..7f836de
--- /dev/null
+++ b/docs/Prediction.md
@@ -0,0 +1,66 @@
+# Prediction
+
+## COMBO as a Python library
+
+The pre-trained models can be downloaded automatically with the `from_pretrained`
+method. Select a model name from the list of UD-trained COMBO models and pass it as the argument of `from_pretrained`:
+
+```python
+from combo.predict import COMBO
+
+nlp = COMBO.from_pretrained("model-prototype")
+sentence = nlp("Sentence to parse.")
+```
+
+You can also load your own COMBO model:
+
+```python
+from combo.predict import COMBO
+
+model_path = "your_model.tar.gz"
+nlp = COMBO.from_pretrained(model_path)
+sentence = nlp("Sentence to parse.")
+```
+
+COMBO also accepts pre-segmented sentences (or texts):
+
+```python
+from combo.predict import COMBO
+
+model_path = "your_model.tar.gz"
+nlp = COMBO.from_pretrained(model_path)
+tokenized_sentence = ["Sentence", "to", "parse", "."]
+sentence = nlp([tokenized_sentence])
+```
+
+By default, COMBO uses the LAMBO tokenizer.
+
+## COMBO as a command-line interface
+
+Input and output are both in the `*.conllu` format.
+
+```bash
+combo --mode predict --model_path your_model_tar_gz --input_file your_conllu_file --output_file your_output_file
+```
+
+### Raw text prediction
+
+Works only for models whose training input was raw text.
+
+Input: one sentence per line.
+
+Output: a CoNLL-U file.
+
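+For example, a raw-text input file (one sentence per line; the sentences below are placeholders) might look like this:
+
+```
+First sentence to parse.
+Another sentence to parse.
+```
+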
+```bash
+combo --mode predict --model_path your_model_tar_gz --input_file your_text_file --output_file your_output_file --noconllu_format
+```
+
+### Console prediction
+
+Works only for models whose training input was raw text.
+
+Interactive testing in the console: the model is loaded once, and you type sentences directly.
+
+```bash
+combo --mode predict --model_path your_model_tar_gz --input_file "-"
+```
\ No newline at end of file
diff --git a/docs/Troubleshooting.md b/docs/Troubleshooting.md
new file mode 100644
index 0000000..37dd340
--- /dev/null
+++ b/docs/Troubleshooting.md
@@ -0,0 +1,13 @@
+# A few common problems
+
+## Downloading a model
+
+When downloading a model with the `from_pretrained` method, the downloaded file might be
+incomplete, e.g. due to a network error. The following error:
+
+```
+EOFError: Compressed file ended before the end-of-stream marker was reached
+```
+
+means that the cache directory (by default `$HOME/.combo`) contains a corrupted file.
+Deleting such a file and downloading the model again should help.
\ No newline at end of file
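
The clean-up step can be sketched in Python. This is a minimal sketch, not COMBO's own API; the `*.tar.gz` archive extension and the default `~/.combo` cache path are assumptions based on the defaults mentioned above:

```python
from pathlib import Path

# Default COMBO cache directory, as noted above (adjust if you
# configured a different location).
CACHE_DIR = Path.home() / ".combo"

def clear_combo_cache(cache_dir: Path = CACHE_DIR) -> list:
    """Delete cached model archives so COMBO re-downloads them."""
    removed = []
    for archive in cache_dir.glob("*.tar.gz"):
        archive.unlink()  # remove the (possibly corrupted) archive
        removed.append(archive.name)
    return sorted(removed)

# Example: clear_combo_cache(), then call COMBO.from_pretrained(...) again.
```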
-- 
GitLab