Commit d6961fef authored by Michal Pogoda

Language is now optional and defaults to 'pl'

parent 978ef6b8
Pipeline #3670 passed with stages
in 3 minutes and 16 seconds
......@@ -10,15 +10,17 @@ A service that automatically adds punctuation to raw word-stream (eg. from speec
 ## Config
 ```ini
 [deployment]
-device = cpu ; Device on which inference will be made (eg. cpu, cuda:0 etc)
-model_path = /model/punctuator ; Path where the model will be placed
-languagetool_path = /model/languagetool ; Path where languagetool server will be placed
+model_path_pl = /home/worker/model/punctuator_pl ; Path where the Polish model is located
+model_path_en = /home/worker/model/punctuator_en ; Path where the English model is located
+model_path_ru = /home/worker/model/punctuator_ru ; Path where the Russian model is located
+languagetool_path = /home/worker/model/languagetool ; Path where languagetool server will be placed
 max_context_size = 256 ; Number of tokens that will be considered in prediction at once. Must be in the range 2*overlap+1 to 512
 overlap = 20 ; Number of surrounding-context tokens taken at inference for each text fragment
+device = cpu ; Device on which inference will be made (eg. cpu, cuda:0 etc)
 ```
 ## LPMN
-Punctuator has one argument `language` with options: `pl` `ru` `en` :
+Punctuator has one optional argument `language` with options: `pl` `ru` `en` (defaults to `pl`):
 ```
 filedir(/users/michal.pogoda)|any2txt|punctuator({"language":"en"})
 ```
......
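Review note: the README states that `max_context_size` must lie between `2*overlap+1` and 512, but nothing in this diff shows that constraint being checked. A minimal sketch of such a check follows, assuming the file is parsed with the standard-library `configparser` and that `;` introduces inline comments. The helper name `load_deployment` and the sample `CONFIG_TEXT` are hypothetical and not part of the service.

```python
import configparser

# Hypothetical sample mirroring the README's [deployment] section.
CONFIG_TEXT = """
[deployment]
max_context_size = 256 ; tokens considered at once
overlap = 20 ; context tokens around each fragment
device = cpu ; inference device
"""


def load_deployment(text: str) -> dict:
    """Parse the [deployment] section and validate the window settings."""
    # inline_comment_prefixes lets "; comment" follow a value on the same line.
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(text)
    section = parser["deployment"]

    max_context_size = section.getint("max_context_size")
    overlap = section.getint("overlap")

    # Constraint from the README: 2*overlap+1 <= max_context_size <= 512.
    lower_bound = 2 * overlap + 1
    if not lower_bound <= max_context_size <= 512:
        raise ValueError(
            f"max_context_size={max_context_size} must be in "
            f"[{lower_bound}, 512] for overlap={overlap}"
        )

    return {
        "max_context_size": max_context_size,
        "overlap": overlap,
        "device": section.get("device", "cpu"),
    }
```

Failing at load time would surface a misconfigured window before any model is loaded, rather than during inference.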
......@@ -63,12 +63,14 @@ class Worker(nlp_ws.NLPWorker):
     def process(
         self, input_path: str, task_options: dict, output_path: str
     ) -> None:
-        if task_options['language'] == 'en':
+        language = task_options.get("language", "pl")
+        if language == 'en':
             bpe = True
         else:
             bpe = False
         tool, model, tokenizer, mapping = self.get_setup_for_language(
-            task_options['language'])
+            language)
         with open(input_path, "r") as f:
             text = f.read()
......
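Review note: the change relies on `dict.get` with a default, so a task without a `language` key no longer raises `KeyError`. The sketch below isolates that behaviour. The helper `resolve_language` and the `SUPPORTED_LANGUAGES` tuple are hypothetical; the explicit check for unsupported values is an assumption added for illustration and is not part of this commit.

```python
# Options listed in the README; the tuple itself is a hypothetical name.
SUPPORTED_LANGUAGES = ("pl", "en", "ru")


def resolve_language(task_options: dict) -> tuple:
    """Return (language, bpe) for a task, defaulting the language to 'pl'."""
    # dict.get returns the default when the key is absent, avoiding KeyError.
    language = task_options.get("language", "pl")

    # Assumption: reject unknown values early instead of failing later.
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")

    # Equivalent to the if/else in the diff: BPE is enabled only for English.
    bpe = language == "en"
    return language, bpe


print(resolve_language({}))                  # language omitted
print(resolve_language({"language": "en"}))  # language given explicitly
```

Note that `get` only covers a missing key: a task that passes `{"language": None}` would still bypass the default.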