Commit 0cf0f38f authored by piotrmp

Added NKJP1M / PDB corpus.

parent f557eb6b
Branches master
Pipeline #17332 passed with stage in 47 seconds
......@@ -92,6 +92,7 @@ UD_Persian-PerDT fa Persian *
UD_Persian-Seraji fa Persian
UD_Polish-LFG pl Polish
UD_Polish-PDB pl Polish *
UD_Polish-NKJP1M_PDB pl Polish
UD_Pomak-Philotis ? Pomak
UD_Portuguese-Bosque pt Portuguese
UD_Portuguese-CINTIL pt Portuguese *
......
......@@ -8,6 +8,7 @@ Uses the previous version of the file to translate language names to ISO codes.
- adding '?' as ISO code for the new languages outside the standard
- selected UD_German-GSD as default in place of UD_German-HDT, which lacks spacing information
- corrected language code for UD_Norwegian-Bokmaal from nn to no
- added UD_Polish-NKJP1M_PDB from https://huggingface.co/datasets/ipipan/nlprepl/tree/main/ud_tagset/fair_by_document_type/_conllu
"""
from pathlib import Path
......
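
The first hunk suggests each treebank entry has the form `<treebank> <ISO code> <language> [*]`. The following is a minimal sketch of reading such entries, not the script from this commit. It assumes that fields are whitespace-separated, that a trailing `*` marks the default treebank for a language, and that `?` stands for a language without a standard ISO code. The names `TreebankEntry`, `parse_entry`, and `defaults_by_language` are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class TreebankEntry(NamedTuple):
    name: str         # e.g. "UD_Polish-PDB"
    iso_code: str     # e.g. "pl", or "?" when no standard code exists
    language: str     # e.g. "Polish"
    is_default: bool  # True when the line ends with "*" (assumed meaning)


def parse_entry(line: str) -> Optional[TreebankEntry]:
    """Parse one list line; return None if it does not look like an entry."""
    fields = line.split()
    # A trailing "*" is assumed to flag the default treebank for the language.
    is_default = bool(fields) and fields[-1] == "*"
    if is_default:
        fields = fields[:-1]
    if len(fields) < 3:
        return None
    # Join the remaining fields so multi-word language names survive.
    return TreebankEntry(fields[0], fields[1], " ".join(fields[2:]), is_default)


def defaults_by_language(lines: Iterable[str]) -> Dict[str, str]:
    """Map each language name to the treebank flagged as its default."""
    defaults: Dict[str, str] = {}
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry.is_default:
            defaults[entry.language] = entry.name
    return defaults
```

Applied to the three Polish lines in the hunk above, this sketch would keep `UD_Polish-PDB` as the default, since the newly added `UD_Polish-NKJP1M_PDB` line carries no `*`.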