Commit 8864e268 authored by Jakub-Goluch

Add feature to check whether document has a valid json/jsonl format, add tests

parent 73c0b676
1 merge request: !4 Resolve "Read not only .txt files"
Pipeline #11286 failed
@@ -20,9 +20,12 @@ class EasymatcherWorker(nlp_ws.NLPWorker):
     It relies on the use of an easymatcher tool which can be found he under -
     https://gitlab.clarin-pl.eu/knowledge-extraction/tools/easymatcher
     """
     @staticmethod
-    def is_jsonl(document_path: str | Path) -> bool:
-        """Validates whether text file has json/jsonl structure and has "text" keyword"""
+    def is_jsonl(
+        document_path: str | Path
+    ) -> bool:
+        """Validates whether text file has json/jsonl structure and has "text" keyword."""
         try:
             with open(document_path, 'r', encoding="utf-8") as file:
                 for line in file:
......
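
The diff is cut off after the loop header, so the body of is_jsonl is not visible here. As a rough sketch of what a check like the one described in the docstring could look like, the function below assumes that every non-empty line must parse as a JSON object containing a "text" key. The name looks_like_jsonl, the skipping of blank lines, and the rejection of empty files are all assumptions made for illustration, not the commit's actual code.

```python
from __future__ import annotations

import json
from pathlib import Path


def looks_like_jsonl(document_path: str | Path) -> bool:
    """Hypothetical check: each non-empty line is a JSON object with a "text" key."""
    try:
        with open(document_path, "r", encoding="utf-8") as file:
            seen_record = False
            for line in file:
                if not line.strip():
                    continue  # assumption: blank lines are ignored
                record = json.loads(line)
                # A bare string or list is valid JSON but carries no "text" key.
                if not isinstance(record, dict) or "text" not in record:
                    return False
                seen_record = True
            return seen_record  # assumption: a file with no records is not valid
    except (OSError, UnicodeDecodeError, json.JSONDecodeError):
        # Unreadable file, wrong encoding, or a line that is not valid JSON.
        return False
```

Returning False on any parsing or I/O error keeps the caller's logic simple (for example, falling back to plain-text handling), at the cost of hiding why the file was rejected.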