nlpworkers / anonymizer
Commit 76cba42e, authored 1 year ago by Bartlomiej
Add clarin_json
Parent: 5055c2b1
1 merge request: !11 Clarin json support
Pipeline #14190 failed with stages in 17 seconds
Showing 1 changed file with 4 additions and 6 deletions
src/pipeline/sequential_jsonl.py (+4 −6), view file @ 76cba42e
@@ -7,7 +7,7 @@ from src.input_parsers.interface import InputParser
 from src.pipeline.interface import Pipeline
 from src.replacers.interface import ReplacerInterface
 from src.suppressors.interface import Suppressor
-
+import clarin_json
 class SequentialJSONLPipeline(Pipeline):
     """
     Pipeline that runs the whole anonymization process on jsonl-splitted input.
@@ -55,12 +55,10 @@ class SequentialJSONLPipeline(Pipeline):
         """
         result = []
-        with open(input_path, "r") as f:
-            for line in f.readlines():
-                if line.strip() == "":
-                    continue
-                parsed_input = self._input_parser.parse(line)
+        with clarin_json.open(input_path, 'r') as f:
+            for line in f:
+                parsed_input = self._input_parser.parse(line)
                 detected_entities = []
                 for detector_name, detector in self._detectors.items():
                     detected_entities += detector.detect(
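Review note: the removed loop skipped blank lines explicitly before parsing, and the new loop has no such guard. Nothing in this diff shows whether `clarin_json.open` filters blank lines itself. For reference, here is a minimal stdlib-only sketch of the old behavior. The helper name `iter_jsonl_records` is hypothetical, and `json.loads` only stands in for `self._input_parser.parse`, whose implementation is not part of this commit.

```python
import io
import json
from typing import Iterable, Iterator


def iter_jsonl_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-blank line (hypothetical helper).

    Mirrors the removed loop: blank lines are skipped before parsing.
    json.loads is an assumption standing in for the project's input parser.
    """
    for line in lines:
        if line.strip() == "":
            continue
        yield json.loads(line)


# Sample input with an empty line between two records.
sample = io.StringIO('{"text": "Jan Kowalski"}\n\n{"text": "Anna Nowak"}\n')
records = list(iter_jsonl_records(sample))
print(len(records))
```

If the new reader passes blank lines through unchanged, the parser would receive an empty string for the middle line of this sample, so a regression test with a blank line in the input may be worth adding.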