Commit acb98ee6, authored 1 year ago by Maja Jablonska
Add a corrected misc column
Parent: 142eead6
Merge request: !46 Merge COMBO 3.0 into master
combo/data/api.py (13 additions, 1 deletion)
@@ -86,7 +86,7 @@ def serialize_field(field: Any) -> str:
     return "{}".format(field)
 def serialize_token_list(tokenlist: conllu.models.TokenList) -> str:
-    KEYS_ORDER = ['idx', 'text', 'lemma', 'upostag', 'xpostag', 'entity_type', 'feats', 'head', 'deprel', 'deps', 'misc']
+    KEYS_ORDER = ['idx', 'text', 'lemma', 'upostag', 'xpostag', 'feats', 'head', 'deprel', 'deps']
     lines = []
     if tokenlist.metadata:
@@ -99,6 +99,18 @@ def serialize_token_list(tokenlist: conllu.models.TokenList) -> str:
     for token_data in tokenlist:
         line = '\t'.join(serialize_field(token_data[k]) for k in KEYS_ORDER)
+        serialized_misc = serialize_field(token_data['misc'])
+        serialized_entity_type = serialize_field(token_data['entity_type'])
+        if serialized_misc == '_' and serialized_entity_type == '_':
+            serialized_last_column = '_'
+        elif serialized_misc == '_':
+            serialized_last_column = serialized_entity_type
+        elif serialized_entity_type == '_':
+            serialized_last_column = serialized_misc
+        else:
+            serialized_last_column = serialized_entity_type + '|' + serialized_misc
+        line += '\t' + serialized_last_column
         lines.append(line)
     return '\n'.join(lines) + "\n\n"
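The branch added in this commit folds the serialized `entity_type` and `misc` values into a single trailing column, treating `'_'` as the empty marker. As a reading aid, that logic can be sketched as a standalone function. The name `merge_last_column` and the sample values `B-PER` and `SpaceAfter=No` are hypothetical and do not appear in the commit, which keeps the logic inline in `serialize_token_list`.

```python
EMPTY = '_'  # marker the diff compares against for an unset field


def merge_last_column(serialized_entity_type: str, serialized_misc: str) -> str:
    """Combine two already-serialized fields into one trailing column.

    Hypothetical helper mirroring the if/elif/else chain in the commit:
    - both empty      -> '_'
    - only misc empty -> entity type
    - only type empty -> misc
    - neither empty   -> 'entity_type|misc'
    """
    if serialized_misc == EMPTY and serialized_entity_type == EMPTY:
        return EMPTY
    if serialized_misc == EMPTY:
        return serialized_entity_type
    if serialized_entity_type == EMPTY:
        return serialized_misc
    return serialized_entity_type + '|' + serialized_misc


if __name__ == "__main__":
    # Sample values are illustrative only.
    for entity_type, misc in [('_', '_'), ('B-PER', '_'),
                              ('_', 'SpaceAfter=No'), ('B-PER', 'SpaceAfter=No')]:
        print(repr(entity_type), repr(misc), '->',
              repr(merge_last_column(entity_type, misc)))
```

Pulling the branch into a pure function like this would make the four cases easy to unit test; whether that refactoring suits the project is a judgment call for the maintainers.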