Commit 39b3f106
Authored 3 years ago by Jarema Radom

fix for bpe related decoding

Parent: 2b98e2b1
Branch: en-ru-support
Merge request: !16 (S3 synchronization and CI)
Pipeline #3398 passed with stages in 2 minutes and 44 seconds

Showing 1 changed file with 3 additions and 2 deletions:

punctuator/punctuator.py (+3, -2)

@@ -35,6 +35,8 @@ def decode(tokens, labels_decoded, tokenizer, bpe=False):
     for label, token in zip(labels_decoded, tokens):
         if bpe:
             token_str = tokenizer.decode(token)
+            if token_str.startswith(" "):
+                token_str = token_str[1:]
         else:
             token_str = tokenizer.convert_ids_to_tokens([token])[0]
         if token_str == "[PAD]":
@@ -43,8 +45,7 @@ def decode(tokens, labels_decoded, tokenizer, bpe=False):
             word.append(token_str.replace("##", ""))
         else:
             if len(word) > 0:
-                if not bpe or word_end != ' ':
-                    word.append(word_end)
+                word.append(word_end)
                 text_recovered.append("".join(word))
                 word = []
             if label.startswith("__ALL_UPPER__"):
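
Note on the change: the hunks above move the handling of the BPE word-boundary space from the join step to the token itself. The leading space is stripped right after decoding, and word_end is then appended unconditionally. The sketch below illustrates that idea in isolation. It is not the project's decode(): the helper names strip_bpe_space and join_words, and the rule that a piece starting with a space opens a new word, are assumptions made for the example (the real function also consumes labels, which are not modelled here).

```python
def strip_bpe_space(token_str: str) -> str:
    """Drop the single leading space that marks a word-initial BPE piece."""
    if token_str.startswith(" "):
        return token_str[1:]
    return token_str


def join_words(decoded_pieces, word_end=" "):
    """Group decoded BPE pieces into words and join the words with word_end.

    Assumption for this example only: a piece that starts with a space
    opens a new word, any other piece continues the current word.
    """
    words, current = [], []
    for piece in decoded_pieces:
        # A leading space signals a word boundary; flush the word built so far.
        if piece.startswith(" ") and current:
            words.append("".join(current))
            current = []
        # Store the piece without its boundary space, so the separator is
        # added exactly once, by the join below.
        current.append(strip_bpe_space(piece))
    if current:
        words.append("".join(current))
    return word_end.join(words)


if __name__ == "__main__":
    print(join_words(["Hello", " wor", "ld", " again"]))
```

Because the space no longer travels with the token text, the separator can be added without a BPE-specific check, which is what removing the `if not bpe or word_end != ' '` condition amounts to.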