Commit ebc3b872
Authored 3 years ago by Michał Marcińczuk

Add separator between sentences.

Parent: f46ac273
1 merge request: !41 Dev v07
Showing 1 changed file with 10 additions and 1 deletion

poldeepner2/utils/sequences.py (+10, −1)
@@ -37,6 +37,13 @@ class SequenceFeatures(InputFeatures):
         self.valid_ids = []
         self.append(0, 0, LABEL_IGNORE_ID, 0)  # adding <s>
 
+    def add_cls(self):
+        self.append(0, 0, LABEL_IGNORE_ID, 0)  # adding </s>
+
+    def add_separator(self):
+        if self.input_ids[-1] != 2:
+            self.append(2, 0, LABEL_IGNORE_ID, 0)  # adding </s>
+
     def append(self, token_id: int, input_mask: int, label_id: int, valid_id: int):
         self.input_ids.append(token_id)
         self.input_mask.append(input_mask)
@@ -50,7 +57,7 @@ class SequenceFeatures(InputFeatures):
         self.valid_ids.extend(token.valid_ids)
 
     def close_and_fill(self, max_length=128):
-        self.append(2, 0, LABEL_IGNORE_ID, 0)  # adding </s>
+        self.add_separator()
         while len(self.input_ids) < max_length:
             self.append(1, 0, LABEL_IGNORE_ID, 0)  # adding padding
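
The effect of the two hunks above can be sketched in isolation. This is a minimal sketch, not the repository's class: `MiniSequence` is a hypothetical, reduced stand-in for `SequenceFeatures`, the `label_ids` attribute name is assumed, the value of `LABEL_IGNORE_ID` is a placeholder, and the ids 0, 1 and 2 are taken to mean `<s>`, padding and `</s>` as the comments in the diff suggest.

```python
LABEL_IGNORE_ID = -100  # placeholder; the real constant is defined in poldeepner2


class MiniSequence:
    """Hypothetical, reduced stand-in for SequenceFeatures."""

    def __init__(self):
        self.input_ids = []
        self.input_mask = []
        self.label_ids = []  # attribute name assumed for this sketch
        self.valid_ids = []
        self.append(0, 0, LABEL_IGNORE_ID, 0)  # <s>

    def append(self, token_id: int, input_mask: int, label_id: int, valid_id: int):
        self.input_ids.append(token_id)
        self.input_mask.append(input_mask)
        self.label_ids.append(label_id)
        self.valid_ids.append(valid_id)

    def add_separator(self):
        # Append </s> only if the sequence does not already end with one.
        if self.input_ids[-1] != 2:
            self.append(2, 0, LABEL_IGNORE_ID, 0)

    def close_and_fill(self, max_length: int = 128):
        # Close the sequence, then pad up to max_length.
        self.add_separator()
        while len(self.input_ids) < max_length:
            self.append(1, 0, LABEL_IGNORE_ID, 0)


seq = MiniSequence()
seq.append(100, 1, 3, 1)          # one ordinary token
seq.add_separator()               # sentence boundary
seq.close_and_fill(max_length=8)  # does not add a second </s>
print(seq.input_ids)              # [0, 100, 2, 1, 1, 1, 1, 1]
```

Because `close_and_fill` now goes through `add_separator`, a sequence whose last token already closed a sentence ends with a single `</s>` rather than two.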
@@ -99,6 +106,8 @@ def convert_examples_to_features_sq(examples: List[InputExample], label_list: Li
                 features.append(sf)
                 sf = SequenceFeatures()
             sf.add_token(tf)
+            if tf in tokend_ending_sequence:
+                sf.add_separator()
         if sf.length() > 1:
             sf.close_and_fill(max_seq_length)
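
The loop change in `convert_examples_to_features_sq` can be illustrated on plain id lists. This is a sketch under stated assumptions: the diff does not show how `tokend_ending_sequence` is populated, so the hypothetical `join_sentences` helper below takes a set of sentence-final token positions instead, and it leaves out the splitting of over-long sequences that the surrounding function appears to handle.

```python
from typing import List, Set

BOS_ID, PAD_ID, SEP_ID = 0, 1, 2  # assumed ids, following the comments in the diff


def join_sentences(token_ids: List[int], sentence_final: Set[int],
                   max_length: int = 16) -> List[int]:
    """Build one input sequence, inserting SEP_ID after each sentence-final token."""
    ids = [BOS_ID]
    for position, token_id in enumerate(token_ids):
        ids.append(token_id)
        # Mirrors `if tf in tokend_ending_sequence: sf.add_separator()`.
        if position in sentence_final and ids[-1] != SEP_ID:
            ids.append(SEP_ID)
    # Mirrors close_and_fill: close once, then pad.
    if ids[-1] != SEP_ID:
        ids.append(SEP_ID)
    ids.extend([PAD_ID] * (max_length - len(ids)))
    return ids


# Two sentences: tokens 11 12 13 and tokens 21 22.
ids = join_sentences([11, 12, 13, 21, 22], sentence_final={2, 4}, max_length=12)
print(ids)  # [0, 11, 12, 13, 2, 21, 22, 2, 1, 1, 1, 1]
```

The last sentence boundary coincides with the end of the sequence, so the closing step adds no further separator.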