Analysers / WCCL
Commit 5630c8b4, authored 4 years ago by Grzegorz Kostkowski
Extend mwereader to allow setting custom annotation name
Parent: 011e9eac
Merge requests: !7 (develop into master), !6 (Param ann)
Showing 2 changed files with 24 additions and 4 deletions:
    src/libmwereader/mwereader.cpp    +16 -4
    src/libmwereader/mwereader.h      +8 -0
src/libmwereader/mwereader.cpp (+16 -4)
@@ -37,6 +37,7 @@ bool MWEReader::registered = TokenReader::register_path_reader<MWEReader>(
     : TokenReader(tagset), inner_filename_(filename)
 {
     mwes_counter = 0;
+    chan_ann_name = "mwe";
 }

 MWEReader::MWEReader(const Tagset& tagset, const std::string& filename,
@@ -45,6 +46,7 @@ bool MWEReader::registered = TokenReader::register_path_reader<MWEReader>(
 {
     mwes_counter = 0;
     inner_reader_ = reader;
+    chan_ann_name = "mwe";
 }

 void MWEReader::setFile(const std::string& filename)
@@ -58,6 +60,16 @@ bool MWEReader::registered = TokenReader::register_path_reader<MWEReader>(
     // TODO implementataion
 }

+void set_annotation_channel(const std::string& chan_name)
+{
+    chan_ann_name = chan_name;
+}
+
+std::string get_annotation_channel_base_name()
+{
+    return chan_ann_name + "_base";
+}
+
 Token* MWEReader::get_next_token()
 {
     if (currentSentence->empty())
@@ -104,11 +116,11 @@ bool MWEReader::registered = TokenReader::register_path_reader<MWEReader>(
     // create 'mwe' channel if not exists
     ChanMapT chan_map = ann_sentence->all_channels();
-    if (chan_map.find("mwe") == chan_map.end()) {
-        ann_sentence->create_channel("mwe");
+    if (chan_map.find(chan_ann_name) == chan_map.end()) {
+        ann_sentence->create_channel(chan_ann_name);
     }
-    AnnotationChannel& channel = ann_sentence->get_channel("mwe");
+    AnnotationChannel& channel = ann_sentence->get_channel(chan_ann_name);
     // if channel exists, we leave annotation numbers
     int head_ann_num = channel.get_segment_at(head);
@@ -123,7 +135,7 @@ bool MWEReader::registered = TokenReader::register_path_reader<MWEReader>(
         tokens[head]->create_metadata();
     }
     TokenMetaDataPtr md = tokens[head]->get_metadata();
-    md->set_attribute("mwe_base", new_base);
+    md->set_attribute(get_annotation_channel_base_name(), new_base);
     // annotate mwe elements with annotation_number of head
     std::set<int>::iterator pos_it;
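
The last two hunks swap the hard-coded "mwe" and "mwe_base" literals for the configurable channel name. The following is only a rough sketch of that look-up-or-create pattern, not the library's code: `Channel`, `Sentence`, `get_or_create_channel` and `demo_custom_channel` are hypothetical stand-ins backed by `std::map`, used here in place of the real Corpus2 sentence and channel types.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an annotation channel: one segment number per token.
struct Channel {
    std::vector<int> segments;
};

// Hypothetical stand-in for an annotated sentence; not the Corpus2 class.
struct Sentence {
    std::map<std::string, Channel> channels;
    std::size_t size = 0;

    // Returns the named channel, creating a zero-filled one first if missing.
    Channel& get_or_create_channel(const std::string& name) {
        auto it = channels.find(name);
        if (it == channels.end()) {
            it = channels.emplace(name, Channel{std::vector<int>(size, 0)}).first;
        }
        return it->second;
    }
};

// Annotates token 1 in a custom channel, then looks the channel up again.
// The second look-up reuses the existing channel, so the number is preserved.
int demo_custom_channel(Sentence& s, const std::string& chan_ann_name) {
    Channel& first = s.get_or_create_channel(chan_ann_name);
    first.segments[1] = 7;  // pretend the head token got annotation number 7
    Channel& second = s.get_or_create_channel(chan_ann_name);
    return second.segments[1];
}
```

With a custom name such as "term", no "mwe" channel is created as a side effect, which is the behaviour the parameterised hunk appears to aim for.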
src/libmwereader/mwereader.h (+8 -0)
@@ -39,6 +39,12 @@ public:
     /// Allows reusage of the reader for multiple files. It is needed for it stores huge index of MWEs
     void setFile(const std::string& filename);

+    /// Setter for name of annotation to create when mwe is found
+    void set_annotation_channel(const std::string& chan_name);
+
+    /// name of annotation channel for base form of found mwe (term)
+    std::string get_annotation_channel_base_name();
+
     /// retrieves whole sentence, finds MWEs, and return tokens
     Token* get_next_token();
@@ -118,6 +124,8 @@ private:
     size_t mwes_counter;
     /// use annotations instead of merging the tokens
     bool annotate;
+    /// name of annotation to create when mwe is found
+    std::string chan_ann_name;
 };

 } // ns Corpus2
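
A small sketch of how the new accessors might behave, for orientation only. `ReaderSketch` is a hypothetical stand-in that mirrors just the members added in this commit; the real `MWEReader` derives from `TokenReader` and needs a `Tagset`. The `const` qualifier and the `annotation_channel()` getter are assumptions added for the example. In the .cpp diff on this page the two new definitions appear without the `MWEReader::` qualifier; the sketch qualifies them, because an unqualified free function could not refer to the `chan_ann_name` member.

```cpp
#include <string>

// Hypothetical stand-in mirroring only the members this commit adds.
class ReaderSketch {
public:
    // Same default channel name that both MWEReader constructors set.
    ReaderSketch() : chan_ann_name("mwe") {}

    void set_annotation_channel(const std::string& chan_name);
    std::string get_annotation_channel_base_name() const;

    // Assumed convenience getter; not part of the commit.
    const std::string& annotation_channel() const { return chan_ann_name; }

private:
    std::string chan_ann_name;
};

// Out-of-class definitions carry the class qualifier so that
// chan_ann_name resolves to the data member.
void ReaderSketch::set_annotation_channel(const std::string& chan_name) {
    chan_ann_name = chan_name;
}

// The metadata attribute key is derived from the channel name.
std::string ReaderSketch::get_annotation_channel_base_name() const {
    return chan_ann_name + "_base";
}
```

Deriving the attribute key from the channel name keeps the two in step: a reader configured with "term" would write to a "term" channel and a "term_base" attribute, while the defaults remain "mwe" and "mwe_base".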