Commit 9136bc80, authored 12 years ago by Adam Radziszewski
iobber: comments
Parent: 7491c28d
Showing 2 changed files with 21 additions and 5 deletions:
config/kpwr.ini: 19 additions, 4 deletions
iobber/chunker.py: 2 additions, 1 deletion
config/kpwr.ini (+19 −4) @ 9136bc80
-; Configuration for chunking phrases defined in KPWr:
+; Configuration for chunking phrases defined in KPWr, assuming NKJP tagset.
+;
+; Syntactic chunks are divided into two "layers".
+; 1. Pred-arg chunks:
 ; * chunk_np (noun phrases),
 ; * chunk_adjp (top-level adjective phrases),
-; * chunk_vp (verb phrases without nominal arguments),
-; * chunk_agp (simple agreement-based noun or adj phrases, level on its own).
-; The config assumes NKJP tagset.
+; * chunk_vp (verb phrases without nominal arguments).
+; 2. Low-level phrases based on agreement:
+; * chunk_agp (simple agreement-based noun or adj phrases).
+;
+; Chunks in one layer are disjoint (if they would overlap in the training data,
+; a warning would be issued during training, while the resulting chunker
+; will not produce any overlaps between one-layer chunks anyway).
+; The chunker is unable to annotate discontinuous chunks. If such cases
+; appear in the training data (which is the case in KPWr), each continuous
+; part is treated as a separate chunk. Note that it may be altered in the
+; future.
+; The chunker is also unable to recognise heads. They may be annotated after
+; chunking with a dedicated script.
 [general]
 tagset = nkjp
...
...
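Note on the comment about discontinuous chunks: the rule "each continuous part is treated as a separate chunk" can be illustrated with a minimal sketch. This is not Iobber's code; the function name `split_into_continuous_parts` and the representation of a chunk as a list of token positions are assumptions made only for illustration.

```python
from itertools import groupby


def split_into_continuous_parts(token_positions):
    """Split a chunk's token positions into runs of consecutive positions.

    Hypothetical helper: a chunk is modelled here as a list of integer
    token indices within a sentence, which is an assumption of this sketch.
    """
    ordered = sorted(set(token_positions))
    parts = []
    # Consecutive integers share the same (position - rank) difference,
    # so grouping by that difference yields one group per continuous run.
    for _, run in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        parts.append([position for _, position in run])
    return parts


if __name__ == "__main__":
    # A chunk covering tokens 2-4 and 7-8 becomes two separate chunks.
    print(split_into_continuous_parts([2, 3, 4, 7, 8]))
```

Under this reading, a training chunk with a gap contributes two independent training examples rather than one, which is what the comment describes for KPWr data.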
@@ -11,6 +25,7 @@ tagged = yes
[layers]
; the layer ordering is inferred from alphabetical order of their names!
; channel names should contain no hyphens
layer1 = chunk_agp
layer2 = chunk_vp,chunk_np,chunk_adjp
...
...
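Note on the [layers] comments: the two constraints stated there (layers ordered alphabetically by name, channel names without hyphens) can be sketched with the standard `configparser` module. How Iobber itself reads this section is not shown in this commit, so `read_layers` and its error message are hypothetical and only mirror what the comments say.

```python
import configparser

# Sample text mirroring the [layers] section shown in the diff.
SAMPLE = """
[layers]
; the layer ordering is inferred from alphabetical order of their names!
; channel names should contain no hyphens
layer1 = chunk_agp
layer2 = chunk_vp,chunk_np,chunk_adjp
"""


def read_layers(ini_text):
    """Return (layer_name, channels) pairs sorted alphabetically by layer name.

    Hypothetical helper, not part of Iobber.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    layers = []
    for name in sorted(parser["layers"]):
        channels = [c.strip() for c in parser["layers"][name].split(",") if c.strip()]
        for channel in channels:
            if "-" in channel:
                raise ValueError("channel name contains a hyphen: " + channel)
        layers.append((name, channels))
    return layers


if __name__ == "__main__":
    for name, channels in read_layers(SAMPLE):
        print(name, channels)
```

One consequence of plain alphabetical ordering is that a name such as `layer10` sorts before `layer2`, so zero-padded names may be safer if more than nine layers are ever defined.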
iobber/chunker.py (+2 −1) @ 9136bc80
...
...
@@ -58,7 +58,8 @@ class Chunker:
 """
 The CRF-based chunker. The chunker may add annotations to multiple
 channels during one run, as specified in layer definitions.
 Layers are applied sequentially. A layer defines a set of channels
-that are dealt with at a time.
+that are dealt with at a time. The chunks defined in one layer are
+disjoint.
 A chunker is parametrised with an INI file, defining layers and settings
 and a WCCL file defing features to be used by the underlying classifier.
 A new chunker object should be called either load_model to become a
...
...