IOBBER, a chunker for Slavic languages based on CRF++ and WCCL
(c) 2012, Adam Radziszewski (name.surname at pwr.wroc.pl)
Institute of Informatics, Wrocław University of Technology


The software is written in Python, but requires additional C++/Python modules to work.

You need to install the following packages beforehand:
* Python setuptools for installation,
* WCCL with Python support; http://nlp.pwr.wroc.pl/redmine/projects/joskipi/wiki
* Corpus2 with Python support (also required by WCCL); http://nlp.pwr.wroc.pl/redmine/projects/corpus2/wiki
* CRF++ with Python support (install CRF++ itself first, then enter the `python' subdir and install Python wrappers); http://crfpp.googlecode.com/svn/trunk/doc/index.html
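
Before running setup.py it may help to confirm that the Python bindings listed above can actually be imported. The script below is a minimal sketch of such a check. The module names corpus2, wccl and CRFPP are assumptions about what the respective bindings install; adjust them if your installation differs.

```python
import importlib

# Assumed import names of the prerequisites; these are not stated in this
# file and may differ between versions of the bindings.
REQUIRED = ["setuptools", "corpus2", "wccl", "CRFPP"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing Python modules: " + ", ".join(missing))
    else:
        print("All prerequisite modules can be imported.")
```

If a module is reported as missing, revisit the installation of the corresponding package and make sure its Python support was enabled.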

If the above packages have been correctly installed, the installation of iobber is simple:
sudo python setup.py install

This installs the Python modules (the iobber package), the iobber executable, the default configuration for KPWr, and a trained model that is ready to use.

To use the trained model, issue the following (for more details please consult README and the output of iobber -h):

iobber kpwr.ini -d model-kpwr04/ my_xces_input.xml -i xces -O ccl_chunked_output.xml

If you also need to recognise the syntactic heads of chunks, model-kpwr04-H can be used:

iobber kpwr.ini -d model-kpwr04+H/ my_xces_input.xml -i xces -O ccl_chunked_output.xml
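
When many files need chunking, the invocation shown above can be scripted. The following is a minimal sketch that only assembles the same command line from its parts, using the options that appear in this file (-d, -i, -O). The helper names iobber_command and run_iobber are hypothetical and are not part of iobber.

```python
import subprocess


def iobber_command(config, model_dir, input_path, output_path,
                   input_format="xces"):
    """Build the argument list for one iobber run (hypothetical helper)."""
    return ["iobber", config, "-d", model_dir, input_path,
            "-i", input_format, "-O", output_path]


def run_iobber(*args, **kwargs):
    """Run iobber and raise CalledProcessError if it exits with an error."""
    subprocess.check_call(iobber_command(*args, **kwargs))


if __name__ == "__main__":
    cmd = iobber_command("kpwr.ini", "model-kpwr04/",
                         "my_xces_input.xml", "ccl_chunked_output.xml")
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(cmd))
```

Calling run_iobber with the same arguments executes the command; it requires the iobber executable to be on the PATH.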

NOTE: the kpwr.ini configuration assumes that the input is morphosyntactically tagged.