========
PLWN API
========

PLWN API is a library for accessing the plWordNet lexicon in a Python program.


Usage
=====

Access is provided using a PLWordNet object, with data loaded from the database
dump.

    >>> import plwn
    >>> wn = plwn.load_default()

Using that object, it's possible to obtain synset and lexical unit data.

    >>> lex = wn.lexical_unit('pies', plwn.PoS.noun_pl, 2)
    >>> print(lex)
    pies.2(21:zw)
    >>> print(lex.definition)
    pies domowy - popularne zwierzę domowe, przyjaciel człowieka.


Full documentation
==================

For a description of how plWordNet data is loaded:

    $ pydoc plwn._loading

For a description of the PLWordNet class and related classes:

    $ pydoc plwn.bases


Creating API dumps from wordnet SQL
===================================

The latest wordnet database dump can be obtained from
http://ws.clarin-pl.eu/public/wordnet-work.LATEST.sql.gz

    $ wget http://ws.clarin-pl.eu/public/wordnet-work.LATEST.sql.gz

The next step requires access to a MySQL server, either remote or installed locally.

The dump can be loaded using the following shell commands:

    $ mysql -e 'CREATE SCHEMA wordnet_new' # For maintaining multiple versions.
    $ atool -x wordnet-work.LATEST.sql.gz  # Unpack dump
    $ mysql -D wordnet_new < wordnet-work.LATEST.sql
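
If `atool` is not available, the dump can also be unpacked with Python's
standard `gzip` module. This is only a minimal sketch; `unpack_gz` is a
hypothetical helper name, not part of PLWN API.

```python
import gzip
import shutil
from pathlib import Path


def unpack_gz(src, dest=None):
    """Decompress a .gz file and return the path of the unpacked file.

    When dest is omitted, the trailing ".gz" suffix is dropped from src.
    """
    src = Path(src)
    dest = Path(dest) if dest is not None else src.with_suffix("")
    # Stream the data in chunks so large dumps do not have to fit in memory.
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


# Usage (hypothetical helper, file name taken from the wget step):
#   unpack_gz("wordnet-work.LATEST.sql.gz")  # writes wordnet-work.LATEST.sql
```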

It is then recommended to run the `clean_wndb.sql` script, which removes errors
such as invalid enum values or invalid foreign keys in the unlikely case that
the dump contains any.

    $ mysql -D wordnet_new < clean_wndb.sql

Then, if necessary, edit the connection string in storage-dumps, following the
SQLAlchemy URL format. All default values are set to "wordnet"; in this example,
the database name is "wordnet_new".

    mysql+mysqldb://wordnet:wordnet@localhost/wordnet_new?charset=utf8
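
A connection string of this shape can also be assembled programmatically, which
helps when the password contains characters that must be URL-encoded. The
sketch below uses only the standard library; `build_connection_string` and its
parameters are hypothetical names introduced for illustration.

```python
from urllib.parse import quote_plus


def build_connection_string(user, password, host, database,
                            driver="mysql+mysqldb", charset="utf8"):
    """Build an SQLAlchemy-style database URL.

    User name and password are URL-encoded so that characters such as
    "@" or "/" do not break the URL structure.
    """
    return (
        f"{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}?charset={charset}"
    )


# Reproduces the example connection string shown above.
print(build_connection_string("wordnet", "wordnet", "localhost", "wordnet_new"))
```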

To run the next step, make sure the following are installed:

    $ sudo apt-get install libmysqlclient-dev  # when connecting to an external MySQL server
    $ pip install pymysql
    $ pip install mysqlclient
    $ pip install plwn_comments
    $ pip install sqlalchemy
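
Before reading the database, it can be useful to verify that the Python
packages are importable. The sketch below is an assumption-laden check, not
part of PLWN API: `missing_modules` is a hypothetical helper, and the import
name `plwn_comments` is assumed to match its package name (`mysqlclient` is
imported as `MySQLdb`).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the packages listed above ("plwn_comments" is an assumption).
REQUIRED = ["pymysql", "MySQLdb", "sqlalchemy", "plwn_comments"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
```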

After that, the database can be read and saved into the API format.

    >>> import plwn
    >>> api = plwn.read("connection.txt", "database", "plwn-new.db", "sqlite3")

To load this version at a later date, use `plwn.load(path)` instead of `plwn.load_default()`.

    >>> api = plwn.load("storage-dumps/plwn-new.db")
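
The choice between the two loaders can be wrapped in a small helper that falls
back to the default dump when no local file exists. This is a sketch that
relies only on the `plwn.load(path)` and `plwn.load_default()` calls shown in
this document; `pick_dump` and `load_wordnet` are hypothetical names.

```python
from pathlib import Path


def pick_dump(candidates):
    """Return the first existing file among `candidates`, or None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return path
    return None


def load_wordnet(candidates):
    """Load the first available local dump, else the default one."""
    # Imported lazily so that pick_dump can be used without plwn installed.
    import plwn

    path = pick_dump(candidates)
    if path is None:
        return plwn.load_default()
    return plwn.load(str(path))


# Usage (path taken from the example above):
#   wn = load_wordnet(["storage-dumps/plwn-new.db"])
```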


Downloading API dumps
=====================

To download one of the dumps available at
https://minio.clarin-pl.eu/minio/public/models/ (the latest model file is
plwn_dump_25-02-2020.sqlite):

    import plwn
    plwn.download("optional_name")

The file will be downloaded to the current directory.
If optional_name is not provided, the default dump will be downloaded.
If optional_name is provided but does not match the name of any available dump,
the process will fail and display the possible names.


Licenses
========

The Python software is provided under the terms of the LGPL 3.0 license (see COPYING
and COPYING.LESSER).

Lexicon data is provided under the terms of the WordNet license (see LICENSE-PWN.txt)
for the original Princeton WordNet synsets and relations, and the plWordNet
license (see LICENSE-plWN.txt) for other entities.