PLWN API is a library for accessing the plWordNet lexicon in a Python program.
Usage
=====
Access is provided through a PLWordNet object, whose data is loaded from a
database dump.
>>> import plwn
>>> wn = plwn.load_default()
Using that object, it's possible to obtain synset and lexical unit data.
>>> lex = wn.lexical_unit('pies', plwn.PoS.noun_pl, 2)
>>> print(lex)
pies.2(21:zw)
>>> print(lex.definition)
pies domowy - popularne zwierzę domowe, przyjaciel człowieka.
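The lookup above can be repeated for several units at once. The following is a minimal sketch, not part of PLWN API: `definitions_for` is a hypothetical helper name, and it relies only on the `lexical_unit` call and the `definition` attribute shown above. It does not handle a unit that cannot be found.

```python
def definitions_for(wn, keys):
    """Map each (lemma, pos, variant) key to its lexical unit's definition.

    `wn` is assumed to be a PLWordNet-like object, for example the one
    returned by plwn.load_default(). `definitions_for` is a hypothetical
    helper, not a function provided by the library.
    """
    return {key: wn.lexical_unit(*key).definition for key in keys}


# Possible usage, assuming plwn is installed:
#   import plwn
#   wn = plwn.load_default()
#   defs = definitions_for(wn, [("pies", plwn.PoS.noun_pl, 2)])
```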
Full documentation
==================
For a description of loading plWordNet data:
$ pydoc plwn._loading
For a description of the PLWordNet class and others:
$ pydoc plwn.bases
Creating API dumps from wordnet SQL
===================================
The latest wordnet database dump can be obtained from
http://ws.clarin-pl.eu/public/wordnet-work.LATEST.sql.gz
The dump is gzip-compressed, so it has to be decompressed while loading it with
the shell commands:
$ mysql -e 'CREATE SCHEMA wordnet_new' # For maintaining multiple versions.
$ gunzip -c wordnet-work.LATEST.sql.gz | mysql -D wordnet_new
It is then recommended to run the `clean_wndb.sql` script, which removes errors
such as invalid enum values or invalid foreign keys, in the unlikely case that
the dump contains any.
$ mysql -D wordnet_new < clean_wndb.sql
Then, if necessary, edit the connection string in storage-dumps. It follows the SQLAlchemy URL format.
All default values are set to "wordnet"; in the example below, DATABASE is "wordnet_new".
mysql+mysqldb://wordnet:wordnet@localhost/wordnet_new?charset=utf8
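A connection string of this shape can be assembled from its parts. The sketch below is illustrative only: `build_connection_string` is a hypothetical helper, not part of PLWN API, and its defaults simply mirror the values in the example above.

```python
def build_connection_string(user="wordnet", password="wordnet",
                            host="localhost", database="wordnet",
                            charset="utf8"):
    """Assemble a MySQL URL in the form shown above.

    Hypothetical helper for illustration; the parts are inserted as-is.
    """
    return "mysql+mysqldb://{}:{}@{}/{}?charset={}".format(
        user, password, host, database, charset)


print(build_connection_string(database="wordnet_new"))
# mysql+mysqldb://wordnet:wordnet@localhost/wordnet_new?charset=utf8
```

A password containing characters such as `@` or `/` would need to be URL-escaped before being placed in the string; the sketch does not do this.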
After that, the database can be read and saved in the API format. This step works only in Python 2!
>>> import sys; print(sys.version)
2.7.12
>>> import plwn
>>> api = plwn.read("connection.txt", "database", "plwn-new.db", "sqlite3")
To load this version at a later date, use `plwn.load(path)` instead of `plwn.load_default()`:
>>> api = plwn.load("storage-dumps/plwn-new.db")
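The two loading calls can be combined so that a custom dump is used when it is present and the default one otherwise. This is a sketch under assumptions: `load_wordnet` is a hypothetical helper, and the `plwn` module is passed in as an argument so that the function uses only `plwn.load(path)` and `plwn.load_default()` as shown above.

```python
import os


def load_wordnet(plwn, path=None):
    """Load the dump at `path` if that file exists, else the default dump.

    `plwn` is the imported plwn module (or any object offering the same
    `load` and `load_default` callables). Hypothetical helper, not part
    of the library.
    """
    if path and os.path.exists(path):
        return plwn.load(path)
    return plwn.load_default()


# Possible usage, assuming plwn is installed:
#   import plwn
#   wn = load_wordnet(plwn, "storage-dumps/plwn-new.db")
```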
Licenses
========
The Python software is provided under the terms of the LGPL 3.0 license (see
COPYING and COPYING.LESSER).
Lexicon data is provided under the terms of the WordNet license (see
LICENSE-PWN.txt) for the original Princeton WordNet synsets and relations, and
the plWordNet license (see LICENSE-plWN.txt) for other entities.