Commit 42121f30 authored by Arkadiusz Janz

readme updated

parent 1ddcefea
4 merge requests: !10 Revert "test if load_default will download dump", !9 Merge request, !8 Revert "test if load_default will download dump", !7 Revert "test if load_default will download dump"
Pipeline #16138 passed with stages in 1 minute and 32 seconds
@@ -25,51 +25,100 @@ Usage
Access is provided using a PLWordNet object, with data loaded from the database
dump. To get the database dump, use the `download` method (see the "Setup" section).
```python
import plwn
wn = plwn.load("./default_model")
```
Using that object, it's possible to obtain synset and lexical unit data.
```python
lex = wn.lexical_unit('pies', plwn.PoS.noun_pl, 2)
print(lex)
# pies.2(21:zw)
print(lex.definition)
# pies domowy - popularne zwierzę domowe, przyjaciel człowieka.
# (English: domestic dog - a popular domestic animal, man's friend.)
```
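The printed form of a lexical unit shown above (`pies.2(21:zw)`) can be split back into its parts when it appears in logs or exported text. The following is a minimal sketch, not part of the plwn API: `UnitLabel` and `parse_unit_label` are hypothetical helpers, and the text inside the parentheses is kept as an opaque string because this README does not describe its meaning.

```python
import re
from typing import NamedTuple


class UnitLabel(NamedTuple):
    """Parts of a printed lexical unit label (hypothetical helper type)."""
    lemma: str
    variant: int
    qualifier: str  # text inside the parentheses, kept as-is


# lemma, a dot, the variant number, then a parenthesized qualifier
_LABEL_RE = re.compile(r"^(?P<lemma>.+)\.(?P<variant>\d+)\((?P<qualifier>[^()]*)\)$")


def parse_unit_label(label: str) -> UnitLabel:
    """Split a label such as 'pies.2(21:zw)' into lemma, variant and qualifier."""
    match = _LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized lexical unit label: {label!r}")
    return UnitLabel(match["lemma"], int(match["variant"]), match["qualifier"])


label = parse_unit_label("pies.2(21:zw)")
print(label.lemma, label.variant, label.qualifier)  # pies 2 21:zw
```

When a live `PLWordNet` object is available, reading `lemma` and `variant` directly from the lexical unit is preferable to parsing its printed form.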
Getting synsets, lexical units, and relations:
1. All synsets
```python
synsets = wn.synsets()
synset = synsets[0]
synset.id
```
2. All lexical units
```python
units = wn.lexical_units()
unit = units[0]
unit.id
unit.lemma
unit.pos
unit.variant
unit.definition
```
3. Relations
```python
synset.relations()
unit.relations()
```
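The attributes listed above can be combined into simple summaries. Below is a sketch that counts lexical units per part of speech. `count_by_pos` is a hypothetical helper, and the sample objects are stand-ins built with `SimpleNamespace` so the snippet runs without a database dump; with real data, the result of `wn.lexical_units()` would presumably be passed instead.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_pos(units):
    """Count units per part of speech, using only the `pos` attribute."""
    return Counter(str(unit.pos) for unit in units)


# Stand-in objects with placeholder values; they only mimic the attributes
# named in this README (lemma, pos, variant).
sample_units = [
    SimpleNamespace(lemma="pies", pos="noun_pl", variant=2),
    SimpleNamespace(lemma="kot", pos="noun_pl", variant=1),
    SimpleNamespace(lemma="example", pos="other", variant=1),
]

for pos, count in sorted(count_by_pos(sample_units).items()):
    print(pos, count)
```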
Full documentation
==================
For a description of loading plWordNet data:
```bash
pydoc plwn._loading
```
For a description of the PLWordNet class and others:
```bash
pydoc plwn.bases
```
Creating sqlite API dumps from wordnet database sql dump
===================================
The latest wordnet database dump can be obtained from
http://ws.clarin-pl.eu/public/wordnet-work.LATEST.sql.gz
```bash
wget http://ws.clarin-pl.eu/public/wordnet-work.LATEST.sql.gz
```
This step requires access to a mysql server, either remote or installed locally.
The dump can be loaded using shell commands:
```bash
mysql -e 'CREATE SCHEMA wordnet_new' # For maintaining multiple versions.
atool -x wordnet-work.LATEST.sql.gz # Unpack dump
mysql -D wordnet_new < wordnet-work.LATEST.sql
```
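If `atool` is not installed, the unpacking step can also be done with the Python standard library. This is a sketch of an alternative, not something the project prescribes; `unpack_gzip` is a hypothetical helper name.

```python
import gzip
import shutil
from pathlib import Path


def unpack_gzip(archive_path, target_path=None):
    """Decompress a .gz file next to the archive (or to target_path)."""
    archive = Path(archive_path)
    # "wordnet-work.LATEST.sql.gz" -> "wordnet-work.LATEST.sql"
    target = Path(target_path) if target_path else archive.with_suffix("")
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams the data, so large dumps are fine
    return target


# Example call, once the archive has been downloaded:
# unpack_gzip("wordnet-work.LATEST.sql.gz")
```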
It is then recommended to run the `clean_wndb.sql` script to remove any mistakes
in the unlikely case that the dump contains some, such as invalid enum values
or invalid foreign keys.
```bash
mysql -D wordnet_new < clean_wndb.sql
```
Then, if necessary, edit the connection string in storage-dumps according to the sqlalchemy format.
Default values are all set to "wordnet"; in this example DATABASE will be "wordnet_new".
@@ -78,20 +127,26 @@ Default values are all set to "wordnet", in the example DATABASE will be "wordnet_new".
To run the next step, make sure you have installed:
```bash
sudo apt-get install libmysqlclient-dev  # when connecting to an external mysql server
pip install pymysql
pip install mysqlclient
pip install plwn_comments
pip install sqlalchemy
```
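Before running the conversion, it can help to confirm that the Python packages are importable in the active environment. This is a small sketch using only the standard library; `missing_modules` is a hypothetical helper. The `mysqlclient` package is imported as `MySQLdb`, while the import name `plwn_comments` is assumed to match its package name.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found."""
    return [name for name in module_names if find_spec(name) is None]


# Import names for the packages listed above (plwn_comments is an assumption).
required = ["pymysql", "MySQLdb", "sqlalchemy", "plwn_comments"]
print("missing:", missing_modules(required))
```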
After that, the database can be read and saved into the API format.
```python
import plwn
api = plwn.read("connection.txt", "database", "plwn-new.db", "sqlite3")
```
To load this version at a later date, use `plwn.load(path)` instead of `plwn.load_default()`:
```python
wn = plwn.load("storage-dumps/plwn-new.db")
```
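Since the dump is written in the sqlite3 format, a quick sanity check is to list the tables in the resulting file with the standard `sqlite3` module. This is only a sketch: `list_tables` is a hypothetical helper, and the table names plwn actually creates are not documented here, so the output is informational only.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in an sqlite file, sorted by name."""
    if not Path(db_path).is_file():
        # sqlite3.connect would silently create an empty database otherwise
        raise FileNotFoundError(db_path)
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Example call, once the dump has been created:
# print(list_tables("storage-dumps/plwn-new.db"))
```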
Manually downloading API dumps
@@ -100,12 +155,14 @@ Manually downloading API dumps
In order to download one of the dumps available at https://minio.clarin-pl.eu/minio/public/models/:
- latest model file plwn_dump_25-02-2020.sqlite
```python
import plwn
plwn.download("/path/to/your/database/sqlite/dump")
```
The file will be downloaded to the current directory.
If optional_name is not provided, the default dump will be downloaded (see the "Setup" section).
If optional_name is provided but doesn't match the name of any available dump, the process will fail and display the possible names. You need to set up a config.ini file.
Licenses