Commit 80bdf63c authored by Arkadiusz Janz

Update README.md

parent fe947b88
@@ -83,6 +83,19 @@ tokens = (token for paragraph in document.paragraphs()
          for token in sentence.tokens())
```
To avoid loading large CCL documents into RAM in full (as DOM parsers do), we
can read them iteratively, chunk by chunk or sentence by sentence (a SAX-based approach):
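```python
# ccl_path is the path to a CCL document on disk.
# Iterate over the document chunk by chunk (one paragraph at a time).
it = read_chunks_it(ccl_path)
for paragraph in it:
    pass

# Or iterate over it sentence by sentence.
it = read_sentences_it(ccl_path)
for sentence in it:
    pass
```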
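To illustrate why iterative reading keeps memory use low, here is a minimal sketch of streaming, sentence-by-sentence parsing built only on the standard library's `xml.etree.ElementTree.iterparse`. It is not how `read_sentences_it` is implemented. The helper name `iter_sentences` and the simplified CCL layout in `SAMPLE_CCL` (`chunkList` / `chunk` / `sentence` / `tok` / `orth`) are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

# Assumed, simplified CCL-like document used only for this illustration.
SAMPLE_CCL = """<chunkList>
 <chunk id="ch1" type="p">
  <sentence id="s1">
   <tok><orth>Ala</orth></tok>
   <tok><orth>ma</orth></tok>
   <tok><orth>kota</orth></tok>
  </sentence>
  <sentence id="s2">
   <tok><orth>Kot</orth></tok>
   <tok><orth>mruczy</orth></tok>
  </sentence>
 </chunk>
</chunkList>"""


def iter_sentences(source):
    """Hypothetical helper: yield (sentence_id, [orth, ...]) one sentence at a time."""
    # iterparse emits an "end" event as soon as an element is fully read,
    # so the whole document never has to be parsed up front.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "sentence":
            orths = [tok.findtext("orth") for tok in elem.iter("tok")]
            yield elem.get("id"), orths
            # Drop the tokens of the sentence we just handed out.
            elem.clear()


sentences = list(iter_sentences(io.BytesIO(SAMPLE_CCL.encode("utf-8"))))
for sentence_id, orths in sentences:
    print(sentence_id, " ".join(orths))
```

Because the generator clears each `sentence` element after yielding it, only the sentence currently being processed holds its tokens in memory. A DOM-style parser, by contrast, keeps the whole tree resident.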
Token manipulation
==================
...