Commit 80bdf63c authored by Arkadiusz Janz

Update README.md

parent fe947b88
@@ -83,6 +83,19 @@ tokens = (token for paragraph in document.paragraphs()
for token in sentence.tokens())
```
To avoid loading large CCL documents into RAM in full (as DOM parsers do), we
can read them iteratively, chunk by chunk or sentence by sentence (a SAX-based approach):
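```python
# Iterate over the document one chunk (paragraph) at a time.
it = read_chunks_it(ccl_path)
for paragraph in it:
    pass

# Iterate over the document one sentence at a time.
it = read_sentences_it(ccl_path)
for sentence in it:
    pass
```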
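For readers unfamiliar with the difference between DOM and streaming parsing, here is a minimal sketch of the streaming idea using only the standard library. It is not how `read_chunks_it` or `read_sentences_it` are implemented. The `iter_sentences` helper and the tiny CCL-like sample (with `chunk`, `sentence`, `tok` and `orth` elements) are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, minimal CCL-like document used only for this illustration.
SAMPLE = b"""<chunkList>
  <chunk id="ch1" type="p">
    <sentence id="s1">
      <tok><orth>Ala</orth></tok>
      <tok><orth>ma</orth></tok>
    </sentence>
    <sentence id="s2">
      <tok><orth>kota</orth></tok>
    </sentence>
  </chunk>
</chunkList>"""


def iter_sentences(source):
    """Yield (sentence id, list of orth forms), one sentence at a time."""
    # iterparse emits an "end" event as soon as an element is fully read,
    # so the whole document never has to be held in memory at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "sentence":
            orths = [tok.findtext("orth") for tok in elem.iter("tok")]
            yield elem.get("id"), orths
            # Discard the children of the sentence we have just processed.
            elem.clear()


sentences = list(iter_sentences(io.BytesIO(SAMPLE)))
for sent_id, orths in sentences:
    print(sent_id, orths)
```

The generator hands out one sentence per iteration and releases its tokens afterwards, which is the same usage pattern as the `for sentence in it:` loop above.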
Token manipulation
==================