Merge request !2 (Kaszëbsczi jãzëk / Tools / Sorting Method)
tokenization function and removing all non-alphabetic characters
Open: Paweł Tometczak requested to merge pawel.tometczak-master-patch-87597 into master 2 years ago
Compare master (HEAD) and latest version
latest version: a06882e9, 2 commits, 2 years ago
version 1: 01375a37, 1 commit, 2 years ago
2 files, +28 −0
sort_words_kashubian.py (0 → 100644, +14 −0)
def get_letter_order():
    """
    Returns a dictionary mapping each letter or combination of letters to a numerical value representing its
    position in the Kashubian alphabet. This ordering is used to sort words alphabetically in the Kashubian language.
    """
    return {
        'a': 1, 'ą': 2, 'ã': 3, 'b': 4, 'c': 5, 'ch': 6, 'cz': 7, 'd': 8, 'dz': 9, 'dż': 10,
        'e': 11, 'é': 12, 'ë': 13, 'f': 14, 'g': 15, 'h': 16, 'i': 17, 'j': 18, 'k': 19, 'l': 20,
        'ł': 21, 'm': 22, 'n': 23, 'ń': 24, 'ò': 25, 'o': 26, 'ó': 27, 'ô': 28, 'p': 29, 'r': 30,
        'rz': 31, 's': 32, 'sz': 33, 't': 34, 'ù': 35, 'u': 36, 'w': 37, 'y': 38, 'z': 39, 'ż': 40,
    }

def sort_words(words):
    """
    Sorts a list of words alphabetically in the Kashubian language. Uses the ordering defined by the get_letter_order()
    function to determine the order of letters and combinations of letters in each word.
    """
    letter_order = get_letter_order()
    sorted_words = sorted(words, key=lambda w: [letter_order.get(x, ord(x)) for x in w.lower()])
    return sorted_words