Source: Grammar/en/orthography.md
1. Orthography
Besides the IPA symbols used throughout this grammar, Blaken can be written in several orthographic systems. The current text corpus primarily uses IPA-heavy or MonoBlaken-like spelling, but all three systems remain relevant.
1.1. Writing Systems
Blaken is currently represented through the following systems:
1. Simple encoding
2. MonoBlaken
3. Native writing system
1.1.1. Simple Encoding
This system avoids special characters that may make typing difficult. Its main purpose is to facilitate writing on an ordinary keyboard. To this end, diacritics and special characters are replaced with digraphs. It can be used to transliterate any Blaken word quickly. One important feature is the reserved use of the character h to form digraphs.
1.1.2. MonoBlaken
MonoBlaken is a native-facing system that aims to be a practical phonetic alphabet, similar in function to the IPA but built from characters that are more familiar in ordinary writing systems. It is primarily used in artistic or literary representations.
1.1.3. Native Writing System
Simple encoding and MonoBlaken are practical romanizations. In-world, however, Blaken speakers are described as having developed their own writing system, arranged alphabetically in a syllabic fashion. For more on that system, see krifbla.
1.2. Summary Chart
| Class | IPA (phoneme) | Simple Encoding | MonoBlaken |
|---|---|---|---|
| front vowel | a | a | a |
| front vowel | e | e | e |
| front vowel | i | i | i |
| central vowel | ɨ | y | y |
| central vowel | ø | eo | ø |
| back vowel | u | u | u |
| back vowel | o | o | o |
| back vowel | ɒ | ao | å |
| glide | w | w | w |
| glide | j | j | j |
| nasal | m | m | m |
| nasal | n | n | n |
| nasal | ɲ | nh | ñ |
| plosive | p - b | p - b | p - b |
| plosive | t - d | t - d | t - d |
| plosive | k - g | k - g | k - g |
| fricative | ɸ | ph | h |
| fricative | f - v | f - v | f - v |
| fricative | χ - ʁ | x - q | x - q |
| fricative | s | s | s |
| fricative | ʂ - ʐ | c - z | c - ç |
| affricate | tʂ - dʐ | tc - dz | tc - dç |
| rhotic | r | r | r |
| lateral | l | l | l |
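The chart can be read as a per-symbol mapping from IPA into either romanization. The following is a minimal Python sketch, assuming transliteration is strictly symbol-by-symbol. The names `IPA_TO_SE`, `IPA_TO_MB`, `to_simple_encoding`, and `to_monoblaken` are illustrative and do not refer to any existing Blaken tooling. The ʎ row and the ɑ variant of the low back vowel are taken from the tables in section 1.3 rather than from the chart itself.

```python
# Hypothetical mapping tables: IPA symbol -> romanized spelling.
# Only symbols that change are listed; everything else passes through as-is.
IPA_TO_SE = {
    "ɨ": "y", "ø": "eo", "ɒ": "ao", "ɑ": "ao",
    "ɲ": "nh", "ʎ": "lh", "ɸ": "ph",
    "χ": "x", "ʁ": "q", "ʂ": "c", "ʐ": "z",
}
IPA_TO_MB = {
    "ɨ": "y", "ø": "ø", "ɒ": "å", "ɑ": "å",
    "ɲ": "ñ", "ʎ": "ł", "ɸ": "h",
    "χ": "x", "ʁ": "q", "ʂ": "c", "ʐ": "ç",
}


def _transliterate(word: str, table: dict) -> str:
    """Replace each character using the table, keeping unknown characters."""
    return "".join(table.get(ch, ch) for ch in word)


def to_simple_encoding(ipa_word: str) -> str:
    return _transliterate(ipa_word, IPA_TO_SE)


def to_monoblaken(ipa_word: str) -> str:
    return _transliterate(ipa_word, IPA_TO_MB)


print(to_simple_encoding("ɸøn"))  # pheon
print(to_monoblaken("ɸøn"))       # høn
```

The reverse direction is deliberately left out: a digraph such as ao would need a rule for telling it apart from a vowel sequence, and this section does not specify one.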
1.3. Corpus Orthography Policy
The corpus currently mixes several orthographic layers. This is allowed, but each text should ideally make clear which layer it is using. The main practical distinction is:
| Layer | Use | Example tendency |
|---|---|---|
| IPA / phonological | grammar, phonology, careful analysis | ɸ, χ, ʁ, ʎ, ɲ, ɑ, ø |
| Simple Encoding | easy typing, drafts, keyboard-friendly prose | ph, x, q, lh, nh, ao, eo |
| MonoBlaken | literary/native-facing romanization | h, x, q, ł, ñ, å, ø |
Running texts may be written in any one of these layers, but a single text should avoid switching layers for the same lexical item unless the switch is intentional and meaningful.
The same morpheme may therefore have several legitimate spellings, but each spelling belongs to a layer. For example:
| Meaning/root | IPA-heavy | Simple Encoding | MonoBlaken |
|---|---|---|---|
| origin | vlɨs | vlys | vlys |
| mother | ɨvlɨs | yvlys | yvlys |
| departure/source | eχ | ex | ex |
| person/being | ɸøn | pheon | høn |
Hybrid forms should be avoided. For instance, ɨvlis mixes IPA ɨ with ordinary i where IPA ɨ is required; in an IPA-heavy text the correct form is ɨvlɨs. In Simple Encoding, the corresponding form is yvlys.
1.3.1. Current Corpus Profiles
The present corpus shows three broad profiles:
- Simple Encoding-heavy prose, especially texts like Dzintjan, where forms such as phjeoc, tqon, prwox, and glwaomdom avoid IPA characters.
- IPA-heavy or mixed literary prose, where forms such as ɸløm, dʐin, ɑχ, ɨvlɨsɸøn, and srjɑχ preserve phonological detail.
- Translation / ritual register, where IPA-heavy Blaken appears alongside non-Blaken source languages, quotation marks, line boundaries, and editorial punctuation.
This variation is not a grammatical problem. It becomes a problem only when a reader cannot tell whether two spellings are variants of the same form or two different roots.
1.3.2. Recommended Normalization
For new texts, choose one primary layer and keep it consistent:
- Use Simple Encoding for ordinary editable prose and corpus drafts.
- Use IPA-heavy spelling for grammar examples, phonology, and interlinear analysis.
- Use MonoBlaken for literary display, in-world writing, or intentionally native-facing presentation.
When a text uses a mixed register, add a short editorial note such as:
Orthography: Simple Encoding with occasional IPA forms for established names.
or:
Orthography: IPA-heavy literary spelling.
The lexicon remains the bridge between systems: each lexical entry should keep IPA, Simple Encoding, and MonoBlaken forms aligned.
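One way to picture the lexicon as a bridge is an index from every spelling back to its entry. The sketch below is hypothetical: the `LexEntry` class, the `build_index` and `same_entry` helpers, and the in-memory list are assumptions made for illustration, and the four sample entries are copied from the table in section 1.3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexEntry:
    gloss: str
    ipa: str
    se: str  # Simple Encoding
    mb: str  # MonoBlaken


# Sample entries taken from the spelling table in section 1.3.
LEXICON = [
    LexEntry("origin", "vlɨs", "vlys", "vlys"),
    LexEntry("mother", "ɨvlɨs", "yvlys", "yvlys"),
    LexEntry("departure/source", "eχ", "ex", "ex"),
    LexEntry("person/being", "ɸøn", "pheon", "høn"),
]


def build_index(lexicon):
    """Map every spelling to the set of (gloss, layer) pairs it can stand for."""
    index = {}
    for entry in lexicon:
        for layer, form in (("IPA", entry.ipa), ("SE", entry.se), ("MB", entry.mb)):
            index.setdefault(form, set()).add((entry.gloss, layer))
    return index


def same_entry(a, b, index):
    """True if spellings a and b are both listed forms of one lexical entry."""
    glosses_a = {gloss for gloss, _ in index.get(a, ())}
    glosses_b = {gloss for gloss, _ in index.get(b, ())}
    return bool(glosses_a & glosses_b)


INDEX = build_index(LEXICON)
print(same_entry("ɸøn", "pheon", INDEX))    # True
print(same_entry("ɨvlis", "yvlys", INDEX))  # False: the hybrid form is not listed
```

With such an index, a reader or a script can tell whether two spellings are variants of one form or two different roots, and a hybrid such as ɨvlis simply fails the lookup.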
1.3.3. Specific Variation to Watch
Several correspondences are especially easy to mix:
| IPA | Simple Encoding | MonoBlaken | Common risk |
|---|---|---|---|
| ɸ | ph | h | writing phjeoc, ɸjøʂ, and hjøc in the same text without a note |
| χ | x | x | mistaking ex, the Simple Encoding spelling of IPA eχ, for a separate root |
| ʁ | q | q | switching between tʁon and tqon |
| ʎ | lh | ł | switching between lhwal, ʎwal, and łwal |
| ɲ | nh | ñ | switching between nha, ɲa, and ña |
| ɑ | ao | å | switching between taom, tɑm, and tåm |
| ø | eo | ø | switching between pheos and ɸøs/høs |
| dʐ | dz | dç | switching between dzin, dʐin, and dçin |
| tʂ | tc | tc | confusing affricates with stop + fricative sequences |
Capitalization does not create a new phoneme. In IPA-heavy text, Φ may appear as an editorial capital of ɸ, especially at the beginning of a name or title. The lexical form remains lowercase ɸ unless ordinary capitalization is intended.
For grammatical morphemes, consistency is especially important because they occur very often:
- ex may represent the Simple Encoding spelling of IPA eχ.
- os, to, tin, ken, blum, and prum are already plain ASCII and should normally remain stable across layers.
- If an IPA-heavy text writes eχ, it should avoid also writing ex for the same morpheme unless quoting another layer.
- If a text is declared IPA-heavy, χ should be used for the uvular fricative. Plain x belongs to Simple Encoding and MonoBlaken, not to the IPA layer.
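These rules amount to a short list of symbols that should not appear in a text of a given declared layer. The following Python sketch is one possible check, not an established tool. The character sets are assumptions derived from the correspondence tables above, the function name `out_of_layer` is hypothetical, and the check works on single characters only, so it does not detect Simple Encoding digraphs inside a MonoBlaken text.

```python
# Assumed symbol inventories, derived from the correspondence tables.
IPA_ONLY = set("ɸχʁʎɲɑɒɨʂʐ")  # IPA symbols with no place in SE or MonoBlaken
MB_ONLY = set("łñåç")          # MonoBlaken letters with no place in IPA or SE
ROMAN_ONLY = set("xqyczh")     # romanization letters that are not Blaken IPA symbols


def out_of_layer(text: str, layer: str) -> list:
    """Return the distinct characters in text that do not fit the declared layer."""
    lowered = text.lower()
    if layer == "SE":
        # Simple Encoding is keyboard-friendly: flag every non-ASCII letter.
        bad = {ch for ch in lowered if ch.isalpha() and not ch.isascii()}
    elif layer == "IPA":
        bad = {ch for ch in lowered if ch in MB_ONLY or ch in ROMAN_ONLY}
    elif layer == "MB":
        # z is the Simple Encoding spelling of ʐ; MonoBlaken uses ç instead.
        bad = {ch for ch in lowered if ch in IPA_ONLY or ch == "z"}
    else:
        raise ValueError(f"unknown layer: {layer}")
    return sorted(bad)


print(out_of_layer("eχ", "IPA"))  # []
print(out_of_layer("ex", "IPA"))  # ['x']
```

A character-level check cannot catch a form in which every symbol is valid for the layer but one of them is wrong for the word; that case needs a lexicon lookup.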
1.3.4. Editorial Metadata
A useful convention for future corpus files is to add a short note near the title. Because these labels belong to the corpus itself, the preferred labels are Blaken compounds rather than English editorial terms:
```markdown
Krifvom: SE
Blavom: blablavom
```
This does not change the language itself. It simply tells readers which spelling layer and text-mode they are seeing.
The two recommended metadata fields are:
| Field | Structure | Use |
|---|---|---|
| Krifvom / Krifblavom | short label for krifblavom, from krifbla "writing, written speech" + vom "mode, form" | orthography or writing layer |
| Blavom | bla "speech, story, language" + vom "mode, form" | register or textual mode |
Recommended Krifvom values are short orthographic codes. The Blaken compounds may still be used in the dictionary as descriptive expansions, but the metadata itself should remain compact:
| Value | Meaning | Descriptive Blaken expansion |
|---|---|---|
| SE | Simple Encoding; keyboard-friendly writing | sweoskrifbla |
| IPA | IPA-heavy or phonological writing | blhwaomkrifbla |
| MB | MonoBlaken romanization | no separate Blaken compound required |
| blakenkrifbla | native Blaken script or in-world writing system | Blaken + krifbla |
| polkrifbla | mixed orthography, when the mixture is intentional | pol "many, various" + krifbla |
At present, the main corpus does not need polkrifbla as a primary label: the Simple Encoding texts are mostly cleanly SE, and the IPA-heavy texts are mostly cleanly IPA. polkrifbla is best reserved for a future text that intentionally mixes layers within the Blaken text itself.
Recommended Blavom values are:
| Value | Structure | Meaning |
|---|---|---|
| blablavom | bla + blavom | narrative or prose storytelling |
| lumbla | lum "wave, curve" + bla | poetry or poetic speech |
| orbla | or "respect, honor" + bla | ritual, honorific, blessing, or formal register |
| tinkurbla | tinkur "across" + bla | translation, interlinear, or across-language register |
If a text contains more than one mode, the metadata may list several values:
```markdown
Krifvom: SE
Blavom: blablavom; tinkurbla
```
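A metadata note of this kind is simple enough to read mechanically. The sketch below is illustrative only. It assumes one field per line, values separated by semicolons, and the value lists given in the tables above; `parse_metadata` and `unknown_values` are hypothetical helper names.

```python
# Recommended values, copied from the Krifvom and Blavom tables.
KRIFVOM_VALUES = {"SE", "IPA", "MB", "blakenkrifbla", "polkrifbla"}
BLAVOM_VALUES = {"blablavom", "lumbla", "orbla", "tinkurbla"}


def parse_metadata(header: str) -> dict:
    """Parse 'Field: value; value' lines into {field: [values]}."""
    meta = {}
    for line in header.splitlines():
        if ":" not in line:
            continue
        field, _, raw = line.partition(":")
        field = field.strip()
        if field == "Krifblavom":  # long form of the same field
            field = "Krifvom"
        if field not in ("Krifvom", "Blavom"):
            continue  # ignore unrelated lines such as titles
        meta[field] = [v.strip() for v in raw.split(";") if v.strip()]
    return meta


def unknown_values(meta: dict) -> list:
    """List values that are not among the recommended ones."""
    allowed = {"Krifvom": KRIFVOM_VALUES, "Blavom": BLAVOM_VALUES}
    return [v for field, values in meta.items() for v in values
            if v not in allowed[field]]


meta = parse_metadata("Krifvom: SE\nBlavom: blablavom; tinkurbla")
print(meta)                  # {'Krifvom': ['SE'], 'Blavom': ['blablavom', 'tinkurbla']}
print(unknown_values(meta))  # []
```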
1.4. Punctuation and Prosodic Marking
The current text corpus suggests that punctuation in Blaken is still partly editorial and not yet fully standardized. In practice, the texts use a mix of prose punctuation, quotation marks, and poetic boundary marks.
Prosodic prominence itself is generally not marked directly in the orthography. For phonological discussion of prominence, intonation, and register, see phonology.md.
1.4.1. Strong Boundary Mark: ‖
The mark ‖ is widely used in the current native and translated texts as a strong textual boundary. In practice it often functions like a line boundary, a strong caesura, or a verse-ending pause.
This can be seen in narrative and literary passages such as:
Ɲwinko dlom, sor kwomken blum kimko refsadom tin ‖
and in the translation corpus:
Trjomtrjom dʐenken Dʐoltolblom Φønblum Saftidom tin, Jetako kipol tin, Andikako ɸwadom tin ‖
At the moment, the corpus supports treating ‖ as a strong rhetorical or prosodic break rather than as ordinary punctuation like a period.
1.4.2. Minor Boundary Mark: |
The single vertical bar | appears in the corpus as a weaker internal boundary. It often separates parallel members inside a larger line or marks a lighter rhythmic break than ‖.
For example:
Omken ɸløm bɑnbɑn | flosworko ɸløm, kim tin flosworko bɑnko ɸløm ‖
and:
χorɸøn to domwɨ, aχaχ dʐolɸøn to dom |
The current texts suggest a contrast between | and ‖, with | functioning as a minor pause and ‖ as a major closure.
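The two-level contrast can be made concrete by splitting a passage first at ‖ and then at |. This is a rough sketch under the assumption that the two marks nest in exactly this way, which the corpus suggests but does not yet guarantee; `split_boundaries` is a hypothetical helper name.

```python
def split_boundaries(text: str) -> list:
    """Split text into major units at ‖, then each unit into members at |."""
    units = []
    for major in text.split("‖"):
        members = [m.strip() for m in major.split("|")]
        members = [m for m in members if m]  # drop empty pieces around the marks
        if members:
            units.append(members)
    return units


line = "Omken ɸløm bɑnbɑn | flosworko ɸløm, kim tin flosworko bɑnko ɸløm ‖"
for unit in split_boundaries(line):
    print(unit)
# ['Omken ɸløm bɑnbɑn', 'flosworko ɸløm, kim tin flosworko bɑnko ɸløm']
```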
1.4.3. Colon and Quotation
The colon : is commonly used after verbs of saying, hearing, or framing formulas. In the present corpus it often introduces direct speech, reported discourse, or a performative declaration.
Examples include:
Sɨko vom, blabla blum :
and:
Sɨ eχ, jomtan prum : omken sɨ wo to ɲar vomvom ‖
Quotation marks also appear in prose drafts to mark direct speech or quoted thought:
Gurgur numko woprum "or dzoldom tin" jomken.
Syko dlom tin gurbraoxtan prum "pinpin domnha" to blaken woblum.
In the translation corpus, angled quotation marks may also appear for embedded speech:
— Blaken ɸwaɲaɸønblum : « blablum anko dʐoltol » —
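For corpus work it can be useful to pull quoted material out of a line. The sketch below is only an approximation: it assumes straight double quotes and angled quotation marks are the only quoting devices, that quotations do not nest, and it ignores the colon convention entirely. The names `QUOTED` and `quoted_spans` are hypothetical.

```python
import re

# Match "..." or « ... »; surrounding spaces inside angled quotes are trimmed.
QUOTED = re.compile(r'"([^"]*)"|«\s*([^»]*?)\s*»')


def quoted_spans(text: str) -> list:
    """Return the contents of every quoted span in text, in order."""
    return [m.group(1) if m.group(1) is not None else m.group(2)
            for m in QUOTED.finditer(text)]


print(quoted_spans('Gurgur numko woprum "or dzoldom tin" jomken.'))
# ['or dzoldom tin']
print(quoted_spans("Blaken ɸwaɲaɸønblum : « blablum anko dʐoltol »"))
# ['blablum anko dʐoltol']
```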
1.4.4. Editorial Status
Because the corpus is still small, the punctuation system should be described as emergent rather than fixed. The clearest current tendencies are:
- ‖ for strong closure or line boundary
- | for a weaker internal break
- : after speech-introducing or framing expressions
- quotation marks for direct speech, thought, or quoted formulae