Souk

Souk (en. /suːk/, natively [pʰe.əˀra.o.sɑ ↘sʉ.ək̚], romanized phe:raaosa su:k) is a Kai-Souk language of the Song language family and the native language of the Kai people in Indochina. Souk is one of the oldest languages in Southeast Asia, and has more speakers than the rest of the Kai-Souk languages combined. The language has a complex system of social registers and honorifics, often reflected in syntax and morphology. Due to the overwhelming influence of the Indian dynasties in Southeast Asia, Souk contains thousands of loanwords from Sanskrit and Pali; indeed, the royal register and especially the liturgical register are composed almost entirely of such loanwords.

Souk is a pitch-accent, mora-timed language. It is primarily isolating; however, it employs many particles to express grammatical relationships, and some infixes and suffixes in derivational morphology.

For many of the Kai people who do not speak Souk as their native language, and for closely related ethnic groups speaking different native languages, Souk is a de facto lingua franca in the region, alongside French.

Classification
With 10 million native speakers, Souk is the most widely spoken of the Song languages. There are competing theories for the classification of Song. The family bears many resemblances to the Austroasiatic languages, notably the existence of sesquisyllabic patterns and isolating morphology. However, linguists have been unable to establish a genetic relationship to Mon-Khmer (synonymous with Austroasiatic) or its ancestors, due to many seemingly unrelated elements such as mora timing and an uncommon morphosyntactic alignment. The most likely explanation seems to be that proto-Song originally developed as a creole between proto-Mon-Khmer and an unknown native language.

Sound System
Souk phonology is more complex than that of Old Souk and its ancestors, especially concerning the vowels. The many tones which existed in Old Souk have transformed into new vowel phonemes. The phonetic system here best represents the phonemes as they are spoken around the Mekong River Delta, which is the dialect with the most speakers and which has been recognized by some linguists as a standard for the language. Some of the phonemes below have merged or diverged in other, especially rural, dialects.

Vowels

 * 1) Schwa exists only in sesquisyllables and as the reduced second half of long vowels
 * 2) Usually very centralized, but some utterances have been analyzed as containing pure [a]
 * 3) A semi-rounded vowel, somewhere between [ɑ] and [ɒ]

Long vowels occupy two morae. Any vowel other than 'ə' may be long, and length is phonemic. Vowel length was originally pure, with the long vowel remaining at the same place of articulation throughout; indeed this is preserved in rural dialects. In the so-called 'standard' dialect, however, the second half of a long vowel undergoes reduction, causing the long vowel to glide from its normal realization toward a more central position (nearer to 'ə'). Long rounded vowels are almost entirely unrounded by their end.

Thus /aː/ sounds like [a.ɐ], /iː/ sounds like [i.ɪ], and /ʉː/ like [ʉ.ə].
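The reduction pattern can be written as a small lookup. The sketch below is illustrative only: it covers just the long vowels whose realizations are given in this article (/aː/, /iː/, /ʉː/, and the [e.ə] seen in phe:), the name `realize_long_vowel` is hypothetical, and the `"rural"` option simply models the unreduced long vowel described above.

```python
# Second-mora reductions attested in this article for the 'standard' dialect.
# Other vowels are deliberately left out rather than guessed.
REDUCED_SECOND_MORA = {"a": "ɐ", "i": "ɪ", "ʉ": "ə", "e": "ə"}


def realize_long_vowel(vowel: str, dialect: str = "standard") -> str:
    """Return a rough phonetic rendering of a long vowel."""
    if vowel == "ə":
        raise ValueError("schwa cannot be long")
    if dialect == "rural":
        # Rural dialects keep a pure long vowel with no glide.
        return vowel + "ː"
    if vowel not in REDUCED_SECOND_MORA:
        raise KeyError(f"no attested reduction for /{vowel}ː/")
    # The second mora glides toward a more central quality.
    return f"{vowel}.{REDUCED_SECOND_MORA[vowel]}"


print(realize_long_vowel("a"))           # a.ɐ
print(realize_long_vowel("i", "rural"))  # iː
```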

In Old Souk, consonant clusters could exist within a single syllable (along with a following vowel), such that a word like kmoo would be only one syllable in length. As the language became more mora-timed, the initial consonant in a cluster came to be somewhat geminated. In modern Souk, which is entirely mora-timed, all consonant clusters are spread out over two morae. This is the role of the schwa [ə] in Souk: for stop consonants, which cannot be properly geminated, [ə] is pronounced between the initial stop (plosive) consonant and the following cluster-forming consonant, producing an even moraic timing. The schwa sound in clusters has no pitch distinction and is never stressed.

Thus /k.mu/ is realized [kə.'mu], with far more emphasis on the second mora. No schwa is needed for non-plosive consonants, such that /m.ra/ is [m.ra], with the same duration on [m] as [ra].
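The cluster rule can be sketched as a single function. This is a minimal illustration, not a full phonological model: the set `PLOSIVES` is an assumption (the consonant inventory is not reproduced here), the function name is hypothetical, and stress marking is omitted from the output.

```python
# Assumed set of plosive initials; adjust to the actual consonant inventory.
PLOSIVES = {"p", "pʰ", "t", "tʰ", "k", "kʰ", "ɓ", "ɗ"}


def realize_cluster(first: str, rest: str) -> str:
    """Spread an initial cluster over two morae.

    `first` is the initial consonant; `rest` is the remainder of the
    syllable (cluster-forming consonant plus vowel).
    """
    if first in PLOSIVES:
        # Plosives cannot be lengthened, so a schwa fills out the first mora.
        return f"{first}ə.{rest}"
    # Other consonants are held for a full mora on their own.
    return f"{first}.{rest}"


print(realize_cluster("k", "mu"))  # kə.mu
print(realize_cluster("m", "ra"))  # m.ra
```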

Consonants

 * 1) /n/ is realized as palatal [ɲ] before a front vowel.
 * 2) Coda /ŋ/ remains velar in many dialects, but has become uvular among younger speakers, especially in more densely populated areas. Initial /ŋ/ is always velar.
 * 3) Aspirated /tʰ/ sounds more like [cʰ] in the 'standard' dialect(s).
 * 4) Coda /ɓ/ becomes [b], usually unreleased but distinct from [p̚].
 * 5) Coda /ɗ/ is not implosive, but an interdental approximant [ð̞].
 * 6) /h/ is closer to [ɸ] before rounded vowels or labial consonants.
 * 7) Unless /w/ is a semivowel at the end of a diphthong, it is closer to [β̞].
 * 8) Behaves like [j], but closer to velar than palatal for most speakers. Some educated speakers (especially in urban areas) realize this phoneme exclusively as [j].

Pitch-accent
The pitch-accent of Souk developed from the pitch-register system of Old Souk, in which certain tones and phonation contrasts were dependent on each other, and neither could exist independently. For example, low tone only existed in vowel-coda syllables, and almost all falling-tone syllables were nasal-coda. This has evolved into a highly predictable modern pitch-accent system.

Modern Souk has two tones: middle (level) and low-falling. High-falling tone long ago merged with low-falling, and rising with middle. With the exception of adpositions, all nasal-coda syllables feature low-falling tone. A stop-coda syllable has level tone unless the vowel is long, in which case the tone is usually low-falling. Vowel-coda syllables, short or long, by default have low-falling tone, unless the syllable is an adposition, a loanword, or features tabla (laryngealization).

With these guidelines, we know that rwaan (mango) is pronounced with low-falling tone, [r.wàn], and rwaat (macaw) with level tone, [r.wat̚]. One would assume that the word tabla would have low-falling tone; however, the final syllable is laryngealized, so the word features level tone. As another example, consider phe:raaosa su:k [pʰe.əˀra.o.sɑ ↘sʉ.ək̚] (Souk language). Given the rule of low-falling pitch on long vowels before stops, as in su:k [↘sʉ.ək̚], we would assume that phe:raaosa [pʰe.əˀra.o.sɑ] is also low-falling because of the long vowel e: [e.ə]. This would ordinarily be correct; however, [əˀra] is actually a genitive infix, and the root word is pheosa [pʰe.o.sɑ] (language), which has level pitch. The fact that the infix lengthens the preceding vowel [e] neither produces nor contributes to a change in pitch.
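The default tone rules can be summarized as a decision function. The sketch below makes several assumptions that the text does not state outright: exceptions to low-falling tone are taken to be level (the only other tone), the "usually" in the long-vowel rule is treated as always, and the caller is expected to pass the vowel length of the root, ignoring lengthening caused by an infix. The function and constant names are hypothetical.

```python
LEVEL = "level"
LOW_FALLING = "low-falling"


def default_tone(coda: str, long_vowel: bool = False, adposition: bool = False,
                 loanword: bool = False, tabla: bool = False) -> str:
    """Predict the tone of a syllable from its shape.

    `coda` is "nasal", "stop", or "vowel" (an open syllable).
    """
    if coda == "nasal":
        # Nasal codas are low-falling, except in adpositions (assumed level).
        return LEVEL if adposition else LOW_FALLING
    if coda == "stop":
        # Stop codas are level unless the vowel is long.
        return LOW_FALLING if long_vowel else LEVEL
    if coda == "vowel":
        # Open syllables are low-falling by default.
        if adposition or loanword or tabla:
            return LEVEL
        return LOW_FALLING
    raise ValueError(f"no tone rule given for coda type {coda!r}")


print(default_tone("nasal"))                   # rwaan: low-falling
print(default_tone("stop"))                    # rwaat: level
print(default_tone("stop", long_vowel=True))   # su:k: low-falling
print(default_tone("vowel", tabla=True))       # final syllable of tabla: level
```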

The pitch-accent system is a global system. In a given clause, the accented syllable features a sudden drop in pitch, and following syllables continue to fall gradually in pitch, including particles and postpositions (even if they have level pitch in isolation). This is why we usually describe a drop in pitch as a global fall.
 * 1) Notice how phan is the accented syllable, yet the following three bound morphemes continue to fall in pitch until the end of the compound word.

In the example above, the fall in pitch takes place over the orange syllables. The first syllable is the location of the accent, and the following syllables continue to fall in pitch (including the rest of the word and its postpositions).

Tabla
Old Souk allowed virtually any consonant, as well as some consonant clusters, to exist in the syllable coda. Due to the development of Old Souk as a more common, colloquial language, as well as the great influence of other local Austroasiatic languages, many of these coda phonemes merged with nasal consonants and unreleased stops. Words that underwent this merger developed a laryngealized sound, somewhat reminiscent of creaky voice or a glottal stop, which is pronounced just before the final consonant; most often, syllables with this feature formerly had [h] in the coda. This feature is known natively as tabla [tɑb.lɑˀ], a word which, in Old Souk, would most likely have been pronounced [tɑhb.lɑhk].

Syllable structure
Most words are monosyllabic. Syllables follow the form (S)CV(X)(F), where C is any consonant (including a glottal plosive), V is any vowel (long or short), X is an approximant /j/ or /w/, and F is a single final consonant (nasal, unreleased stop, [b], [ð̞] or [s]). S represents a sesquisyllable, which forms a sort of cluster with the initial consonant. There is a list of all coda consonants in the next section. Souk is a mora-timed language, which means that any sesquisyllable represents its own mora, and is thus pronounced for the same amount of time as the mora of the rest of the syllable. Sesquisyllables are permitted only at the beginning of a word; that is to say, a multisyllabic word may not have a sesquisyllable pattern on its second, third, or any later syllable.
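The template (S)CV(X)(F) and the word-initial restriction on sesquisyllables can be checked mechanically. The sketch below works on abstract shape strings such as "SCV" or "CVXF" rather than on actual phonemes, since it assumes the segments have already been classified; the function names are hypothetical.

```python
import re

# One syllable: optional sesquisyllable, consonant, vowel,
# optional approximant, optional final consonant.
SYLLABLE = re.compile(r"S?CVX?F?")


def is_valid_syllable(shape: str) -> bool:
    """Check a single shape string against (S)CV(X)(F)."""
    return SYLLABLE.fullmatch(shape) is not None


def is_valid_word(shapes: list[str]) -> bool:
    """Check every syllable, allowing S only on the first one."""
    if not shapes or not all(is_valid_syllable(s) for s in shapes):
        return False
    return not any(s.startswith("S") for s in shapes[1:])


print(is_valid_word(["SCV"]))         # True, e.g. the shape of [kə.mu]
print(is_valid_word(["CVF", "CV"]))   # True, e.g. the shape of [tɑb.lɑˀ]
print(is_valid_word(["CV", "SCV"]))   # False: sesquisyllable is not word-initial
```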

Romanization

 * Main article: Kai-Souk Colonial Alphabet

The writing system of Souk utilizes many letters which represent different phonemes at different parts in a word or syllable, and many other such exceptions exist. As such, it would be counter-intuitive to present an ambiguous transliteration system; thus the Kai-Souk Colonial Alphabet, which is used to render most of the Kai-Souk languages into the Latin alphabet, attempts to present a straightforward representation of the sounds in the most common dialects.

This table represents the letters with their standard IPA values for Souk:

Being almost completely predictable, pitch accent and tabla are normally not indicated in romanization. If absolutely necessary, a drop in pitch can be represented with a grave accent (phàn) and tabla laryngealization with an umlaut (tablä).
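For text processing, the two optional diacritics can be attached with Unicode combining characters. This is a minimal sketch using Python's standard `unicodedata` module; the helper name `mark` and the choice of a letter index as input are assumptions made for illustration.

```python
import unicodedata

GRAVE = "\u0300"      # combining grave accent: marks a drop in pitch
DIAERESIS = "\u0308"  # combining diaeresis (umlaut): marks tabla


def mark(word: str, index: int, diacritic: str) -> str:
    """Attach a combining diacritic to the letter at `index`.

    The result is normalized to NFC so that, where possible, the letter
    and its diacritic become a single precomposed character.
    """
    marked = word[: index + 1] + diacritic + word[index + 1:]
    return unicodedata.normalize("NFC", marked)


print(mark("phan", 2, GRAVE))       # phàn
print(mark("tabla", 4, DIAERESIS))  # tablä
```

Normalizing to NFC keeps marked and hand-typed forms comparable, since a precomposed à and the sequence a + U+0300 would otherwise compare as different strings.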

Grammar
Souk grammar is straightforward albeit unique in many ways. Nouns in Souk have no gender or number, and they do not decline. Their position in a sentence determines their function or case. Verbs have many suffixes and some infixes which determine their aspect, mood, and voice, and adjectives behave more like nouns; all this we will go over below.

Adjectives
Souk does not have literal adjectives, as in 'a sad boy' or 'an angry man'. Instead, there are nouns which represent a quality, i.e. mad (melancholy, sadness) or khloong (anger); we call these abstract nouns. To use an abstract noun as an adjective, you simply attach the genitive particle -ra or -sim to the abstract noun, which can appear before or after its head:

Sentence structure
The building blocks of a sentence can be separated into five categories: the agent; its patient; the abstract verb-phrase, which treats the agent as an object or passivizes it; the literal verb-phrase, which denotes the action of the agent; and the dative or benefactive. Each category can be either a single word or a simple clause.

Consider the sentence,

The above translation is the literal semantic meaning of the sentence. The last four words are indeed 'laity donate food monks'; however, the first word mneus 'inspire' is an abstract verb, and treats the agent of the clause like an object. The agent of the abstract verb is inherently unknown; it is said the abstract verb represents a realm of understanding beyond the physical, treating actions and agents as one with each other.

Statements without an abstract verb sound very informal and almost immature. It is uncommon for even colloquial speech to go without an abstract verb, except in brief two- or three-word utterances and gestures. However, the abstract verb-phrase can be replaced with a relative clause for the agent, which is common in more complex sentences:

In the above example, we can see that the abstract verb-phrase has been replaced by a relative clause. The relative clause works in the same way that the abstract verb-phrase normally works: it modifies the agent of the sentence, giving us more information about it. The dative/benefactive case is often ambiguous, as it can also be used in other ways, such as the following:

In the above example, the dative acts as a NON-BENEFACTIVE: placing a valent term such as 'wealth' before the dative makes it apparent in context that 'health' is not considered and is cast aside. Compare:

We can see here that the conjunction ni binds 'health' and 'wealth' together, thus removing any contrast they would produce in simple juxtaposition. On another note, the verb si (hold, behold) has a meaning of possession or temptation when used in the abstract. It thus shows that the politicians were barely able to control themselves, as if possessed by greed entirely. Si is also rarely used in abstract form to refer to a single agent; thus we can understand that we are referring to several politicians, and not just one.

In an intransitive clause, there is normally no abstract verb, and instead the initial verb is literal. Consider,

Here we can see that the initial (and only) verb is a literal verb. If we want to indicate the same abstract idea which would exist in a transitive clause, we can use an abstract noun in the dative case:

We can see that the dative case here is more of an instrumental or comitative. Now, if we decide to make this sentence transitive (i.e. we know what she could not choose), we will use the abstract verb sam, which means 'driven by melancholy' in the abstract and 'to be sad' in the literal:

Since the verb khing (speak) is in a position in the sentence where verbs do not normally appear, we can tell that the verb is used in its infinitive or nominal form.