The etymology of six

When it comes to reconstructing the Proto-Indo-European numbers, it doesn't get more tantalizing than the number "six." I say tantalizing because it is extremely well attested throughout the late Indo-European tongues, but reconstructions point to three different proto-forms! This is a far cry from the much more uniform reconstructions of other numbers under 10. To add to our troubles, no Anatolian language has recorded the number in any phonetic way; the Hittites and Luwians opted for number signs, much as we can write "6" in place of "six." Here is a list of the number six in the oldest IE tongues. 

Italo-Faliscan: Latin sex ; Venetic segtio "sixth" ; Oscan sehsík ; Umbrian (first element in) sesten-t-asiaru
Hellenic: Greek ἕξ ; Dorian ϝέξ
Balto-Slavic: Lithuanian šešì ; Old Church Slavonic šestь ; Old Prussian
Armenian: vec'
Celtic: Old Irish ; Gaulish suexos "sixth" ; Celtiberian (possibly) sues
Albanian: gjashtë
Tocharian: Tocharian A ṣäk ; Tocharian B ṣkas
Germanic: Gothic saihs ; Old Norse sex ; Old High German sehs
Indo-Iranian: Sanskrit ṣáṣ- ; Young Avestan xšuuaš

For the Anatolian languages, Blazek (2000) turned to the Kartvelian languages for a clue. Georgian ekvs-, Svan usgva and so forth point to Proto-Kartvelian *eks₁w-. Perhaps, Blazek speculates, this was borrowed from some Anatolian language. But for now, let's set Blazek's notes aside and move on to the problems.

Problem 1. The Germanic languages and most Balto-Slavic languages point to *sek̂s. The Italo-Faliscan, Celtic, Albanian, Indo-Iranian and Hellenic languages point to *su̯ek̂s. The Tocharian languages can go either way.

Problem 2. The Old Prussian and Armenian lemmata point to an absence of the initial *s-, and instead to *u̯ek̂s. And if Proto-Kartvelian borrowed its number from the Indo-Europeans, then it points to an absence of *s- as well.

This problem is tricky. It resists traditional isoglossic maps. There is an *s that wants to appear and disappear, and there is a *u̯ that wants to do the same.

Most linguists believe that six was contaminated by seven. One can wonder if *su̯ek̂s "six" lost its semi-vowel under influence from *septḿ "seven," but there is no evidence to suggest *sek̂s acquired the semi-vowel ex nihilo. So the *u̯ is agreed to be original, and lost in some languages under influence from seven. To explain the unexpected Old Prussian and Armenian lemmata, Kroonen (2014) took *u̯ek̂s to be the original Proto-Indo-European form, which then acquired its initial *s- under influence from seven (yes, contaminated twice).

Mallory & Adams (1997) conjecture, on the strength of the high degree of variation, that the number was borrowed from an Afrasian source. Speculation about borrowing should be treated with caution, especially since the amount of lexical material from Afrasian languages is sparse. If Kroonen's reconstruction is correct, however, then Mallory & Adams's argument is dealt a serious blow, since the similarities with Afrasian languages would be due to coincidence and not relationship.

Hamp (1978) took an entirely different road, modeled in part on Pokorny (1957). Because some Indo-Iranian languages like Pali have reflexes that point to *ks-, and because Proto-Indo-European *ks- would have reduced to *h- in Greek and disappeared in Armenian, Hamp proposed that the original Proto-Indo-European form was *ksu̯ek̂s. His proposal has not found strong support.

Martirosyan (2015) proposed that *su̯ek̂s had a Lindeman variant *suu̯ek̂s. (In this case, a Lindeman variant is one in which a monosyllabic word containing a semi-vowel is extended: *su̯ek̂s > *suu̯ek̂s.) In later stages *suu̯ek̂s would have reduced to *u̯ek̂s and finally yielded vec' in Armenian. In other Indo-European branches, *su̯ek̂s simplified to *sek̂s. Martirosyan does not comment, however, on what *suu̯ek̂s would yield in Old Prussian; nor does he mention that Lindeman's argument has problems of its own.

