The Etymology of Eight... or Four?

All the numbers in Indo-European languages have interesting origins. We saw how the number six has three different reconstructions, each a head-scratcher, and how the number seven was probably borrowed from Proto-Semitic. Let's continue our journey today with the number eight.

Anatolian: Lycian aitãta-
Phrygian: otuvoi "eighth"
Italo-Faliscan: Latin octō ; Oscan uhto
Hellenic: Greek ὀκτώ ; Elean ὀπτω ; Lesbian ὀκτό ; Macedonian otto-
Balto-Slavic: Lithuanian aštuonì ; Latvian astoņi ; Old Prussian astōnjai ; Old Church Slavonic osmь
Armenian: ut'
Celtic: Old Irish ocht ; Welsh wyth ; Gaulish oxtumetos "eighth"
Albanian: tetë
Tocharian: Tocharian A okät ; Tocharian B okt
Germanic: Gothic ahtau ; Old Norse átta ; Old High German ahto
Indo-Iranian: Sanskrit aṣṭā́ ; Young Avestan ašta (but note Young Avestan ašti "length of four fingers")
Non-IE: Georgian otxi "four" ; Georgian (dialect) otxo "id." ; Svan woštxw "id." ; Etruscan huθ "four" or "six"

Lycian is the only Anatolian language in which the numeral is attested, and there is a separate attestation in Phrygian, a rarely discussed Indo-European branch and one of the barely attested Balkan tongues that died out long before the modern era. Let's get some of the phonological oddballs out of the way. Elean and Armenian were reshaped under the phonological influence of the number seven. Albanian tetë took a lengthy route: early Proto-Albanian used a numeral suffix *-ti (*okto-ti); in middle Proto-Albanian, the initial *o became *a while the *k disappeared; in late Proto-Albanian the *a disappeared, and the numeral suffix *-ti was reanalyzed as a masculine suffix and replaced by feminine *-ta via analogy (Hamp 1992).
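The Albanian route can be pictured as an ordered list of rewrites applied to the starting form. What follows is only a toy sketch: the function name `apply_stages` and the rule labels are made up for illustration, the rules are plain string operations, and the intermediate forms it prints are mechanical outputs rather than reconstructions taken from Hamp (1992). It also stops short of tetë, since the later vowel developments are not covered here.

```python
# Toy illustration only: ordered string rewrites standing in for the
# Proto-Albanian stages described in the text. Intermediate forms are
# mechanical outputs, not attested or published reconstructions.

def apply_stages(form, stages):
    """Apply each (label, rewrite) stage in order and record every step."""
    history = [("early Proto-Albanian", form)]
    for label, rewrite in stages:
        form = rewrite(form)
        history.append((label, form))
    return history

STAGES = [
    # Middle Proto-Albanian: the initial *o becomes *a ...
    ("middle: initial *o > *a",
     lambda f: "a" + f[1:] if f.startswith("o") else f),
    # ... and the *k is lost.
    ("middle: *k lost",
     lambda f: f.replace("k", "", 1)),
    # Late Proto-Albanian: the *a is lost (assumed here to be the initial one).
    ("late: initial *a lost",
     lambda f: f[1:] if f.startswith("a") else f),
    # Late Proto-Albanian: suffix *-ti replaced by feminine *-ta by analogy.
    ("late: *-ti > *-ta",
     lambda f: f[:-2] + "ta" if f.endswith("ti") else f),
]

history = apply_stages("okto-ti", STAGES)
for label, form in history:
    print(f"{label:26} *{form}")
```

The point of the sketch is the ordering: each change applies to the output of the one before it, which is why the relative chronology of the Proto-Albanian stages matters.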

Lycian is unexpected, especially the nasalization in the middle. Melchert (1993) assumed PIE *okt- with the suffix *-ont- for "eighty," with a later semantic change, which solves the nasal problem from a phonological angle. However, to get the first syllable *ait- in Lycian from *okt- in Proto-Indo-European, one still needs to explain the disappearance of the *k. To solve that, Melchert assumed voicing of *kt to *gd (*g > *y is a regular development), which is ad hoc. I'm not saying it didn't happen, but we need to keep this in mind. A theory by Beekes (1988), later revised by Blazek (1998), modifies Melchert's suggestion: almost all Indo-European languages derive from an *H-initial form, so why not Anatolian? With only one example of "eight" in the Anatolian family, explanations are disappointingly difficult, but Blazek believes that *Ho- sequences were reduced to *o- early in, or just before, the Proto-Anatolian period.

Elean and Armenian aside, Greek and Italo-Faliscan point to *h3eḱt-éh3, Germanic to *h3eḱt-éh3(u), Celtic to *h3eḱt-oh1, and Tocharian to *Hoḱt-ōu. This all points to a more phonologically accurate oxytonic Proto-Indo-European form *h3oḱt-oh3, with an ancestral Early Proto-Indo-European form *h3eḱt-eh3 (not to get too involved in the Leiden school and whether most vowels derive from *e). It is my supposition that the rounding quality of *h3 in the suffix *-eh3 began to vacillate (as we see in Germanic), but still had its effect even after compensatory lengthening (as we see in Tocharian).

More importantly, forms like the Greek and Sanskrit ones can be interpreted as duals, as opposed to singulars or plurals. Further, many other languages derive from an o-stem as well (e.g. Latin, Oscan, the Tocharian and Celtic languages, and perhaps Phrygian). This is important because it demonstrates that the number eight derives as a dual form of the number four (i.e., saying 'two fours').

But what if the o-stem is due to chance? After all, not every Indo-European form can be identified as a dual. What if it was an o-stem that was later interpreted as a dual by folk etymology? Fortunately, there is more evidence to consider.

Avestan ašti "length of four fingers" is hauntingly similar to ašta "eight," but it is an i-stem and points to either *oḱti or *oḱsti in Proto-Indo-European (there's no way to tell which without a cognate in another branch). We are fortunate enough to have a solution to the i-stem problem: this is likely a formation from a numeral length stem *-ti-, meaning the root was *oḱ- or *oḱs- "four" (Blazek 1998).

There are also two major non-Indo-European pieces of evidence that "eight" comes from "four." The first is the set of Kartvelian numerals for "four," which look remarkably similar to Indo-European numerals for "eight." The Georgian and Svan examples reflect a Proto-Kartvelian *otxo- and could come from a late centum Proto-Indo-European dialect (one of the dialects that became Germanic, Tocharian, Celtic, Italo-Faliscan, etc.) (Klimov 1998). The second is Etruscan huθ, whose meaning is either "four" or "six": its attestation on the Tuscania dice is ambiguous, so if it does not mean "four," it means "six." Bomhard (2008) is the most recent mention of this I've seen. This isn't as interesting to me, since you'd need to explain how *hu- is related to the third laryngeal, which was likely */ʕʷ/, and an o-colored *e, but it's food for thought.