Austronesian etymology and loanwords

I looked around and have no idea where to share this info. I realized I’ve created a bunch of threads on my small random discoveries in the past. Perhaps with this more general title, mods can merge some of them here as well.

Panay is a popular Pangcah Amis girl name. It means rice in the fields. It’s also the name of the lead character of the 2015 film Wawa No Cidal. The English title for the movie is simply Panay.

The word Panay came from the PAn root word *pajay.

In Taiwan, *pajay became panay, pagay, pazay, palay, and paday. When Austronesian languages expanded out of Taiwan, these forms spread with them.
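To make the pattern concrete, here is a minimal Python sketch. It assumes the five forms differ from *pajay only in how the *j segment comes out, which is my simplification; the names `PAN_ROOT`, `J_REFLEXES`, and `reflex_forms` are made up for illustration.

```python
# Reflexes of PAn *j as they appear in the five forms listed above.
# Assumption: only the *j segment differs; the rest of *pajay is unchanged.
PAN_ROOT = "pajay"
J_REFLEXES = ["n", "g", "z", "l", "d"]


def reflex_forms(root, reflexes):
    """Return one form per reflex, replacing the single *j in the root."""
    return [root.replace("j", reflex, 1) for reflex in reflexes]


forms = reflex_forms(PAN_ROOT, J_REFLEXES)
print(forms)  # ['panay', 'pagay', 'pazay', 'palay', 'paday']
```

Real reflexes depend on the individual language and on the surrounding sounds, so treat this as a way to see the correspondence, not as a model of the actual sound changes.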

In Tagalog, we have pálay meaning rice plant, and palay-an meaning rice fields. In some languages, pagey means unhusked rice.

The word came to mean rice fields in some areas, as with padi in Malay and Balinese.

When the English got to Malaysia, they borrowed the Malay word for rice fields, and that’s how we ended up with paddy, as in rice paddies.

Ultimately, that English word came from Taiwan.


Just saw a word list of reconstructed Proto-Kra-Dai, the ancestor of the language family that includes Thai.

| Gloss | Proto-Kra–Dai |
| --- | --- |
| blood | *pɤlaːc |
| bone | *Kudɤːk |
| ear | *qɤrɤː |
| eye | *maTaː |
| hand | *(C)imɤː |
| nose | *(ʔ)idaŋ |
| tongue | *(C)əmaː |
| tooth | *lipan |
| dog | *Kamaː |
| fish | *balaː |
| horn | *paquː |
| louse | *KuTuː |
| fire | *(C)apuj |
| stone | *KaTiːl |
| star | *Kadaːw |
| water | *(C)aNam |
| I (1.SG) | *akuː |
| Thou (2.SG) | *isuː; amɤː |
| one | *(C)itsɤː |
| two | *saː |
| die | *maTaːj |
| name | *(C)adaːn |
| full | *pətiːk |
| new | *(C)amaːl |
| ‘thin’ | *C-báːŋ |
| ‘bone’ | *Cudə́ːk |
| ‘boat’ | *Cuɖáː |
| ‘borrow’ | *C-ɟáːm |
| ‘village’ | *Cəˀbáːnʔ |
| ‘winnow basket’ | *Cəˀdóŋʔ |
| ‘to stand’ | *Cəˀɟún |
| ‘dog’ | *kʰ[u]máː |
| ‘ditch’ | *[t]-m̥ˠáːŋ |
| ‘ant’ | *r-móȶ |
| ‘bear’ | *kəˀmˠúj |
| ‘thick’ | *tsəˀnáː |
| ‘cold’ | *kəˀȵít |
| ‘stupid’ | *Cəˀŋáːŋh |
| ‘gills’ | *Cəˀŋˠáːk |
| ‘taro’ | *pəˀrˠáːk |
| ‘moan’ | *gəˀráːŋ |
| ‘hungry’ | *məˀjáːk |
| ‘stupid’ | *Cəˀwáːʔ |

I mean, most of those are almost exactly the same as Proto-Austronesian…

| Gloss | Proto-Kra–Dai | Proto-Austronesian |
| --- | --- | --- |
| blood | *pɤlaːc | *daRaq |
| eye | *maTaː | *maCa |
| hand | *(C)imɤː | *qa-lima |
| tongue | *(C)əmaː | *Sema |
| tooth | *lipan | *ŋipen |
| louse | *KuTuː | *kuCux |
| fire | *(C)apuj | *Sapuy |
| water | *(C)aNam | *daNum |
| I (1.SG) | *akuː | *aku |
| Thou (2.SG) | *isuː; amɤː | *i-kaSu |
| one | *(C)itsɤː | *isa |
| two | *saː | *duSa |
| die | *maTaːj | *ma-aCay |
| ‘borrow’ | *C-ɟáːm | *Sezam |
| ‘bear’ | *kəˀmˠúj | *Cumay |
| ‘taro’ | *pəˀrˠáːk | *biRaq |

So many cognates. The two seem to have split before the boat was invented, though, as the word is *Cuɖáː in Proto-Kra-Dai and *qabaŋ in Proto-Austronesian.


So in 2008, Sagart proposed that Kra-Dai languages derived from Puluqish-speaking people from Taiwan who migrated back to the Asian mainland.

Puluqish, or Puluqic, is basically one linguistic innovation away from the speech of those who migrated to the Philippines and became the Malayo-Polynesian people. Compared to PAN, Puluqish had already gone through a series of changes: using pitu for 7 as an abbreviation of *RaCepituSa ‘five-and-two’ (Pituish); replacing five, *RaCep, with the word for hand, *lima (Limaish); replacing six, previously ‘five-and-one’ or ‘twice-three’, with 3 repeated twice, *Nem-Nem > *emnem (Enemish); forming eight and nine, in addition to 5 and 6, as *RaCepat(e)lu ‘five-and-three’ and *RaCepiSepat ‘five-and-four’ (Walu-Siwaish); and finally forming the number ten as *sa-puluq, from *sa- ‘one’ + ‘separate, set aside’ (Puluqish).
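The nesting is easier to see as a chain where each step presupposes the ones before it. Below is a rough Python sketch of that idea. The `CHAIN` table and the `deepest_subgroup` function are my own illustration, not anything from Sagart, and it assumes the innovations stack strictly in the order given above.

```python
# Nested numeral innovations in the order described above.
# Each step: (subgroup name, numerals that carry the innovation).
CHAIN = [
    ("Pituish", {7}),
    ("Limaish", {5}),
    ("Enemish", {6}),
    ("Walu-Siwaish", {8, 9}),
    ("Puluqish", {10}),
]


def deepest_subgroup(innovated):
    """Walk the chain and return the last subgroup whose innovations,
    together with all earlier ones, are present in `innovated`."""
    reached = "outside Pituish"
    for name, numerals in CHAIN:
        if not numerals <= innovated:  # this step's innovation is missing
            break
        reached = name
    return reached


print(deepest_subgroup({5, 6, 7, 8, 9, 10}))  # Puluqish
print(deepest_subgroup({5, 7}))               # Limaish
print(deepest_subgroup(set()))                # outside Pituish
```

A language that shows the innovated forms for all of 5 through 10 lands in Puluqish, while one that still uses the long ‘five-and-two’ form for 7 never enters the chain at all.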

Formosan languages fall all over this spectrum. Amis, Puyuma, and Paiwan all fall into the Puluqish sub-family. The number system is one aspect of a language that usually changes very slowly, so the fact that Proto-Kra-Dai shares all the innovations for 3, 5, 6, 7, 8, 9, 10 shows that it did not branch off from PAN directly and develop in parallel with the languages in Taiwan.


Kra-dai and Austronesian comparison

Sagart’s presentation on how he concluded KD is a sister branch of MP.

I just realized that the word for sea turtle in Proto-Austronesian is *peñu, and in living languages throughout Taiwan and the Philippines, the word is either penu, peno or panno.

This could probably be the etymology for Penghu, which was originally written as 平湖 and would have been pronounced as pênn-ôo.

This word survives even in Maori (Te Reo), where sea turtles are called honu.

The word for eel, *tuNa, is also kept in Polynesian languages, and is still tuna in Te Reo. The act of catching eels using lights is called rama tuna.

The amazing Languages to Learn channel has an episode explaining why the word for fresh drinking water is different across Austronesian languages and how they are all related.

Don’t know where to put this, but Brian Loo Soon Hua basically compares Paiwan phonology and grammar with Malaysian and Filipino languages. It’s pretty amazing, and probably the best explanation of focus markers I’ve ever heard.

I am not a linguist, but I have heard that Chinese linguists classify Tai-Kradai languages as Sino-Tibetan. Is your point that they are Austronesian?

Sagart made his assessment based on linguistic evidence; Chinese linguists made their conclusion based on political needs.

Kaxabu’s 7 is xasepbidusa, which retains *RaCepituSa perfectly.

This Kaxabu audio dictionary by Phuann Kim-gio̍k and published online by iTuan is amazing.

Kaxabu numbers:

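| Number | Kaxabu | PAN | Math |
| --- | --- | --- | --- |
| 1 | adang | ? | 1 |
| 2 | dusa | *duSa | 2 |
| 3 | turu, tu’u | *telu | 3 |
| 4 | sepat | *Sepat | 4 |
| 5 | xasep | *RaCep | 5 |
| 6 | xasepbuza | *RaCepesa ? | 5+1 |
| 7 | xasepbidusa | *RaCepituSa | 5+2 |
| 8 | xasepbitu’u | *RaCepitu’u | 5+3 |
| 9 | xasepbisupat | *RaCepiSepat | 5+4 |
| 10 | isit | ? | ? |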
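As a quick sanity check on the Math column, here is a small Python sketch that evaluates the "5+n" expressions for 5 through 9 and confirms that each Kaxabu form is built on xasep. The rows are copied from the table; the names `ROWS` and `evaluate` are just mine for illustration.

```python
# (number, Kaxabu form, "Math" column) for 5 through 9, copied from the table.
ROWS = [
    (5, "xasep", "5"),
    (6, "xasepbuza", "5+1"),
    (7, "xasepbidusa", "5+2"),
    (8, "xasepbitu’u", "5+3"),
    (9, "xasepbisupat", "5+4"),
]


def evaluate(expr):
    """Add up the terms of an expression such as '5+2'."""
    return sum(int(term) for term in expr.split("+"))


for number, form, expr in ROWS:
    # The additive analysis should reproduce the numeral's value ...
    assert evaluate(expr) == number
    # ... and every numeral from 5 up should start with xasep 'five'.
    assert form.startswith("xasep")

print("all rows consistent")
```

The check only covers the arithmetic and the shared xasep base. It says nothing about the linking element or about why the added units (buza, supat) differ from the standalone forms for 1 and 4.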

So Tai-Kradai and Austronesian are related? Originated from the same common language ages ago?

Sagart’s position is that Kra-Dai and Malayo-Polynesian are sister branches.

You can refer to the list of obvious cognates in Proto-Kra-Dai and proto-Austronesian in the list I compiled.

Back then I said the word for boat in Kra-Dai, *Cuɖáː, seems to be unrelated to *qabaŋ in Proto-Austronesian. I was wrong: *Cuɖáː seems to be cognate with the PAn word for oar, *aluja. It went through a semantic change where the word for the oar eventually came to mean the entire boat, as suggested by the video I referenced here:

So we have the Austro-Asiatic, Austronesian, Hmong-Mien, Tai-Kradai, and Sino-Tibetan families in our region. They could all be related, but I think proving that is an unsolvable problem. What did they adopt from each other by living side by side in the same region, and what did they originally have in their own languages? We can only work with different probabilities, make different Swadesh lists, and play with the statistics accordingly. By structure, all these languages are pretty much the same. It is like the Nostratic family hypothesis, which says that basically all languages are related and finds similarities between, for example, Caucasus languages like Georgian and Sino-Tibetan. By the way, what do you think about Altaic (Turkic, Mongolic, Tungusic, Koreanic, Japonic)? Are they a family?

Trans-Himalayan languages are structurally different from Austronesian languages both in phonetics and grammar.

PAn is a verb-initial language, and the order could be VSO or VOS because there are also focus markers such as a, nua, tua, i, sa and so on.

None of that is in Trans-Himalayan languages.

Sometimes we can tell whether a word is a loanword by checking whether it fits the phonetic system of the language. You can even figure out if a word was borrowed within a language family. For example, Pinuyumayan borrowed the word for peach, ɭupas, from Pangcah’s lopas: had it evolved naturally from PAn, the word would have conformed to Pinuyumayan sound changes, which is more likely than an irregular sound change.

But generally you can distinguish these families: Austro-Asiatic, Austronesian, Hmong-Mien, Tai-Kradai, Sino-Tibetan? I guess there are some clear criteria, like Vietnamese is Austro-Asiatic, but most of its vocabulary is from Chinese?

Most linguists don’t use the term Tai–Kadai anymore. There is no r if you want to call it Tai-Kadai. There are two main branches of the Kra-Dai language family, Kra and Dai, hence the name. Dai is the reconstructed proto-form of Tai.

It’s obviously easy to tell most of these languages apart now. However, just as we classify animals, we classify languages based on their lineage. Sagart is simply presenting evidence to suggest that Kra-Dai is a branch of the Austronesian language family. It is similar to how easy it is to tell tigers and lions apart, while still classifying both under the family Felidae.

Most of English’s vocabulary is from Latin and French, but that doesn’t make it a Romance language. Vietnamese, Japanese, and Korean all have large amounts of Sinitic loanwords.