|Japanese - handling loans from (Middle) Chinese|
Posted by: Karl B. on Dec. 12, 2019, 21:54
Hi there,
I would like to suggest a correction for the Japanese vocabulary list. While the most common reading for 二 in Japanese is 'ni', this is actually a loanword. The native version, 'futa', is only found in counter words these days, such as 二人 (futari). Otherwise the list uses either the native word or a variation on it. 死 (shi) is a loan from Middle Chinese that merged with the native Old Japanese word, so take that info as you will.
Thank you for your work!
Posted by: Vincent on Dec. 4, 2019, 19:38
Thank you very much for your comments, which are very helpful. The version of the research you see on the page was last updated in January 2017. I have gone much further since then, using advanced statistical methods as well as machine learning, and I am preparing a paper for which I have already received a review and valuable feedback from Mattis List and Gerhard Jäger, two outstanding researchers in the field. In my "new" system, I am now using 2200 languages. One focus is the identification of long-range relationships. Unfortunately, for Japanese, my system fails to find relationships with other language families at an acceptable level of statistical significance. So any improvement in the choice of source forms, such as excluding loans and using older roots where available, could perhaps help to trigger an interesting result.
The source I use in the final version is ASJP (ASJP Data - table enclosed). This source reports the following words as loans: Hifu (Skin), Taiyo (Sun) and Ippai De (full), but it also has alternative non-loan forms for two of these three words. Perhaps you see more words in the list that are not suited to my purpose and know other roots that may help. If so, please send me your comments.
I have a more up-to-date query form (unofficial, not the final one, so it differs slightly from raw ASJP) with more languages online: Experimental extension
Posted by: Vincent on Dec. 4, 2019, 21:31
Thank you for the reply!
I took a look at ASJP's Japanese wordlist and it's unfortunate and misleading that a lot of words that contain loans are not listed as such. Here are some points of consideration that I have come up with. Just as shorthand, I'll be referring to Middle Chinese as MC and Old Japanese as OJ.
Since you are already aware that ippai de 'full' is a loan, you could also consider adding mitasu 'fill up; satisfy'.
Kono ha 'leaf' is actually a compound of ko 'tree' + -no 'attributive' + ha 'leaf', making ha 'leaf' probably the more relevant entry. This same ha is likely related to ha 'tooth', so if there was a separate native word in OJ for either leaf or tooth, it might be beneficial to look for it. I am not currently aware of any.
The second entry for 'name', seimei, is a MC compound. The first entry is likely the original OJ word for 'name'.
The first entry for 'new', shinsen, is a compound of two MC loans, as well. Japanese has some other native words for 'new' that could supplement it, such as arata 'new, novel; fresh' (though this is closely related to ataraSi), or Kansai dialect sara 'new; unused'.
The second entry for 'stone' has much better alternatives. Consider replacing it with iwa 'rock' and/or iso 'pebble; gravel', both of which were present in OJ and are not compounds that contain MC affixes like koiSi sekizai does.
The Japanese pronouns listed are OJ innovations and not necessarily 'original'. For example, the word used in this list for 'we', wareware, is a reduplicated version of ware, itself composed of wa 'I, me' + -re 'nominalizing suffix'. Moreover, anata 'you' is composed of a- 'distal marker' + -na 'possessive/attributive' + ta 'direction', all together meaning literally 'in that direction/place', which semantically shifted to 'that person'. If there were irreducible pronouns for 'we' and 'you' in OJ, I am not aware of them.
Otherwise I believe your list is satisfactory. To be honest, I am not sure implementing any of this will have any great statistical effect, but of course it is better to be disappointed in accurate results than inaccurate ones.
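For anyone who wants to try these suggestions on their own copy of the list, here is a minimal sketch in Python of how the loan filtering could work. It is not the method used for the actual research. The `Entry` type, the function names and the sample data are all made up for illustration, and the sample contains only forms mentioned in this thread, with loan flags taken from the posts above rather than from ASJP itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One word-list row (hypothetical structure, not the ASJP file format)."""
    concept: str
    form: str
    is_loan: bool


# Illustrative sample only: forms and loan status as described in this thread.
# The real ASJP list has further entries that are not reproduced here.
ENTRIES = [
    Entry("skin", "hifu", True),
    Entry("sun", "taiyo", True),
    Entry("full", "ippai de", True),
    Entry("full", "mitasu", False),
    Entry("new", "shinsen", True),
    Entry("new", "arata", False),
    Entry("new", "sara", False),
    Entry("stone", "iwa", False),
    Entry("stone", "iso", False),
]


def native_forms(entries):
    """Map each concept to its non-loan forms, in input order.

    Concepts whose forms are all loans map to an empty list.
    """
    result = {}
    for entry in entries:
        forms = result.setdefault(entry.concept, [])
        if not entry.is_loan:
            forms.append(entry.form)
    return result


def concepts_without_native(entries):
    """Return the concepts that have no non-loan form in the given entries."""
    return sorted(c for c, forms in native_forms(entries).items() if not forms)


if __name__ == "__main__":
    for concept, forms in native_forms(ENTRIES).items():
        print(concept, forms)
    print("still need a native form:", concepts_without_native(ENTRIES))
```

Concepts that end up with an empty list are the ones where an older native root would still have to be found by hand.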