As to Cantonese tones: you can legitimately claim 6, 7, 8, or 9. Anything other than that you can discount. One difference comes from whether you count high falling as separate from high level or not. The other difference comes from whether you count the two “entering” tones as separate tones. For this project you would need to keep the high falling tone and ignore the entering tones, because the entering tones already have a distinct sound (they will all have /p/, /t/, /k/ endings).
As to the number of splits: 450-1000 splits would be acceptable. We’d still get the number of characters to just a little over 2000. Most of the sound/tone change between dialects is systematic. Cantonese is a very conservative dialect in terms of keeping Middle Chinese pronunciations and preserving characters, while Mandarin is on the radical end of the spectrum. I think the two give a good rough estimate with the overall splits we’re looking at.
Let’s look at all the characters that show up in NJStar for zuo4:
作 has 3 Mandarin pronunciations zuo4, zuo3, zuo1- but all come out to jou6 in Cantonese
做 jou6
坐 cho5 jo6
座 jo6
鑿 has 4 pronunciations in Mandarin zao2 zuo4 zu2 zao4, but all come out as jok6 in Cantonese
柞 zuo4 ze2 zha4 in Mandarin, ja6 jaak3 jok6 in Cantonese-- but differences are systematic. Where it differs in Mandarin it differs in Cantonese in the same way.
祚 jou6
怍 jok6
阼 jou6
胙 jou6
夎 (no data)
酢 zuo4 cu4 in Mandarin, chou3 ja3 jok6 in Cantonese
葄 jou3
Of all these, only the one in bold (酢) would need to have an additional character added to accommodate sound differences between the dialects. The others would be systematic and would be represented by the alternate sounds. In the case of 酢, you would only need to add that character as an extra. There would already be a character for Cantonese jou6, jok6, jou3, etc.
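To make the idea of "splits" concrete, here is a rough sketch that takes the readings listed above for Mandarin zuo4 and groups the characters by Cantonese reading. The names `ZUO4_CANTONESE` and `group_by_reading` are made up for illustration, and the sketch pools every Cantonese reading listed for a character without lining each one up against a particular Mandarin reading, so treat the count as an upper bound rather than a real analysis.

```python
from collections import defaultdict

# Cantonese readings exactly as listed in the post for characters read zuo4
# in Mandarin. 夎 is left out because the post has no data for it.
ZUO4_CANTONESE = {
    "作": ["jou6"],
    "做": ["jou6"],
    "坐": ["cho5", "jo6"],
    "座": ["jo6"],
    "鑿": ["jok6"],
    "柞": ["ja6", "jaak3", "jok6"],
    "祚": ["jou6"],
    "怍": ["jok6"],
    "阼": ["jou6"],
    "胙": ["jou6"],
    "酢": ["chou3", "ja3", "jok6"],
    "葄": ["jou3"],
}


def group_by_reading(readings):
    """Invert {character: [readings]} into {reading: [characters]}."""
    groups = defaultdict(list)
    for char, sounds in readings.items():
        for sound in sounds:
            groups[sound].append(char)
    return dict(groups)


groups = group_by_reading(ZUO4_CANTONESE)
for sound in sorted(groups):
    print(sound, "".join(groups[sound]))
print("distinct Cantonese readings:", len(groups))
```

Each distinct Cantonese reading is a candidate split for this one Mandarin syllable; deciding which of them are systematic still has to be done by hand.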
You’ll note that sounds with /p/, /t/, /k/ endings in Southern dialects will result in the biggest number of differences with Mandarin since Mandarin does not have any equivalent to the “entering” tone category of Middle Chinese which these sounds in Southern dialects preserve.
First, thanks to all above for the many great comments! This is a very interesting and naturally controversial topic.
Have you read DeFrancis on this? (anti-DF flamers, pls control yourselves for just a moment, tks!). He begins with debunking various myths on Chinese, especially the ideographic myth, and focuses in large part but not exclusively on how Chinese characters do incorporate more phonetic information than cocktail lounge discussions generally recognize. He discusses the nature of written languages in light of various criteria, one of which is the phonetic efficiency thereof. He definitely comes down on the side of phonetically efficient scripts, but for very practical reasons, and not for reasons of cultural bigotry.
Among other options for improving the phonetic efficiency of the Chinese script such as romanization, he brings up the option you cite, which is to reduce the number of graphs to one per spoken syllable. However, he is ever the realist, and acknowledges that this would be impossible without reform of the style of written language as well as spoken forms such as some news broadcasts which currently retain formal, literary style. He also discusses options such as the retention of larger character sets for scholarly purposes, while using either the minimalist character set or even better, something like romanization for purposes of computerization and cataloguing.
Thus, your idea is not crazy at all, but one which has been entertained by scholars who have spent a lifetime considering such issues. Of course, there are other relevant criteria, not just efficiency… the semantic differentiation of homophones is more important in Chinese than in most languages; and the cultural history and aesthetic considerations should certainly not be neglected in such a discussion!!! DeFrancis has been quick to say as much himself, and feels that, from an aesthetic perspective, the Chinese script is his favorite in the whole world!!!
This isn’t a comprehensive answer, nor is it my own perspective, so no need to lash out at it… Just sharing info on a relevant perspective with the original poster.
-K
Yeah, I know it isn’t a new idea. The suggestion I’m keen on is creating a minimal list that is cross-dialectal. But thanks for pointing out that I can’t be a total moron.
It’s the aesthetic argument, ironically, that I think is the most powerful. Personally, despite the fact that I’m the guy who is originating the idea, I’d like to keep Chinese as it is and not make any real changes. But the idea of making changes to improve education (and with a side effect of benefiting future 2nd language learners) does have merit. I like the idea I’m suggesting here better than other suggestions for a romanized script.
For problems of homophones, for the immediate future you could keep the original character (or use it in parentheses) if there is serious difficulty with understanding without that character. I don’t propose just chucking the original character script-- ever. It remains to be seen how a minimal character script would function in relation to the entire set of characters if it were ever to take root.
I can imagine it being a starter set from which serious students will grow into the full set. But by learning the minimal set it would enable you to read and write Chinese fully, just not to access the higher learning available for the full script. They wouldn’t have to learn the full script to be literate and function with reading and writing, but it would be an advantage to do so.
Anyway, there’s a lot more about this to be discussed even if we can come up with a practical, functional model.
Thanks for the input, Dragonbones. I have been following the other thread about DF’s book too. There are some really good discussions going on at Forumosa at present and some really good arguments being put forward. Though it may not sound like it sometimes, I am very humbled by the proficiency many forumosans have in Chinese.
While culture is probably one of the most difficult obstacles to overcome in language reform, I feel it needs to be high on the list of considerations.
pwh, I know I am oversimplifying here, but doesn’t bpmf/pinyin already serve that purpose?
If more books and publications used characters in combination with phonetics then even us westies could pick up newspapers, etc. and read them. (This would probably slow native speakers’ progression.) BTW I would really hate to see this happen too. (mixed feelings actually - it would be really helpful!!)
(If they could put the pinyin on the bottom would be nice, so I can cover it up and only peek when I need to.)
I just had a brainstorm (and can’t believe I’m sharing it with you all )
It may have been suggested before.
Use bpmf as the phonetic component along with radical for meaning.
(I can’t believe I just said that :loco: )
of course there is still the cultural problem.
bpmf could be used to reflect regional dialects and they can have their own variety of the character. (problematic I know, but dialects are always going to be problematic if you’re after phonetic encoding)
then there is the whole Taiwan mainland issue.
Back to the drawing board?
PS Woodchild. lol = laugh out loud
Thanks also for your input on these topics.
And they, in turn, are probably humbled by how long it took them to reach it! I mean, if I’d spent this much time on learning Spanish, I could probably teach it to Mexicans!
No, because bopomofo and pinyin are different for each dialect.
Take this sentence for example:
台灣是否屬於中國? (Does Taiwan belong to China?)
Cantonese:
/toi/ \waan\ sih fauh suhk yuh \jung\ gwok?
Mandarin:
tai2 wan1 shi4 fou3 shu3 yu2 zhong1 guo2?
Anybody here able to give the Hokkien pronunciations of the characters?
You would need as many phonemic based scripts as there are dialects in order to give the pronunciations. If you are just thinking of working with one dialect then you could have something like bpmf.
What I’m suggesting would cross dialectal boundaries. Right now each dialect has a pronunciation for every standard character (and some non-standard ones). I’m suggesting reducing the characters by getting rid of characters with redundant sound correlations.
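A quick sketch of why a phonemic script has to be per-dialect: the same character string comes out as a different phonetic string depending on which reading table you use. The tables below hold only the readings given in the example sentence above; `READINGS` and `transcribe` are illustrative names, not part of any real tool.

```python
# Per-dialect readings, copied from the example sentence in the post.
# The Cantonese entries keep the post's slash notation for tone contours.
READINGS = {
    "Mandarin": {
        "台": "tai2", "灣": "wan1", "是": "shi4", "否": "fou3",
        "屬": "shu3", "於": "yu2", "中": "zhong1", "國": "guo2",
    },
    "Cantonese": {
        "台": "/toi/", "灣": "\\waan\\", "是": "sih", "否": "fauh",
        "屬": "suhk", "於": "yuh", "中": "\\jung\\", "國": "gwok",
    },
}


def transcribe(text, dialect):
    """Spell out each character using one dialect's reading table.

    Characters missing from the table are passed through unchanged.
    """
    table = READINGS[dialect]
    return " ".join(table.get(char, char) for char in text)


SENTENCE = "台灣是否屬於中國"
for dialect in READINGS:
    print(f"{dialect}: {transcribe(SENTENCE, dialect)}")
```

The character string is shared; the phonetic strings are not. A reduced character set would keep the first property, which bpmf or pinyin cannot.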
And they, in turn, are probably humbled by how long it took them to reach it! I mean, if I’d spent this much time on learning Spanish, I could probably teach it to Mexicans! [/quote]
[quote=“puiwaihin”]Take this sentence for example:
台灣是否屬於中國? (Does Taiwan belong to China?)
Cantonese:
/toi/ \waan\ sih fauh suhk yuh \jung\ gwok?
Mandarin:
tai2 wan1 shi4 fou3 shu3 yu2 zhong1 guo2?
Anybody here able to give the Hokkien pronunciations of the characters?
[/quote]
dai wan si beo sio? hyi dia~ go?
[quote=“puiwaihin”]
What I’m suggesting would cross dialectal boundaries. Right now each dialect has a pronunciation for every standard character (and some non-standard ones). I’m suggesting reducing the characters by getting rid of characters with redundant sound correlations.[/quote]
What you are proposing is a collapsed character set based on reconstructed Middle Chinese (possibly even before).
Not exactly. It will be based on the modern dialects, and we’ll be accounting for differences in the way the characters diverged from Middle Chinese. We won’t actually have to use any reconstructions, but I’d wager information from such reconstructions will have interesting parallels to the character set I’m suggesting.
I think the first step we’d need to take would be to get a list of all sound and tone combinations in use in a representative language from each of the major dialect groups: Mandarin (putonghua), Yue (Cantonese), Wu (NOT Shanghainese), Hakka, Min (Min-nan), Gan, and Xiang. From this set we could derive a minimal list needed to cover the largest number of sound-tone combinations.
The next step would be to see how many extra characters would be needed with all the tones of just one sound (or perhaps 2 or 3 that may be interconnected). If the number of characters that needed to be added to the minimal set was manageable we could move forward.
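As a very rough sketch of those two steps, you could give each character a "signature" made of its reading in every dialect and then collapse characters whose signatures match. The example below uses only two dialects and only the single-reading characters from the zuo4 list earlier in the thread; `CHAR_READINGS`, `DIALECTS`, and `collapse` are hypothetical names, and a real attempt would need full reading data for all seven groups plus some way of handling characters with several readings.

```python
from collections import defaultdict

# Single-reading characters from the zuo4 list earlier in the thread.
CHAR_READINGS = {
    "做": {"Mandarin": "zuo4", "Cantonese": "jou6"},
    "祚": {"Mandarin": "zuo4", "Cantonese": "jou6"},
    "阼": {"Mandarin": "zuo4", "Cantonese": "jou6"},
    "胙": {"Mandarin": "zuo4", "Cantonese": "jou6"},
    "座": {"Mandarin": "zuo4", "Cantonese": "jo6"},
    "怍": {"Mandarin": "zuo4", "Cantonese": "jok6"},
    "葄": {"Mandarin": "zuo4", "Cantonese": "jou3"},
}
DIALECTS = ("Mandarin", "Cantonese")


def collapse(char_readings, dialects):
    """Group characters that share the same reading in every listed dialect."""
    classes = defaultdict(list)
    for char, readings in char_readings.items():
        signature = tuple(readings[d] for d in dialects)
        classes[signature].append(char)
    return dict(classes)


classes = collapse(CHAR_READINGS, DIALECTS)
for signature, chars in classes.items():
    print(signature, "".join(chars))
print(f"{len(CHAR_READINGS)} characters -> {len(classes)} needed")
```

Every dialect you add to `DIALECTS` can only split classes further, never merge them, so the size of the minimal set grows with the number of dialects covered. Whether it stays manageable is exactly the open question.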
Now, before I get too ambitious and start recruiting people to help on the project, why won’t this work? Honestly. I’d prefer to be told why it’s stupid before I commit a few thousand hours to a project like this.
It would work, but only if you convince the language reform people.
The same procedure (phonetic collapsing) was applied to some characters in the PRC simplification, at a cost of semantic conflation, causing great consternation to our dear Traditional Chinese compatriots. Some examples:
出发:出發
头发:頭髮
面孔:面孔 面条:麵條
皇后:皇后 后方:後方
漏斗:漏斗 斗争:鬥爭
一只:一隻 只有:祇有
系统:系統
关系:關係
Some of these were already in common use anyway; and some have shown up in Taiwan, too. In other words, people, out of laziness, will do this naturally. This is an ongoing process. Long ago, there was a distinction between the following two:
天才:天才
刚才:剛纔
But nobody even remembers such things.
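If you want to see the conflation mechanically, the word pairs above are enough to recover which simplified characters stand in for more than one traditional character. A small sketch, assuming each pair lines up character by character (true for every example listed here); `WORD_PAIRS` and `merged_characters` are names made up for the illustration.

```python
from collections import defaultdict

# (simplified, traditional) word pairs taken from the examples above.
WORD_PAIRS = [
    ("出发", "出發"), ("头发", "頭髮"),
    ("面孔", "面孔"), ("面条", "麵條"),
    ("皇后", "皇后"), ("后方", "後方"),
    ("漏斗", "漏斗"), ("斗争", "鬥爭"),
    ("一只", "一隻"), ("只有", "祇有"),
    ("系统", "系統"), ("关系", "關係"),
    ("天才", "天才"), ("刚才", "剛纔"),
]


def merged_characters(word_pairs):
    """Return simplified characters that map to more than one traditional form."""
    forms = defaultdict(set)
    for simplified, traditional in word_pairs:
        # Assumes the two spellings align one character to one character.
        for s_char, t_char in zip(simplified, traditional):
            forms[s_char].add(t_char)
    return {s: t for s, t in forms.items() if len(t) > 1}


merged = merged_characters(WORD_PAIRS)
for s_char, t_chars in merged.items():
    print(s_char, "<-", " ".join(sorted(t_chars)))
```

Going from simplified back to traditional needs the whole word as context for exactly these characters, which is the cost of the collapsing.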
One more note. This sort of collapsing was made possible by the disyllabic and multisyllabic nature of Chinese word construction since about 2000 years ago. In some sense, collapsing could be “afforded.” An effect of further carrying this out in the future Chinese language is the full multisyllabilization (lengthening) of words, as has happened to languages that alphabetize, such as Vietnamese.
I’m not really for changing the system as much as I’m against switching to a romanized system or going to bpmf.
Hmm. Guess what I could do is develop this option for a phonetic script and never promote it, except as an intellectual pursuit or a learning aid, until someone gets serious about abolishing characters in favor of pinyin or its like. Then I’d promote this idea as being preferable. Considering that the KMT and the PRC both seriously considered switching to pinyin, it may happen.
I’ve only skimmed most of the posts which seem to be discussing the desirability of reforming Chinese. I think both English and Chinese could be completely reformed, but it’ll never happen.
Back to the idea of the OP. I think it’s fun to discuss as a theoretical possibility.
You want to make one character for every syllable, leaving about 2000 characters. Good idea, but I think it better to have one for every syllable irrespective of tone. That’d give you about 400 characters. Easy.
So you have a basic 400 syllable phonetic (or maybe phonemic, I forget the difference) set. You could use these 400 to write anything in Mandarin. Tone could be indicated somehow with a tone mark. Then you could refine them by sticking a meaning radical next to them. So you’d have two systems of writing. A simple phonetic one, and a more complex (but significantly less difficult than current writing) one with as much complexity as current writing. Students could write purely phonetically. Anything written in the full system could still be read by anyone who knew the basic 400 characters.
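Roughly, each written unit in that scheme would be a base syllable plus a tone mark, with an optional meaning radical on top. A toy sketch of the data structure, just to show that the full form and the purely phonetic form read the same; the class name `Graph`, its fields, and the example (zuo4 wei4 with the radicals 广 and 人, as in 座位) are illustrative choices, not something worked out in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Graph:
    """One written unit: base syllable, tone mark, optional meaning radical."""
    syllable: str                   # one of the roughly 400 toneless syllables
    tone: int                       # tone number standing in for a tone mark
    radical: Optional[str] = None   # only present in the full system

    def phonetic(self):
        """The part a reader of the basic set can always read."""
        return f"{self.syllable}{self.tone}"

    def full(self):
        """Radical (if any) followed by the phonetic part."""
        return f"{self.radical or ''}{self.phonetic()}"


def read_aloud(graphs):
    """Reading ignores radicals, so both systems give the same result."""
    return " ".join(g.phonetic() for g in graphs)


simple = [Graph("zuo", 4), Graph("wei", 4)]
full = [Graph("zuo", 4, "广"), Graph("wei", 4, "人")]
print(read_aloud(simple))
print(" ".join(g.full() for g in full))
```

The radical only narrows the meaning; dropping it never changes the reading, which is why text in the full system stays legible to someone who knows just the basic set.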
As for your idea about making it compatible with all major Chinese dialects (languages), I don’t think this would work. So many syllables of Cantonese, Taiwanese etc. just don’t have Mandarin equivalents. The result would mean something written by a Mandarin speaker would be unintelligible to a Taiwanese or Cantonese speaker, so why bother?
Well, the Mandarin speaker would have to write differently for the different sounds, just like ‘night’ and ‘knight’ are written differently but sound the same in modern English. In some varieties of English (say, older English), these two sounded different.
I still believe you basically just take the Middle Chinese 廣韻 rhyme table and extract the rhyme index set and there you go. Add some register marks for the likes of Cantonese that split tones. For dialects that only merged tones like Mandarin, you don’t have to worry too much.
PWH, although it may work in common spoken language, I believe there are many areas where it would not be feasible. Technical language, specialised terminology, botany, Chinese medicine, etc.
One of the advantages of Chinese written language would disappear, the possibility to write in a very compact, abbreviated way. Try to apply your idea to the headlines of a daily newspaper.
Also, what would happen to registered company names? There would be thousands of double, triple, quadruple entries in the register.