Is cc-cedict / mdbg dead?

Over the past few months I have been collecting a list of words I have come across that aren’t in MDBG/CC-CEDICT, and submitting them to the web site.

I am a little confused about why not a single submission has gotten into the dictionary. It can’t be just a quality issue as there is so much crap in there already.

Are there any alternative dictionaries/sites that are a little more up to date?

Hi,
first, I think MDBG is not “dead”; they just use the CEDICT file. Next, I don’t know who is in charge of CEDICT. Who do you submit additions to? I have long-standing contact with the people looking after WWWJDIC, so I contribute there…

MDBG claims to “be” CC-CEDICT, i.e. when you click “Add word” in MDBG it takes you to a site called “cc-cedict”, and so that’s where I am adding them.

I have never heard of this WWWJDIC. I will check it out.

Speaking as an active editor of CC-CEDICT, I can tell you that rumours of its death are greatly exaggerated! :wink:

Changes made to the dictionary are published on a continuous basis at
cc-cedict.org/editor/editor.php? … istChanges

Over the past month alone, approximately 600 changes have been made. That includes the addition of new entries as well as corrections of existing entries.

Thanks for collecting a list of words not already in CC-CEDICT and submitting them. To tell you what has happened with your submissions, I would need to know your login name, or, if you submitted anonymously, the words you submitted. I will try to email you via Forumosa (which I have just joined, in order to reply to you here).

Regarding the quality of CC-CEDICT entries, there are still many entries in the dictionary which got there before the current editorial process was set up (a couple of years ago). Bad entries are progressively being updated and corrected.

Please understand that writing and editing good quality entries is very time-consuming, as it often involves research into examples of actual usage of a word, in addition to referencing multiple external dictionaries, as well as discussion amongst editors. As there is only a small group of editors, it can take some time for a submission to be processed. It may sit in a queue for a while. In fact, it’s a circular queue, and a submission may come up over and over, accruing editors’ comments until it is finally processed.

Many of the contributors (submitters) are based in Taiwan, and some of the editors of CC-CEDICT also have links to Taiwan. And, of course, CC-CEDICT is a dictionary which provides both traditional and simplified forms of words, and indicates where the pronunciation is different in Taiwan.

MDBG is not really the same as CC-CEDICT, although the two projects are closely linked. The owner of MDBG took on the task of setting up an editors’ website for CC-CEDICT at a time when CC-CEDICT’s predecessor, CEDICT, was floundering. He wants to see CC-CEDICT flourish, as it provides the dictionary for his website, but I believe he would be just as happy for some other well-qualified person to manage the CC-CEDICT community.

WWWJDIC is the interface for the EDICT Japanese–English dictionary project. EDICT was the inspiration for CEDICT when CEDICT started back in around 1998.

I was submitting them all anonymously for the past few months, but noticed nothing changed, so I submitted the three most recent ones as a signed-in user and took note of them to check up later (:

The three most recent items I submitted were:
航空 (Amend: to include definition Airline)
祢 (Add: new definition with new pronunciation to include definition as pronoun for God)
小年人 (Add: Term to describe young person)

That change log page looks really interesting; I was looking for something like that but couldn’t find it. Using it, I can now see you rejected my first submission and the other two aren’t there. Well, that’s handy to know (:

NB, regarding the word 航空, I disagree with your conclusion, because my submission was based on coming across a Taiwanese news article in the paper that used the word 航空 in a sentence without appending 公司. I had to look it up in MDBG, and seeing 航空 translated as aviation did not make sense in the context of the sentence. That is what led me to do the Google search and discover they meant “airlines”, not “aviation”. See also the following, where they don’t append 公司:

華信航空 mandarin-airlines.com/index.html
泰國航空 thaiairways.com.tw/

If you were to come across the above in reading a Chinese news article, it would be incorrect to read them as “Thailand aviation” and “Mandarin aviation”, would it not?

Thanks for your submissions, pqkdzrwt.

To see how your submissions have been processed, you can go to the webpage I quoted and enter your ID (fe****) in the User ID field.

So far, just one of the three words has been processed: 航空. The other two are still on the queue, invisible except to editors. One of them has a comment attached from one editor, and is awaiting further input from other editors. Both of those submissions have been on the queue for only a few days, which is not at all unusual.

“Airline” was not accepted as a meaning of 航空, because we take the view that, although 中華航空公司 is “China Airlines”, it’s really 航空公司 (literally “aviation company”) that means “airline” rather than 航空. In our view, 航空, as a word, essentially just means “aviation.”

In context, 航空 may be translated as “Airline” – or, very commonly, “Air”, as in the case of 大韓航空 (Korean Air), 印度航空 (Air India) etc. There is also “Airways” as in All Nippon Airways, Qantas Airways etc.

But CC-CEDICT sees its role as providing the underlying meaning of words, rather than a list of possible translation phrases. We leave that latter task to the likes of Google Translate, which incidentally, has both “Air” and “Airlines” amongst its list of translations for 航空. They could add “Airways”, too. But CEDICT is sticking to the meaning of the word – “aviation.”

[quote=“pqkdzrwt”]
NB, regarding the word 航空, I disagree with your conclusion, because my submission was based on coming across a Taiwanese news article in the paper that used the word 航空 in a sentence without appending 公司. I had to look it up in MDBG, and seeing 航空 translated as aviation did not make sense in the context of the sentence. That is what led me to do the Google search and discover they meant “airlines”, not “aviation”. See also the following, where they don’t append 公司:

華信航空 mandarin-airlines.com/index.html
泰國航空 thaiairways.com.tw/

If you were to come across the above in reading a Chinese news article, it would be incorrect to read them as “Thailand aviation” and “Mandarin aviation”, would it not?[/quote]

華信航空 and 泰國航空 are proper nouns: they are the names of entities. If you want to know the correct name of an entity, it would be appropriate to look it up in a source like Wikipedia. But don’t expect a dictionary to necessarily provide the correct term to use in translating a proper noun word-by-word.

泰國航空 is best regarded as a single name, rather than two words that can be translated independently. Even if we did have “aviation/Air/Airways/Airlines” as the definition of 航空, that wouldn’t tell you the proper way of translating 泰國航空 – which of the four is it in this case: Air? Airways? Airlines? Aviation? As mentioned before, you need to look up the whole term – 泰國航空 – to find the answer, not rely on the meaning of the component words. As a matter of fact, CEDICT does contain many proper nouns, including

大韓航空 大韩航空 [Da4 Han2 Hang2 kong1] /Korean Air, South Korea’s main airline/
荷蘭皇家航空 荷兰皇家航空 [He2 lan2 Huang2 jia1 Hang2 kong1] /KLM Royal Dutch Airlines/
國航 国航 [Guo2 Hang2] /Air China/abbr. for 中國國際航空公司|中国国际航空公司[Zhong1 guo2 Guo2 ji4 Hang2 kong1 Gong1 si1]/
捷達航空貨運 捷达航空货运 [Jie2 da2 Hang2 kong1 Huo4 yun4] /Jett8 Airlines Cargo (based in Singapore)/
東方航空 东方航空 [Dong1 fang1 Hang2 kong1] /China Eastern Airlines/
etc.
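For anyone who wants to run this kind of lookup on the downloaded dictionary file rather than on the website, here is a minimal Python sketch. It assumes every entry follows the line layout shown above (traditional form, simplified form, pinyin in square brackets, then slash-delimited senses) and that lines starting with `#` are comments. The names `Entry`, `parse_line` and `find_containing` are made up for illustration and are not part of any official CC-CEDICT tool.

```python
import re
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    traditional: str
    simplified: str
    pinyin: str
    senses: List[str]


# Assumed layout: TRAD SIMP [pin1 yin1] /sense 1/sense 2/
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_line(line: str) -> Optional[Entry]:
    """Parse one dictionary line; return None for blanks, comments, or mismatches."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = LINE_RE.match(line)
    if match is None:
        return None
    trad, simp, pinyin, body = match.groups()
    return Entry(trad, simp, pinyin, body.split("/"))


def find_containing(entries: List[Entry], word: str) -> List[Entry]:
    """Return entries whose headword (either script) contains `word`."""
    return [e for e in entries if word in e.traditional or word in e.simplified]


# Three of the entries quoted above, used as sample data.
SAMPLE = """\
# sample lines copied from the entries quoted in this thread
東方航空 东方航空 [Dong1 fang1 Hang2 kong1] /China Eastern Airlines/
國航 国航 [Guo2 Hang2] /Air China/abbr. for 中國國際航空公司|中国国际航空公司[Zhong1 guo2 Guo2 ji4 Hang2 kong1 Gong1 si1]/
大韓航空 大韩航空 [Da4 Han2 Hang2 kong1] /Korean Air/
"""

entries = [e for e in (parse_line(l) for l in SAMPLE.splitlines()) if e is not None]
hits = find_containing(entries, "航空")

for e in hits:
    print(e.traditional, e.pinyin, "; ".join(e.senses))
```

Note that the search only looks at headwords, so 國航 is not returned for 航空 even though its definition mentions 航空公司; searching inside the senses as well would be a small extension.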

If a user looks up 航空 in CC-CEDICT, these are some of the entries he will see, and this will give a good idea of how 航空 is translated in context. But I feel that dictionary users should know that, however its name is translated, the underlying meaning of the name of a company like 東方航空 is “eastern aviation”.

Consider the word 人心. CEDICT defines it as “popular feeling/the will of the people.” It’s fine to use “popular” in the translation of 人心 (popular feeling), but it would probably be misleading to add “popular” to the definition of 人 or “will” to the definition of 心.

Similarly, 泰國航空 is Thai Airways, but 航空 doesn’t really mean “airlines”.

I understand the argument, and it makes sense, I was not aware edict was designed to work like this as it seems there are many entries in the dictionary that provide six or seven translations of a word rather than the underlying meaning. Does this mean there is a view to clean up existing words that have this problem?

I am more interested to see what happens with the 祢 submission. I was very surprised it was not already in the dictionary. It is tricky because it appears to be a simplified character, yet in all the (traditional) online and printed material I have seen, only 祢 is ever used. So I am not sure how you would handle that in CEDICT.

Some extra references in case it’s useful for the discussion.

Top YouTube hits for the word (videos with traditional-character titles and/or descriptions):
[ul]喜樂之泉19: Unconditional Love 祢愛永不變
有祢愛我
認識祢真好
留住祢
從前風聞有祢
[/ul]
Song lyrics search (Google: 祢 site:mojim.com)
[ul]
祢歌詞
有祢已經足夠歌詞Eternity Girls
我全心倚靠祢歌詞約書亞樂團[/ul]

This character has an essential meaning of “you”, used when addressing gods or deceased ancestors (in some cultures that would be the same); it is also used to denote a memorial tablet (you said you could not find a traditional form for it - try 禰 :wink: ). And then there are, of course, connotations, some of which appear in compounds like the ones you mentioned - connotations may be, but need not be, close to the essential meaning, though… that’s why we need human translators to make sense of human languages (machines don’t “get it”).

Minor miscommunication: I do not mean to suggest that 祢 does not have a traditional variant. What I am trying to say is that when it is used to mean “you” in Taiwan, the traditional variant is never used. Hence in Taiwan 禰 might refer to a tablet, but it does not carry the meaning “you”. I am not sure how this distinction would be indicated in CEDICT.

I see now what you mean. Yes, simplified versions of (many) characters have existed for much longer than the PRC :wink: , and there are cases where the usage of a character and the usage of its simplified version diverged over time.
In Japanese we have a parallel development, in that certain often-used characters have been replaced by different (and simpler) characters that indicate only sounds. For example: 御意見 -> ご意見 (in this context 御 is pronounced ご) and 御手伝い -> お手伝い (in this context 御 is pronounced お). While some people may say the meaning of 御意見 and ご意見 is the same, others point out that although the denotative meanings are the same, certain connotations of the two expressions differ: depending on the context and the expressions under consideration, simplified variants may be taken as less formal, less reserved (friendlier), or more direct (up to aggressive) than their non-simplified counterparts.
To see an example:
go to csse.monash.edu.au/~jwb/cgi- … dic.cgi?1C and put in the keyword 御前
It is easy to represent such distinctions in any dictionary, though… :slight_smile:

[quote=“pqkdzrwt”]I understand the argument, and it makes sense, I was not aware edict was designed to work like this as it seems there are many entries in the dictionary that provide six or seven translations of a word rather than the underlying meaning. Does this mean there is a view to clean up existing words that have this problem?

I am more interested to see what happens with the 祢 submission. I was very surprised it was not already in the dictionary. It is tricky because it appears to be a simplified character, yet in all the (traditional) online and printed material I have seen, only 祢 is ever used. So I am not sure how you would handle that in CEDICT.

Some extra references in case it’s useful for the discussion.

[/quote]

I will put a link to this page alongside your submission. It’s helpful to have extra information such as examples of usage, and these can be pasted into the comments box at the time of submission.

Without seeing a specific example, I can’t comment on entries with 6 or 7 translations. It may be that it’s redundant, or it may be that there are simply that many different senses, or it may be that the word has essentially one meaning in Chinese, but is hard to express idiomatically in English in just one phrase, as in our definition for 修整:
cc-cedict.org/editor/editor.php? … atchtype=2

One thing I have learned is that the correct/normal character to use in the fanti (traditional-character) context is not necessarily the most complicated variant. So 祢 may end up being processed similarly to how you submitted it. I’m not sure at this point.

[quote=“pqkdzrwt”]I understand the argument, and it makes sense, I was not aware edict was designed to work like this as it seems there are many entries in the dictionary that provide six or seven translations of a word rather than the underlying meaning. Does this mean there is a view to clean up existing words that have this problem?
[/quote]

Any entries that have a problem should be fixed, of course. But the simple fact that some entries have six or seven parts in the definition does not necessarily indicate that there is a problem. There may be good reasons to have 6 or 7.

Regarding “I was not aware edict was designed to work like this”: in fact, most dictionaries work like this. It’s not a special feature of CEDICT. It’s Google Translate that takes the unconventional approach of attempting to give “translations” rather than “meanings”. Here is what some other dictionaries say for 航空. None of them says it means “organization providing a regular passenger air service (airline)”.

Collins ~ fly
New Century ~ aviation
Contemporary Chinese Dictionary ~ 1 aviation; air navigation 2 related to planes or air transport
New Age ~ aviation
Oxford ~ aviation
WorldLingo ~ aviation; voyage
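Nciku Contemporary Standard Chinese Dictionary ~ 1. 動 飛機等在大氣層中飛行。2. 形 有關航空的 (1. verb: of aircraft etc., to fly in the atmosphere. 2. adjective: relating to aviation)
MoE ~ 以飛機、飛船等飛行器載運人或物在空中飛行。 (to fly through the air carrying people or goods in aircraft such as airplanes or airships)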