Please help check Pinyin input map

There is a very cool free input platform called the Smart Common Input Method (SCIM) being developed for the free software community that will put commercial input methods to shame. But the developers need help doing the pinyin mapping for traditional Chinese. The Debian package is called scim. I will create a testing package for people to play with once we’ve checked it.

Please help me check the following map of pinyin initials and finals. The code here maps Hanyu Pinyin initials and finals to the keys of a standard Zhuyin keyboard. So, for example, pressing ‘b’ is equivalent to pressing 1 on a standard traditional Chinese keyboard.

Please be very careful. This code will go out to hundreds of thousands of users, so we need to be as accurate as possible. Please post your corrections to the SCIM-MAP on ForumsaWiki.

If you checked the chart and found no errors, please post a reply to this thread saying no errors.

static PinYingZuinMap* InitialsMap()
{
    HANYU_INITIALS = 26;
    static PinYingZuinMap map[ 26 ] = {
        {"b" , "1"}, {"p" , "q"}, {"m" , "a"}, {"f" ,"z"},
        {"d" , "2"}, {"t" , "w"}, {"n" , "s"}, {"l" ,"x"},
        {"g" , "e"}, {"k" , "d"}, {"h" , "c"},
        {"j" , "r"}, {"q" , "f"}, {"x" , "v"},
        {"zhi", "5"}, {"zh", "5"}, {"chi", "t"}, {"ch", "t"},
        {"shi", "g"}, {"sh", "g"}, {"ri", "b"}, {"r" ,"b"},
        {"z" , "y"}, {"c" , "h"}, {"s" , "n"}
    };
    return map;
}

static PinYingZuinMap* FinalsMap()
{
    HANYU_FINALS = 72;
    static PinYingZuinMap map[ 72 ] = {
        {"uang","j;"}, {"wang","j;"},
        {"wong","j/"}, {"weng","j/"},
        {"ying","u/"},
        {"iong","m/"}, {"yong","m/"}, {"iung","m/"}, {"yung","m/"},
        {"iang","u;"}, {"yang","u;"},
        {"iuan","m0"}, {"yuan","m0"},
        {"ing","u/"},
        {"iao","ul"}, {"yao","ul"},
        {"iun","mp"}, {"yun","mp"},
        {"iou","u."}, {"you","u."},
        {"ian","u0"}, {"yan","u0"},
        {"yin","up"},
        {"ang",";"},
        {"eng","/"},
        {"iue","m,"}, {"yue","m,"},
        {"uai","j9"}, {"wai","j9"},
        {"uei","jo"}, {"wei","jo"},
        {"uan","j0"}, {"wan","j0"},
        {"uen","jp"}, {"wen","jp"},
        {"ong","j/"},
        {"van","m0"},
        {"ven","mp"},
        {"er","-"},
        {"ai","9"},
        {"ei","o"},
        {"ao","l"},
        {"ou","."},
        {"an","0"},
        {"en","p"},
        {"yi","u"},
        {"ia","u8"}, {"ya","u8"},
        {"ie","u,"}, {"ye","u,"},
        {"in","up"},
        {"io","u."},
        {"wu","j"},
        {"ua","j8"}, {"wa","j8"},
        {"uo","ji"}, {"wo","ji"},
        {"ui","jo"},
        {"iu","m"}, {"yu","m"},
        {"ue","m,"}, {"ve","m,"},
        {"un","mp"}, {"vn","mp"},
        {"a","8"},
        {"e","k"},
        {"i","u"},
        {"o","i"},
        {"v","m"},
        {"u","j"},
        {"E",","}
    };
    return map;
}
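
For anyone checking the tables by hand, here is a minimal sketch of how such tables might be applied to a toneless syllable: try each initial as a prefix, then look up the remainder as a final. The struct layout, the names `Lookup` and `ToZhuyinKeys`, and the longest-first ordering of the initials are assumptions made for illustration, not SCIM's actual code. Only a handful of entries from the tables above are included.

```cpp
#include <cstddef>
#include <string>

// Assumed layout of a table entry; the original struct definition is not shown.
struct PinYingZuinMap {
    const char* pinyin;  // Hanyu Pinyin initial or final
    const char* keys;    // keystrokes on a standard Zhuyin keyboard
};

// Small excerpts of the posted tables, longer initials listed first.
static const PinYingZuinMap kInitials[] = {
    {"zhi", "5"}, {"zh", "5"}, {"b", "1"}, {"g", "e"}, {"j", "r"}, {"t", "w"},
};
static const PinYingZuinMap kFinals[] = {
    {"uang", "j;"}, {"wang", "j;"}, {"ai", "9"}, {"iu", "m"}, {"a", "8"},
};

// Exact lookup; returns nullptr when the spelling is not in the table.
static const char* Lookup(const PinYingZuinMap* table, std::size_t n,
                          const std::string& s) {
    for (std::size_t i = 0; i < n; ++i)
        if (s == table[i].pinyin) return table[i].keys;
    return nullptr;
}

// Converts one toneless syllable to keystrokes, or returns "" if it
// cannot be split with the excerpts above.
std::string ToZhuyinKeys(const std::string& syllable) {
    const std::size_t ni = sizeof(kInitials) / sizeof(kInitials[0]);
    const std::size_t nf = sizeof(kFinals) / sizeof(kFinals[0]);
    // Try each initial as a prefix of the syllable.
    for (std::size_t i = 0; i < ni; ++i) {
        const std::string initial = kInitials[i].pinyin;
        if (syllable.compare(0, initial.size(), initial) != 0) continue;
        const std::string rest = syllable.substr(initial.size());
        if (rest.empty()) return kInitials[i].keys;  // e.g. "zhi"
        if (const char* f = Lookup(kFinals, nf, rest))
            return std::string(kInitials[i].keys) + f;
    }
    // No initial matched: the whole syllable may be a final, e.g. "wang".
    const char* f = Lookup(kFinals, nf, syllable);
    return f ? f : "";
}
```

With these excerpts, `ToZhuyinKeys("ba")` gives `18`, `ToZhuyinKeys("guang")` gives `ej;`, and `ToZhuyinKeys("wang")` gives `j;`.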

My eyes, aaah…

I don’t know how it is going to work in the end, but I think you are having a problem with
{"j" , "r"},
{"iu","m"}, {"yu","m"},

“jiu” becomes “ju” if I follow map.
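
To make the collision concrete, here is a small sketch. `PostedFinalKeys` reproduces only the relevant entries of the posted finals table, and `ContextFinalKeys` is a hypothetical alternative that looks at the initial first. It relies on two Hanyu Pinyin spelling conventions: "iu" after an initial abbreviates "iou", and "u" after j, q or x stands for ü. Both function names are made up for this example, and this is only one possible way to resolve the clash.

```cpp
#include <string>

// Keys for a final exactly as the posted table has them
// (only the entries needed for this example).
std::string PostedFinalKeys(const std::string& final_) {
    if (final_ == "iu" || final_ == "yu" || final_ == "v") return "m";
    if (final_ == "iou" || final_ == "you" || final_ == "io") return "u.";
    if (final_ == "u") return "j";
    return "";
}

// Hypothetical context-aware variant: expands "iu" to the keys for "iou"
// and treats "u" after j, q or x as the keys for "v" (the u-umlaut sound).
std::string ContextFinalKeys(const std::string& initial,
                             const std::string& final_) {
    if (final_ == "iu") return "u.";
    const bool palatal = initial == "j" || initial == "q" || initial == "x";
    if (final_ == "u" && palatal) return "m";
    return PostedFinalKeys(final_);
}
```

With the key for "j" being `r`, the posted table turns "j" + "iu" into `rm`, the same keys as "j" + "yu". The context-aware variant gives `ru.` for "jiu" and keeps `rm` for "ju".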

Some of the Hanyu Pinyin doesn’t seem to be standard Hanyu Pinyin:

iung
iuan
iun
iou
iue
uen
van
ven
io

I am sorry I don’t have more time right now, but this needs a revision especially the handling of

You might want to check your sounds against this table.
geocities.com/hao520/researc … n-xref.htm

I’ve used it before for various Chinese Phonetic purposes, and it’s proved quite useful. Good luck with your project.

No errors detected.

My pinyin’s not detailed enough to know if it’s all standard pinyin, but it all maps correctly.

Wouldn’t you also need to add:

{"1" , "space bar"}, {"2" , "6"}, {"3" , "3"}, {"4" ,"4"}

or something.
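
A minimal sketch of that tone mapping, in case it helps the discussion. The function names `ToneKey` and `StripTone` are hypothetical. The digit 5 for the neutral tone (key 7 on the standard Zhuyin layout) is an addition that is not in the suggestion above.

```cpp
#include <string>

// Tone digit -> Zhuyin keyboard key; returns 0 for anything else.
char ToneKey(char digit) {
    switch (digit) {
        case '1': return ' ';  // first tone: space bar
        case '2': return '6';
        case '3': return '3';
        case '4': return '4';
        case '5': return '7';  // neutral tone: an assumption, see above
        default:  return 0;
    }
}

// Splits "tai2" into "tai" and '2'. The digit is set to 0 when the
// input does not end in a tone digit.
std::string StripTone(const std::string& input, char* digit) {
    *digit = 0;
    if (!input.empty() && ToneKey(input[input.size() - 1]) != 0) {
        *digit = input[input.size() - 1];
        return input.substr(0, input.size() - 1);
    }
    return input;
}
```

Typing "tai2" would then split into the syllable "tai" plus the tone key `6`, while "wan" with no digit is passed through unchanged.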

Brian

In the simplified package there is an option to choose traditional characters, thus getting pinyin+traditional. Maybe you can doublecheck against that as well?

YingFan

[quote=“YingFan”]In the simplified package there is an option to choose traditional characters, thus getting pinyin+traditional. Maybe you can doublecheck against that as well?
YingFan[/quote]

You are talking about Smart Pinyin in the simplified package, right? How do you use it? For example, how would I type Taiwan?
I type tai2 fine, but when I type wan1 I don’t get the right character. I have the traditional character option enabled.

Thanks

I’m not at my computer now, will check tonight. One thing to check though is that you are not getting the 2nd choice and then 1st choice (as opposed to 2nd and 1st tone resp.)

Try shift before the number to specify tones as I believe the matching otherwise is toneless. Maybe it’s configurable though.

YingFan

[quote=“rice_t”]My eyes, aaah…

I don’t know how it is going to work in the end, but I think you are having a problem with
{"j" , "r"},
{"iu","m"}, {"yu","m"},

“jiu” becomes “ju” if I follow map.

[/quote]

Rice is right. It’s because of “io”. If you could write “wine” as “jio” it would be OK.

There’s no provision for “gui” expensive, and I think “E” is redundant.

Some subset of the Wade Giles system seems to be represented along with Pinyin (I think that’s what it is: it certainly isn’t Tongyong or Yale)

Isn’t this a slightly strange way of doing it though? What’s what happens to be etched on the keys of some users got to do with Pinyin input? You imply there’s a simplified character version as well: does that use this peculiar bpmf mapping thing as well???

I tried SCIM a while back, and though it was good in many ways, I didn’t like the Hanyu Pinyin input (at least when typing traditional characters - it worked OK for simplified). The problem was that it lacked the ability to accept tones. For example, to type “Taipei”, I would have liked to type:

tai2 (choose a character)
bei3 (choose a character)

Instead, I had to do this without tones, leaving me with a much larger number of characters to sift through.
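
A rough sketch of why the tone digit helps: if each dictionary entry carries its tone, a typed tone can filter the candidate list before it is shown. The `Reading` struct and the `Filter` function are hypothetical names for this illustration and say nothing about how SCIM stores its data.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One dictionary entry (assumed layout).
struct Reading {
    std::string character;  // UTF-8 text of the candidate
    std::string syllable;   // toneless pinyin, e.g. "tai"
    int tone;               // 1 to 4, or 5 for the neutral tone
};

// Returns the candidates for a syllable; tone 0 means "any tone".
std::vector<std::string> Filter(const std::vector<Reading>& dict,
                                const std::string& syllable, int tone) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < dict.size(); ++i) {
        const Reading& r = dict[i];
        if (r.syllable == syllable && (tone == 0 || r.tone == tone))
            out.push_back(r.character);
    }
    return out;
}
```

For example, with entries for 胎 (tai1), 台 (tai2) and 太 (tai4), typing "tai" without a tone returns all three, while "tai2" returns only 台.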

Another method on offer was the so-called “smart pinyin” which I think sucks. I’d rather have plain old “stupid” pinyin.

So I’m just wondering (hoping) that this will be fixed (if it hasn’t already). That’s all I would need to make SCIM useful to me.

  • DB

[quote=“smithsgj”]
Isn’t this a slightly strange way of doing it though? What’s what happens to be etched on the keys of some users got to do with Pinyin input? You imply there’s a simplified character version as well: does that use this peculiar bpmf mapping thing as well???[/quote]

I agree. What the heck is going on here? I’ve been watching this thread for the last few days and I was hoping to see someone write about it who knows more about these things than me.

If you’re trying to map bopomofo to pinyin to characters, I think the best thing to do is just make a table of all the possible sounds in the two phonetic systems (not worrying about how to combine initials and finals). This method is more straightforward. I think xcin does things this way.
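
As a sketch of the whole-syllable approach (not taken from xcin or SCIM), a lookup table keyed on complete pinyin syllables could look like this. The names `SyllableTable` and `SyllableKeys` are made up for the example, the values are standard Zhuyin keyboard keystrokes, and only five syllables are listed where a real table would need every valid syllable.

```cpp
#include <map>
#include <string>

// A few whole syllables mapped directly to Zhuyin keystrokes.
const std::map<std::string, std::string>& SyllableTable() {
    static const std::map<std::string, std::string> table = {
        {"bei", "1o"}, {"jiu", "ru."}, {"ju", "rm"},
        {"tai", "w9"}, {"wan", "j0"},
    };
    return table;
}

// Returns the keystrokes for a syllable, or "" if it is not listed.
std::string SyllableKeys(const std::string& pinyin) {
    const std::map<std::string, std::string>& t = SyllableTable();
    std::map<std::string, std::string>::const_iterator it = t.find(pinyin);
    return it == t.end() ? std::string() : it->second;
}
```

Because every syllable is spelled out, "jiu" and "ju" get separate entries and there is nothing to combine or disambiguate at lookup time. The cost is a longer table that has to be checked entry by entry.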

If you’re trying to do anything else, I have no clue what is going on.

OK guys, I think I’ll join in if it’s not too late already… :sunglasses:

First of all an announcement: If you guys are in Taipei and interested in Open Source Software, there is a group called TOSSUG which meets every Tuesday evening at around 19:00 in the coffee shop in NTU Sports Center (XinSheng South Rd Sec. 3 / XinHai Rd.).

The XCIN author and also some SCIM contributors (me being one of them) are usually there.

Website is wiki.tossug.org

For the pinyin typing:
The SCIM-Chinese (or Simple-Pinyin) method contains a database which chooses the characters while you are typing. It works for both simplified and traditional Chinese, but simplified is better supported, as it is a product developed in China.
You can configure it to use tones or not. I usually use it without tones and usually it chooses the correct (but simplified) characters for me. For traditional characters I have to choose them myself. But once I do so, they are recorded and the next time I type the same sequence, I’ll get this selection as first candidate.
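
The "recorded selection comes first next time" behaviour can be sketched roughly as a move-to-front list per input sequence. This is only an illustration under that assumption. The `CandidateStore` class is hypothetical and is not how SCIM-Chinese actually stores its data.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical candidate store: pinyin input -> candidates in display order.
class CandidateStore {
public:
    // Appends a candidate to the end of the list for this input.
    void Add(const std::string& pinyin, const std::string& candidate) {
        table_[pinyin].push_back(candidate);
    }

    // Returns the candidates for an input, first choice first
    // (empty if the input is unknown).
    std::vector<std::string> Candidates(const std::string& pinyin) const {
        Table::const_iterator it = table_.find(pinyin);
        return it == table_.end() ? std::vector<std::string>() : it->second;
    }

    // Records a user's choice by moving it to the front of its list;
    // the relative order of the other candidates is kept.
    void Select(const std::string& pinyin, const std::string& chosen) {
        Table::iterator entry = table_.find(pinyin);
        if (entry == table_.end()) return;
        std::vector<std::string>& list = entry->second;
        std::vector<std::string>::iterator it =
            std::find(list.begin(), list.end(), chosen);
        if (it != list.end()) std::rotate(list.begin(), it, it + 1);
    }

private:
    typedef std::map<std::string, std::vector<std::string> > Table;
    Table table_;
};
```

If "taiwan" initially lists 台湾 before 台灣, calling `Select("taiwan", "台灣")` once puts the traditional form first for the next lookup.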

Then there is the SCIM-CHEWING module for traditional Chinese, which also has a database and uses zhuyin (bopomofo) by default. However, you can configure it to use pinyin instead. Unfortunately this function is currently broken. But the author of this method sometimes shows up at the TOSSUG meetings… (see above).

I myself am working on a Taiwanese and Hakka input method… :slight_smile:

Cheers
Arne