When I am learning a new word in Chinese, I usually look it up in Pleco and hit the [+] button to create a flash card. Pleco is extremely useful for generating flash cards, but I prefer to use Anki, which lets me review those flash cards using spaced repetition.
Both Pleco and Anki have Android apps, but there is no native interface between them. Pleco exports a complex XML file with a lot of metadata. Anki expects to import a CSV file which maps to your custom note.
In Anki terminology, a note is a set of two or more fields:
# type: Note
French: Bonjour
English: Hello
Page: 12
and a card is a view into that note via a card type:
# type: Card Type
Q: French
A: English<br> Page #Page

# type: Card
Q: Bonjour
A: Hello
   Page #12
It is normal to have a few card types, which represent a few views into that note which you are trying to memorize.
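The relationship between a note and its card types can be sketched in a few lines of Python. This is only an illustration: the `{Field}` placeholders and the `render_card` helper are hypothetical stand-ins, and Anki's own template syntax differs.

```python
# A note is a set of named fields (values taken from the example above).
note = {"French": "Bonjour", "English": "Hello", "Page": "12"}

# A card type is a pair of templates. The {Field} placeholders are an
# illustrative stand-in, not Anki's real template syntax.
card_type = {"Q": "{French}", "A": "{English}<br> Page #{Page}"}


def render_card(note: dict, card_type: dict) -> dict:
    """Fill each side of a card type with the note's field values."""
    return {side: template.format(**note) for side, template in card_type.items()}


card = render_card(note, card_type)
print(card["Q"])  # Bonjour
print(card["A"])  # Hello<br> Page #12
```

Adding a second card type to the same note is just another pair of templates; the note's data is never duplicated.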
Anki for Chinese
My Chinese-language notecards have three fields:
# type: Note
characters: 苹果
pinyin: <font color="green">píng</font><font color="blue">guǒ</font>
meaning: noun apple
The characters field contains CJK Unicode, and the pinyin field is HTML. Anki will render these fields correctly on web and mobile.
I have four card types:
Q: characters + meaning    A: pinyin
Q: characters + pinyin     A: meaning
Q: pinyin + meaning        A: characters
Q: characters              A: pinyin + meaning
It doesn’t make sense to have a card where Q: pinyin, since the question is ambiguous: there are many words and characters with the same pronunciation. I think it also doesn’t make sense to have a card where Q: meaning, since there are many ways to express the same idea in each language.
In Pleco, a flash card looks like this:
Pleco can export your set of saved flash cards, but it does so as XML:
<?xml version="1.0" ?>
<plecoflash formatversion="2" creator="Pleco User -1" generator="Pleco 2.0 Flashcard Exporter" platform="Android" created="1605883885">
  <categories/>
  <cards>
    <card language="chinese">
      <entry>
        <headword charset="sc">感冒</headword>
        <headword charset="tc">感冒</headword>
        <pron type="hypy" tones="numbers">gan3mao4</pron>
        <defn>noun common cold verb 1 catch cold 2 dialect be interested in; like (usu. used in the negative)</defn>
      </entry>
      <dictref dictid="PACE" entryid="21428224"/>
    </card>
  </cards>
</plecoflash>
Each <card> entry is a rich object with (1) sc (simplified Chinese) and tc (traditional Chinese) characters, (2) a pinyin string in which the numeral following a syllable denotes its tone, and (3) a dictionary definition.
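As a rough sketch of how those three pieces could be pulled out of the export, the standard library's `xml.etree.ElementTree` is enough. The `read_cards` helper and the choice to keep only the simplified headword are assumptions made for this example, and the sample is an abbreviated copy of the export above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the export shown above.
SAMPLE = """<?xml version="1.0" ?>
<plecoflash formatversion="2">
  <cards>
    <card language="chinese">
      <entry>
        <headword charset="sc">感冒</headword>
        <headword charset="tc">感冒</headword>
        <pron type="hypy" tones="numbers">gan3mao4</pron>
        <defn>noun common cold</defn>
      </entry>
    </card>
  </cards>
</plecoflash>"""


def read_cards(xml_text: str) -> list[dict]:
    """Return one dict per <entry>, keeping only the simplified headword."""
    root = ET.fromstring(xml_text)
    cards = []
    for entry in root.iter("entry"):
        cards.append({
            "characters": entry.findtext("headword[@charset='sc']", default=""),
            "pinyin": entry.findtext("pron", default=""),
            "meaning": entry.findtext("defn", default=""),
        })
    return cards


for card in read_cards(SAMPLE):
    print(card)
```

For a real export on disk, `ET.parse(path).getroot()` would replace `ET.fromstring`.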
Anki, on the other hand, prefers to import data as a CSV (comma-separated or semicolon-separated is fine), where each column maps to a field in the destination note. The above example as CSV might look like:
<span><font color="blue">gǎn</font></span> <span><font color="purple">mào</font></span>;感冒;noun common cold verb 1 catch cold 2 dialect be interested in.
This might render in Anki like so:
I wrote pleco-to-anki, a Python script which converts an XML file with a plecoflash object to a CSV file suitable for import.
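The output side of such a conversion can be sketched with Python's `csv` module. This is not the script's actual code: the `to_csv` helper is hypothetical, and the column order (pinyin, characters, meaning) is simply copied from the sample row above. Letting the module do the quoting matters because the pinyin field contains double quotes and a definition may itself contain a semicolon.

```python
import csv
import io


def to_csv(rows: list[tuple[str, str, str]]) -> str:
    """Serialise (pinyin, characters, meaning) rows as semicolon-separated CSV."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    # Fields containing the delimiter or a double quote are quoted
    # automatically, with inner quotes doubled.
    writer.writerows(rows)
    return buffer.getvalue()


rows = [(
    '<font color="blue">gǎn</font><font color="purple">mào</font>',
    "感冒",
    "noun common cold; verb catch cold",
)]
print(to_csv(rows), end="")
```

When writing to a file instead of a string, open it with `newline=""` and `encoding="utf-8"` so the Chinese characters and line endings survive intact.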
- I want to test myself on a character’s tones, but I don’t want to use the style of pinyin in which tones have numbers. I prefer to read the tones as diacritics over vowels, i.e. the pinyin for 苹果 should appear as píngguǒ, not ping2guo3.
- I like the convention where syllables are colored based on their tone. Everyone has their own convention, but for me red=flat, green=rising, blue=u-shaped, purple=falling, and grey=neutral. I want these flash cards to have colored pinyin syllables, but not colored Chinese characters.
- u and ü are two different vowels in pinyin and the difference must be preserved. For example, 旅游 can be written as “lǚyóu”: note the diaeresis (¨) and háček (ˇ) over the letter ‘u’.
- The dictionary definition which Pleco provides often repeats the Chinese characters: it is unsuitable as a question if it contains or hints at the answer.
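The requirements above can be sketched roughly as follows. This is a minimal illustration, not the code of pleco-to-anki: the function names are hypothetical, tone 5 is assumed to mean the neutral tone, `v` and `u:` are assumed ASCII spellings of ü, and replacing the headword with `~` is just one possible way to stop a definition from leaking the answer.

```python
import re

# Colour convention from the list above; tone 5 = neutral is an assumption.
TONE_COLORS = {1: "red", 2: "green", 3: "blue", 4: "purple", 5: "grey"}

# Tone-marked forms of each vowel, indexed by tone number minus one.
MARKED = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable: str, tone: int) -> str:
    """Move a tone number onto the vowel that conventionally carries it."""
    # Assumed ASCII stand-ins for ü; the post does not show which one Pleco emits.
    syllable = syllable.replace("u:", "ü").replace("v", "ü")
    lower = syllable.lower()
    vowels = [i for i, ch in enumerate(lower) if ch in MARKED]
    if tone == 5 or not vowels:
        return syllable  # neutral tone, or nothing to put a mark on
    # Placement rule: 'a' or 'e' if present, else the 'o' of 'ou',
    # else the last vowel of the syllable.
    if "a" in lower:
        idx = lower.index("a")
    elif "e" in lower:
        idx = lower.index("e")
    elif "ou" in lower:
        idx = lower.index("ou")
    else:
        idx = vowels[-1]
    marked = MARKED[lower[idx]][tone - 1]
    if syllable[idx].isupper():
        marked = marked.upper()
    return syllable[:idx] + marked + syllable[idx + 1:]


def colored_pinyin(numbered: str) -> str:
    """Turn e.g. 'ping2guo3' into coloured, diacritic-marked HTML."""
    parts = []
    # A syllable with no trailing digit is treated as neutral.
    for syllable, tone in re.findall(r"([^\d\s]+)([1-5]?)", numbered):
        number = int(tone or 5)
        text = mark_syllable(syllable, number)
        parts.append(f'<font color="{TONE_COLORS[number]}">{text}</font>')
    return "".join(parts)


def scrub_definition(definition: str, characters: str) -> str:
    """Hide the headword wherever the definition repeats it."""
    return definition.replace(characters, "~")


print(colored_pinyin("ping2guo3"))
print(mark_syllable("lv", 3) + mark_syllable("you", 2))
```

Only the pinyin is wrapped in `<font>` tags, so the characters field stays uncoloured as required.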