Need help with zhon?
Click the “chat” button below for chat support from the developer who created it, or find similar developers for support.

About the developer

283 Stars 36 Forks MIT License 147 Commits 7 Opened issues


Constants used in Chinese text processing

Services available


Need anything else?

Contributors list

# 16,802
125 commits



.. image:: :target:

.. image:: :target:

Zhon is a Python library that provides constants commonly used in Chinese text processing.

  • Documentation:
  • GitHub:
  • Support:
  • Free software:
    MIT license 


Zhon's constants can be used in Chinese text processing, for example:

  • Find CJK characters in a string:

.. code:: python

>>> re.findall('[{}]'.format(zhon.hanzi.characters), 'I broke a plate: 我打破了一个盘子.')
['我', '打', '破', '了', '一', '个', '盘', '子']
  • Validate Pinyin syllables, words, or sentences:

.. code:: python

>>> re.findall(zhon.pinyin.syllable, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I)
['Yuàn', 'zi', 'lǐ', 'tíng', 'zhe', 'yí', 'liàng', 'chē']

>>> re.findall(zhon.pinyin.word, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I) ['Yuànzi', 'lǐ', 'tíngzhe', 'yí', 'liàng', 'chē']

>>> re.findall(zhon.pinyin.sentence, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I) ['Yuànzi lǐ tíngzhe yí liàng chē.']


  • Includes commonly-used constants:
    • CJK characters and radicals
    • Chinese punctuation marks
    • Chinese sentence regular expression pattern
    • Pinyin vowels, consonants, lowercase, uppercase, and punctuation
    • Pinyin syllable, word, and sentence regular expression patterns
    • Zhuyin characters and marks
    • Zhuyin syllable regular expression pattern
    • CC-CEDICT characters
  • Runs on Python 2.7 and 3

Getting Started

  • Install Zhon 
  • Read
    Zhon's introduction 
  • Learn from the
    API documentation 
  • Contribute 
    _ documentation, code, or feedback

We use cookies. If you continue to browse the site, you agree to the use of cookies. For more information on our use of cookies please see our Privacy Policy.