Jieba Text Segmentation

Jieba is a simple Python library for Chinese text segmentation: it splits a passage of text into individual words, which are easier to process than raw text. For example, you can infer the main topic of a paragraph from the words that appear most often, or spot trends by analyzing how often certain words appear over the years.
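As a rough illustration of the frequency idea, here is a minimal sketch that counts words once a text has already been segmented. The `words` list is made-up sample data standing in for a segmenter's output, and `top_words` is a hypothetical helper name, not part of Jieba.

```python
from collections import Counter

# Hypothetical, already-segmented words (a stand-in for a segmenter's output).
words = ["market", "price", "rise", "market", "supply", "price", "market"]

def top_words(tokens, n=2):
    """Return the n most frequent tokens together with their counts."""
    return Counter(tokens).most_common(n)

print(top_words(words))  # [('market', 3), ('price', 2)]
```

The most frequent words give a quick hint at what the text is about. Running the same count on texts from different years is one simple way to look at trends.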

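import jieba
import jieba.analyse

# Put the text to segment between the triple quotes.
text = '''
'''

# Full mode: list all the words that can be found in the text
seg_list = jieba.cut(text, cut_all=True)
print("full mode: ", "/ ".join(seg_list))

# Accurate mode: cut the text into the most precise sequence of words
seg_list = jieba.cut(text, cut_all=False)
print("accurate mode: ", "/ ".join(seg_list))

# Search engine mode: accurate mode, with long words cut again into shorter ones
seg_list = jieba.cut_for_search(text)
print("search engine mode: ", "/ ".join(seg_list))

# Extract the top 3 keywords from the text
tags = jieba.analyse.extract_tags(text, topK=3)
print("keywords: ", "/ ".join(tags))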

Note: This example program was translated from the Chinese version. You can try writing an English version of it using English-language APIs.
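One possible reading of this note is sketched below for English text. It assumes that a plain regular expression from the Python standard library is good enough as a tokenizer. The function `segment_english` and the sample sentence are hypothetical and are not part of Jieba.

```python
import re
from collections import Counter

def segment_english(text):
    """Split English text into lowercase word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

# Made-up sample sentence, used only for illustration.
sample = "The cat sat on the mat. The mat was warm."

tokens = segment_english(sample)
print("words: ", "/ ".join(tokens))

# Use the most frequent words as a crude stand-in for keyword extraction.
print("keywords: ", Counter(tokens).most_common(2))
```

English words are already separated by spaces, so a simple pattern goes a long way. A dedicated tokenizer would handle hyphens, numbers and abbreviations more carefully.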