weixin_public_corpus

About the developer

nonamestreet

Description

A corpus of WeChat Official Account articles

WeChat Official Account Corpus

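A collection of WeChat Official Account articles scraped from the web. The HTML has been stripped, leaving plain text only. Each line holds one article as a JSON object: name is the display name of the Official Account, account is the account ID, title is the article title, and content is the body text.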
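The one-article-per-line layout described above can be read with a standard JSON parser, one line at a time. The following is a minimal sketch, not the author's own tooling: the function name iter_articles and the sample records are hypothetical, and only the four field names (name, account, title, content) come from the description.

```python
import json
from typing import Iterable, Iterator

# Field names as described in the README.
FIELDS = ("name", "account", "title", "content")


def iter_articles(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line, each line being a JSON object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        # Keep only the documented fields; a missing field becomes "".
        yield {field: record.get(field, "") for field in FIELDS}


# Hypothetical sample records, made up for illustration only.
sample_lines = [
    '{"name": "Example Account", "account": "example_id", '
    '"title": "Hello", "content": "Body text."}',
    "",
    '{"name": "Another Account", "account": "another_id", '
    '"title": "Second", "content": "More text."}',
]

articles = list(iter_articles(sample_lines))
print(len(articles), articles[0]["title"])
```

Because the function accepts any iterable of strings, an open file object can be passed in directly once the archive is extracted, so the full data set never has to be loaded into memory at once. The extracted file name and its encoding (presumably UTF-8) are not stated in the README and should be checked against the actual files.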

The data is compressed as a multi-volume (split) zip archive with no password. For a preview, see preview.json.

The data currently totals about 3 GB and will be updated and expanded periodically.

Please use it for research purposes only.

For questions or special requests, open an Issue directly.

[email protected]

Like-minded people are welcome to join Xiaobao (校宝) and work on interesting things together! https://www.xiaobaoonline.com/pc/contactjoin
