About the developer

facert
459 Stars 177 Forks MIT License 11 Commits 4 Opened issues

Description

A multithreaded Python crawler for Tumblr

tumblr_spider

A multithreaded Python crawler for Tumblr

install

pip install -r requirements.txt

run

python tumblr.py username (where username is the username of any popular blogger)

snapshot

Crawl results

user.txt
contains the usernames of the bloggers that were crawled,
source.txt
contains the collected video URLs

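How it works

Starting from the username of one popular blogger, the script automatically collects the usernames of the other bloggers whose posts that blogger has reblogged, adds them to the crawl queue, and crawls them recursively.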
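The queue-driven discovery described above can be sketched roughly as follows. This is not the code in tumblr.py, only a minimal illustration under stated assumptions: `crawl`, `fetch_reblog_sources`, `num_workers`, and `max_users` are hypothetical names, and the function that actually talks to Tumblr is passed in by the caller so the sketch stays independent of any network or page format.

```python
import queue
import threading


def crawl(seed, fetch_reblog_sources, num_workers=4, max_users=100):
    """Discover usernames reachable from `seed` with a pool of worker threads.

    `fetch_reblog_sources(username)` is a hypothetical callable supplied by
    the caller; it should return the usernames whose posts `username` has
    reblogged. At most `max_users` usernames are collected.
    """
    seen = {seed}                  # every username discovered so far
    seen_lock = threading.Lock()   # guards `seen` across threads
    pending = queue.Queue()        # usernames waiting to be crawled
    pending.put(seed)

    def worker():
        while True:
            username = pending.get()
            if username is None:   # sentinel: no more work
                pending.task_done()
                return
            try:
                sources = fetch_reblog_sources(username)
            except Exception:
                sources = []       # skip blogs that fail to load
            for other in sources:
                with seen_lock:
                    if other in seen or len(seen) >= max_users:
                        continue
                    seen.add(other)
                pending.put(other)  # newly found user joins the queue
            pending.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    pending.join()                 # wait until the queue is fully drained
    for _ in threads:
        pending.put(None)          # one sentinel per worker
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    # Fake reblog graph standing in for real Tumblr requests.
    graph = {"a": ["b", "c"], "b": ["a", "d"], "c": [], "d": ["a"]}
    print(sorted(crawl("a", lambda u: graph.get(u, []))))
```

The `seen` set is what keeps the crawl from looping when two bloggers reblog each other, and `max_users` bounds a crawl that would otherwise keep expanding for as long as new usernames turn up.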

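Disclaimer

This is a serious crawler (straight face). What it collects depends heavily on the first username you supply. Also, for certain reasons tumblr is blocked by the Great Firewall, so the simplest approach is to run the script on an overseas VPS.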