
RoboBrowser: Your friendly neighborhood web scraper
===================================================

.. image:: https://badge.fury.io/py/robobrowser.png
    :target: http://badge.fury.io/py/robobrowser

.. image:: https://travis-ci.org/jmcarp/robobrowser.png?branch=master
    :target: https://travis-ci.org/jmcarp/robobrowser

.. image:: https://coveralls.io/repos/jmcarp/robobrowser/badge.png?branch=master
    :target: https://coveralls.io/r/jmcarp/robobrowser

Homepage: `http://robobrowser.readthedocs.org/ <http://robobrowser.readthedocs.org/>`_

RoboBrowser is a simple, Pythonic library for browsing the web without a standalone web browser. RoboBrowser can fetch a page, click on links and buttons, and fill out and submit forms. If you need to interact with web services that don't have APIs, RoboBrowser can help.

.. code-block:: python

    import re
    from robobrowser import RoboBrowser

    # Browse to Genius
    browser = RoboBrowser(history=True)
    browser.open('http://genius.com/')

    # Search for Porcupine Tree
    form = browser.get_form(action='/search')
    form                # <RoboForm q=>
    form['q'].value = 'porcupine tree'
    browser.submit_form(form)

    # Look up the first song
    songs = browser.select('.song_link')
    browser.follow_link(songs[0])
    lyrics = browser.select('.lyrics')
    lyrics[0].text      # \nHear the sound of music ...

    # Back to results page
    browser.back()

    # Look up my favorite song
    song_link = browser.get_link('trains')
    browser.follow_link(song_link)

    # Can also search HTML using regex patterns
    lyrics = browser.find(class_=re.compile(r'\blyrics\b'))
    lyrics.text         # \nTrain set and match spied under the blind...
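
The regex search in the last step can be explored without a browser. The sketch below uses only the standard library and shows which class names the pattern ``\blyrics\b`` would accept. The candidate class names are hypothetical, invented for illustration. BeautifulSoup filters against a compiled pattern with its ``search()`` method, so ``search()`` is used here as well.

.. code-block:: python

    import re

    # Same pattern as in the example above: "lyrics" as a whole word.
    pattern = re.compile(r'\blyrics\b')

    # Hypothetical class attribute values, not taken from any real page.
    candidates = ['lyrics', 'song-lyrics', 'lyrics_text', 'mylyrics', 'Lyrics']

    # Keep only the names in which the pattern is found.
    matches = [name for name in candidates if pattern.search(name)]
    print(matches)  # ['lyrics', 'song-lyrics']

The hyphen counts as a word boundary, while the underscore does not, so ``lyrics_text`` is rejected. Passing ``re.I`` to ``re.compile`` would also accept ``Lyrics``.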

RoboBrowser combines the best of two excellent Python libraries: Requests and BeautifulSoup. RoboBrowser represents browser sessions using Requests and HTML responses using BeautifulSoup, transparently exposing methods of both libraries:

.. code-block:: python

    import re
    from robobrowser import RoboBrowser

    browser = RoboBrowser(user_agent='a python robot')
    browser.open('https://github.com/')

    # Inspect the browser session
    browser.session.cookies['_gh_sess']         # BAh7Bzo...
    browser.session.headers['User-Agent']       # a python robot

    # Search the parsed HTML
    browser.select('div.teaser-icon')                   # list of matching tags
    browser.find(class_=re.compile(r'column', re.I))    # first tag whose class matches
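
One way to picture "transparently exposing methods of both libraries" is plain attribute delegation. The sketch below is not RoboBrowser's actual implementation. ``FakeSession``, ``FakeDocument``, and ``SketchBrowser`` are hypothetical stand-ins, written with the standard library only, for a Requests session, a parsed BeautifulSoup document, and the browser that wraps them.

.. code-block:: python

    class FakeSession:
        """Hypothetical stand-in for a requests.Session."""

        def __init__(self, user_agent):
            self.headers = {'User-Agent': user_agent}


    class FakeDocument:
        """Hypothetical stand-in for a parsed BeautifulSoup document."""

        def __init__(self, tags):
            self.tags = tags

        def find_all(self, name):
            # Return every stored tag name equal to the requested one.
            return [tag for tag in self.tags if tag == name]


    class SketchBrowser:
        """Toy browser: owns a session and forwards other lookups to the document."""

        def __init__(self, user_agent):
            self.session = FakeSession(user_agent)
            self.parsed = FakeDocument(['div', 'span', 'div'])

        def __getattr__(self, name):
            # Only called when normal lookup fails, so `session` and `parsed`
            # resolve directly and everything else is delegated.
            return getattr(self.parsed, name)


    browser = SketchBrowser(user_agent='a python robot')
    print(browser.session.headers['User-Agent'])  # a python robot
    print(browser.find_all('div'))                # ['div', 'div']

In this sketch the session stays reachable as an ordinary attribute, while search methods are looked up on the parsed document on demand.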
