Need help with arachne?
Click the “chat” button below for chat support from the developer who created it, or find similar developers for support.

About the developer

kirankoduru
123 Stars 36 Forks Other 81 Commits 10 Opened issues

Description

A flask API for running your scrapy spiders

Services available

!
?

Need anything else?

Contributors list

# 71,730
Python
crawlin...
HTML
Shell
74 commits
# 170,423
scikit-...
JavaScr...
Shell
python3
1 commit

=======

Arachne

.. image:: https://travis-ci.org/kirankoduru/arachne.svg :target: https://travis-ci.org/kirankoduru/arachne

.. image:: https://coveralls.io/repos/kirankoduru/arachne/badge.svg?branch=master&service=github :target: https://coveralls.io/github/kirankoduru/arachne?branch=master

Arachne provides a wrapper around your scrapy

Spider
object to run them through a flask app. All you have to do is customize
SPIDER_SETTINGS
in the settings file.

Installation

You can install Arachne from pip

pip install Arachne

Sample settings

This is sample settings file for spiders in your project. The settings file should be called settings.py for Arachne to find it and looks like this::

# settings.py file
SPIDER_SETTINGS = [
    {
        'endpoint': 'dmoz',
        'location': 'spiders.DmozSpider',
        'spider': 'DmozSpider'    
    }
]

Usage

It looks very similar to a flask app but since Scrapy depends on the python twisted package, we need to run our flask app with twisted::

from twisted.web.wsgi import WSGIResource
from twisted.web.server import Site
from twisted.internet import reactor
from arachne import Arachne

app = Arachne(name)

resource = WSGIResource(reactor, reactor.getThreadPool(), app) site = Site(resource) reactor.listenTCP(8080, site)

if name == 'main': reactor.run()

We use cookies. If you continue to browse the site, you agree to the use of cookies. For more information on our use of cookies please see our Privacy Policy.