
Description

Scalable query engine for web scraping/data mashup/acceptance QA, powered by Apache Spark



The latest documentation has moved to:

http://tribbloid.github.io/spookystuff/

SpookyStuff

... is a scalable query engine for web scraping/data integration/acceptance QA. The goal is to allow the Web to be queried and ETL'ed like a relational database.

SpookyStuff is the fastest big data collection engine in history, with a speed record of querying 330,404 dynamic pages per hour on 300 cores (roughly 1,100 pages per core per hour).

Build Status

| branch \ profile | scala-2.11 | scala-2.12 |
|---|---|---|
| master | Codeship Status for tribbloid/spookystuff | CI |

Join the chat at https://gitter.im/tribbloid/spookystuff

SpookyStuff-UAV (alpha component)

... allows the same engine to be used to control a swarm of aerial robots for photogrammetry and data acquisition. It is still a work in progress; please refer to this proposal for an overview of its features and implementation.

Build Status

| branch \ profile | scala-2.11 | scala-2.12 |
|---|---|---|
| master | Build Status | - |

Join the chat at https://gitter.im/spookystuff-UAV/Lobby

Powered by

  • Apache Spark
  • Selenium
  • JSoup
  • Apache Tika
  • Apache Maven
  • PhantomJS/GhostDriver
  • (UAV) MAVLink


License

Copyright © 2014 by Peng Cheng @tribbloid, Sandeep Singh @techaddict, Terry Lin @ithinkicancode, Long Yao @l2yao and contributors.

Published under the Apache License 2.0; see LICENSE.
