About the developer

tribbloid
133 Stars 34 Forks Apache License 2.0 1.7K Commits 28 Opened issues

Description

Scalable query engine for web scraping/data mashup/acceptance QA, powered by Apache Spark

The latest documentation has moved to:

http://tribbloid.github.io/spookystuff/

SpookyStuff

... is a scalable query engine for web scraping/data integration/acceptance QA. The goal is to allow the Web to be queried and ETL'ed like a relational database.

SpookyStuff is the fastest big data collection engine in history, with a speed record of querying 330,404 dynamic pages per hour on 300 cores.

Build Status

| branch \ profile | scala-2.11 | scala-2.12 |
| ---------------- | ---------- | ---------- |
| master | Codeship Status for tribbloid/spookystuff | CI |

Join the chat at https://gitter.im/tribbloid/spookystuff

SpookyStuff-UAV (alpha component)

... allows the same engine to be used to control a swarm of aerial robots for photogrammetry and data acquisition. It is still a work in progress; please refer to this proposal for a feature and implementation overview.

Build Status

| branch \ profile | scala-2.11 | scala-2.12 |
| ---------------- | ---------- | ---------- |
| master | Build Status | - |

Join the chat at https://gitter.im/spookystuff-UAV/Lobby

Powered by

| | | | |
| ---- | :----: | :----: | :----: |
| Core | Apache Spark | Apache Maven | JSoup |
| | Apache Tika | | |
| Web | YourKit Java Profiler | PhantomJS/GhostDriver | Selenium |
| UAV | MAVLink | | |

License

Copyright © 2014 by Peng Cheng @tribbloid, Sandeep Singh @techaddict, Terry Lin @ithinkicancode, Long Yao @l2yao and contributors.

Published under the Apache License 2.0; see LICENSE.
