
Developer: stanford-futuredata
License: BSD 3-Clause "New" or "Revised" License

Sparser: Raw Filtering for Faster Analytics over Raw Data


sparser

This code base implements Sparser, raw filtering for faster analytics over raw data. Sparser can parse JSON, Avro, and Parquet data up to 22x faster than the state of the art. For more details, check out our paper published at VLDB 2018.
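To illustrate the general idea of raw filtering, here is a minimal sketch in C. It assumes newline-delimited records and a query that requires an exact substring to appear in the record. The function names `raw_filter_pass` and `count_candidates` are hypothetical and are not part of Sparser's API; the real filters are described in the paper. The point of the sketch is only that a cheap byte-level check over the raw data can discard most records before a full parser ever sees them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Sparser's API): returns 1 if the raw record
 * might satisfy the query, 0 if it definitely cannot. False positives are
 * allowed, since a full parser re-checks every record that passes. */
static int raw_filter_pass(const char *record, const char *needle) {
    return strstr(record, needle) != NULL;
}

/* Hypothetical driver: walks newline-delimited records in buf and counts
 * those that pass the raw filter. Only these candidates would be handed
 * to a full parser (for example, a JSON parser).
 * Note: buf is modified in place; each '\n' is replaced with '\0'. */
static size_t count_candidates(char *buf, const char *needle) {
    size_t candidates = 0;
    char *line = buf;
    while (line != NULL && *line != '\0') {
        char *end = strchr(line, '\n');
        if (end != NULL) {
            *end = '\0';            /* terminate the current record */
        }
        if (raw_filter_pass(line, needle)) {
            candidates++;           /* a real pipeline would parse here */
        }
        line = (end != NULL) ? end + 1 : NULL;
    }
    return candidates;
}
```

A record that contains the substring in the wrong field still passes this filter, which is why the surviving records must still be parsed and checked against the full query. The benefit comes from skipping the parser for the records that fail.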

See the demo-repl directory for a brief example. To run it:
# update rapidjson submodule
git submodule init
git submodule update
cd demo-repl
make
./bench /path/to/large/file.json

Then enter 1 at the Sparser> prompt.

Sparser itself is just a header file and only depends on standard C libraries available on most systems.
