Tools for data science with a focus on text processing.
Check out the master branch from the rosetta repo. Then, so long as you have pip installed, run
cd rosetta
make
make test
If you update the source, you can do
make reinstall
make test
The make targets above install with pip, so you can of course run pip uninstall at any time.
Getting the source (above) is the preferred method, since the code changes often. If you don't use Git, you can download a tagged release (tarball) here, then install it with
pip install rosetta-X.X.X.tar.gz
You can get the latest sources with
git clone git://github.com/columbia-applied-data-science/rosetta
Feel free to report a bug or make a request by opening an issue.
The preferred method to contribute is to fork the repo and send a pull request. Before doing this, read CONTRIBUTING.md.
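As a rough illustration of the fork-and-pull-request flow, the sketch below clones a fork, registers this repo as an upstream remote, and starts a topic branch. The function name contribute_branch, the user name, and the branch name are all hypothetical placeholders, and the fork is assumed to live at github.com/<your-user>/rosetta. CONTRIBUTING.md remains the authoritative guide.

```shell
# Hypothetical helper (not part of rosetta): set up a local clone of your fork.
# Usage: contribute_branch <github-user> <branch-name>
contribute_branch() {
    github_user="$1"   # assumption: the GitHub account that holds your fork
    branch="$2"        # assumption: any name you choose for the topic branch

    # Clone the fork, then track the main repo (HTTPS form of the URL above)
    git clone "https://github.com/${github_user}/rosetta.git" &&
    cd rosetta &&
    git remote add upstream https://github.com/columbia-applied-data-science/rosetta.git &&
    # Do the work on a dedicated branch rather than on master
    git checkout -b "${branch}"
}
```

After calling something like contribute_branch your-user my-fix, you would edit the code, run make test, commit, push the branch to your fork with git push origin my-fix, and open the pull request on GitHub.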
From the base repo directory, rosetta/, you can run all tests with
make test
Documentation for releases is hosted on PyPI. It does NOT update automatically.
Rosetta refers to the Rosetta Stone, the ancient Egyptian tablet discovered in 1799. The tablet carries a fragmentary text written in three scripts (hieroglyphic, Demotic, and Ancient Greek), and deciphering it is considered an essential key to our understanding of Ancient Egyptian civilization. We would like this project to give individuals the tools they need to process, and unearth insight from, today's ever-growing volumes of textual data.