Geospatial Raster support for Spark DataFrames
RasterFrames® brings together Earth-observation (EO) data access, cloud computing, and DataFrame-based data science. The recent explosion of EO data from public and private satellite operators presents both a huge opportunity and a challenge to the data analysis community. It is Big Data in the truest sense, and its footprint is rapidly getting bigger.
RasterFrames provides a DataFrame-centric view over arbitrary raster data, enabling spatiotemporal queries, map algebra raster operations, and compatibility with the ecosystem of Spark ML algorithms. By using DataFrames as the core cognitive and compute data model, it is able to deliver these features in a form that is both accessible to general analysts and scalable along with the rapidly growing data footprint.
Please see the Getting Started section of the Users' Manual to start using RasterFrames.
RasterFrames is part of the LocationTech Stack.
RasterFrames is written in Scala, with Python bindings. If you wish to contribute to its development, or to build it from source, you will need sbt. Then clone the repository from GitHub:
git clone https://github.com/locationtech/rasterframes.git
cd rasterframes
To publish to your local repository:
You can run tests with
and integration tests
The documentation may be built with
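The commands for the four build steps above do not appear in this text. As a sketch only, assuming the project uses sbt's standard task names, an integration-test configuration exposed as `it`, and the sbt-site plugin for documentation (none of which is confirmed here), they would likely be:

```shell
# Run from the root of the cloned rasterframes repository.
# All four task names are assumptions based on common sbt conventions.

# Publish the built artifacts to the local Ivy repository.
sbt publishLocal

# Run the unit tests.
sbt test

# Run the integration tests (assumes an "it" configuration is defined).
sbt it:test

# Build the documentation site (assumes the sbt-site plugin is enabled).
sbt makeSite
```

If a task is not recognized, running `sbt tasks` from the repository root lists the tasks the build actually defines.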
Additional Python-specific build instructions may be found at pyrasterframes/src/main/python/README.md.
RasterFrames is released under the commercial-friendly Apache 2.0 License, copyright Astraea, Inc. 2017-2021.
As the sponsor and developer of RasterFrames, Astraea, Inc. is uniquely positioned to expand its capabilities. If you need additional functionality, or just some architectural guidance to get your project off to the right start, we can provide a full range of consulting and development services around RasterFrames. We can be reached at [email protected].