Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
The Google Cloud Dataflow SDK for Python is based on Apache Beam and is intended for running Python pipelines on Google Cloud Dataflow.
Google Cloud Dataflow for Python is now the Apache Beam Python SDK, and code development has moved to the Apache Beam repo.
If you want to contribute to the project (please do!), see the Apache Beam contributor's guide.
We welcome all usage-related questions on Stack Overflow tagged with
Please use the issue tracker on Apache JIRA (sdk-py component) to report bugs and to raise comments or questions about SDK development.