presto-python-client

by prestodb

Python DB-API client for Presto

140 Stars 57 Forks Last release: over 1 year ago (0.7.0) Apache License 2.0 74 Commits 10 Releases


Introduction

This package provides a client interface to query Presto, a distributed SQL engine. It supports Python 2.7, 3.5, 3.6, 3.7, and PyPy.

Installation

$ pip install presto-python-client

Quick Start

Use the DBAPI interface to query Presto:

import prestodb
conn=prestodb.dbapi.connect(
    host='localhost',
    port=8080,
    user='the-user',
    catalog='the-catalog',
    schema='the-schema',
)
cur = conn.cursor()
cur.execute('SELECT * FROM system.runtime.nodes')
rows = cur.fetchall()

This queries the system.runtime.nodes system table, which lists the nodes in the Presto cluster.

The DBAPI implementation in prestodb.dbapi provides methods to retrieve fewer rows, for example Cursor.fetchone() or Cursor.fetchmany(). By default, Cursor.fetchmany() fetches one row. Set prestodb.dbapi.Cursor.arraysize to change the number of rows it returns per call.
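
As a rough illustration of how arraysize and fetchmany() interact, the sketch below reads a result set in batches. It assumes nothing beyond the standard DB-API cursor behavior: the standard-library sqlite3 module stands in for Presto so the example runs without a cluster, and iter_batches and the nodes table are hypothetical names made up for this example. With a prestodb cursor the same helper should apply, though that is not verified here.

```python
import sqlite3


def iter_batches(cursor, batch_size):
    """Yield lists of rows, fetching at most batch_size rows per call.

    Hypothetical helper: works with any DB-API cursor that honors arraysize.
    """
    cursor.arraysize = batch_size
    while True:
        rows = cursor.fetchmany()  # no argument: falls back to cursor.arraysize
        if not rows:
            break
        yield rows


# sqlite3 is only a stand-in for a Presto connection in this sketch.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE nodes (node_id INTEGER)")
cur.executemany("INSERT INTO nodes VALUES (?)", [(i,) for i in range(5)])
cur.execute("SELECT node_id FROM nodes ORDER BY node_id")

batches = list(iter_batches(cur, 2))
conn.close()
```

Five rows read with a batch size of two arrive as two full batches followed by one partial batch.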

Basic Authentication

The BasicAuthentication class can be used to connect to an LDAP-configured Presto cluster:

import prestodb
conn=prestodb.dbapi.connect(
    host='coordinator url',
    port=8443,
    user='the-user',
    catalog='the-catalog',
    schema='the-schema',
    http_scheme='https',
    auth=prestodb.auth.BasicAuthentication("principal id", "password"),
)
cur = conn.cursor()
cur.execute('SELECT * FROM system.runtime.nodes')
rows = cur.fetchall()

OAuth Authentication

To enable GCS access, OAuth authentication is supported by passing in the shadow.json file of a service account. The following example shows a use case where both Kerberos and OAuth authentication are enabled.

import getpass
import prestodb
from prestodb.client import PrestoRequest, PrestoQuery
from requests_kerberos import DISABLED

kerberos_auth = prestodb.auth.KerberosAuthentication(
    mutual_authentication=DISABLED,
    service_name='kerberos service name',
    force_preemptive=True,
    hostname_override='example.com',
)

req = PrestoRequest(
    host='GCP coordinator url',
    port=443,
    user=getpass.getuser(),
    service_account_file='Service account json file path',
    http_scheme='https',
    auth=kerberos_auth,
)

query = PrestoQuery(req, "SELECT * FROM system.runtime.nodes")
rows = list(query.execute())

Transactions

The client runs in autocommit mode by default. To enable transactions, set isolation_level to a value other than IsolationLevel.AUTOCOMMIT:

import prestodb
from prestodb import transaction
with prestodb.dbapi.connect(
    host='localhost',
    port=8080,
    user='the-user',
    catalog='the-catalog',
    schema='the-schema',
    isolation_level=transaction.IsolationLevel.REPEATABLE_READ,
) as conn:
  cur = conn.cursor()
  cur.execute('INSERT INTO sometable VALUES (1, 2, 3)')
  cur.execute('INSERT INTO sometable VALUES (4, 5, 6)')

The transaction is created when the first SQL statement is executed. prestodb.dbapi.Connection.commit() is called automatically when the code exits the with block and the queries succeed; otherwise prestodb.dbapi.Connection.rollback() is called.
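
The commit-or-rollback behavior described above can be pictured with a small stand-in. RecordingConnection below is a hypothetical class written only for this sketch; it is not part of prestodb and does not claim to mirror the library's internals. It simply records which method the with block ends up calling in each case.

```python
class RecordingConnection(object):
    """Hypothetical stand-in that records commit/rollback calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Commit when the block finished cleanly, roll back otherwise.
        if exc_type is None:
            self.commit()
        else:
            self.rollback()
        return False  # let the original exception propagate


ok = RecordingConnection()
with ok:
    pass  # statements that succeed

failed = RecordingConnection()
try:
    with failed:
        raise RuntimeError("query failed")
except RuntimeError:
    pass
```

After running this, ok.calls holds a single commit and failed.calls holds a single rollback.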

Running Tests

There is a helper script, run, that provides commands to run tests. Type ./run tests to run both unit and integration tests.

presto-python-client uses pytest for its tests. To run only unit tests, type:

$ pytest tests

Then you can pass options like --pdb or anything supported by pytest --help.

To run the tests with different versions of Python in managed virtualenvs, use tox (see the configuration in tox.ini):

$ tox

To run integration tests:

$ pytest integration_tests

They build a Docker image and then run a container with a Presto server:
  • the image is named presto-server:${PRESTO_VERSION}
  • the container is named presto-python-client-tests-{uuid4()[:7]}

The container is expected to be removed after the tests are finished.
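
The container naming pattern above is shorthand; a UUID object cannot be sliced directly, so it has to be converted to a string first. A minimal sketch of how such a name could be produced follows. The function name container_name is hypothetical and the test suite may build the name differently.

```python
import uuid


def container_name(prefix="presto-python-client-tests"):
    """Return the prefix followed by the first 7 characters of a random UUID.

    Hypothetical helper illustrating the naming pattern only.
    """
    # str(uuid4()) starts with 8 hexadecimal digits, so [:7] is always hex.
    suffix = str(uuid.uuid4())[:7]
    return "{}-{}".format(prefix, suffix)


name = container_name()
```

The random suffix keeps container names from colliding when several test runs share a Docker host.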

Please refer to the Dockerfile for details. You will find the configuration in etc/.

You can use ./run to manipulate the containers:
  • ./run presto_server: build and run Presto in a container
  • ./run presto_cli CONTAINER_ID: connect the Java Presto CLI to a container
  • ./run list: list the running containers
  • ./run clean: kill the containers

Development

Start by forking the repository and then modify the code in your fork. Please refer to CONTRIBUTING.md before submitting your contributions.

Clone the repository and go inside the code directory. Then you can get the version with python setup.py --version.

We recommend that you use virtualenv to develop on presto-python-client:

$ virtualenv /path/to/env
$ source /path/to/env/bin/activate
$ pip install -r requirements.txt

For development purposes, pip can reference the code you are modifying in a virtualenv:

$ pip install -e .[tests]

That way, you do not need to run pip install again for your changes to take effect in the virtualenv.

When the code is ready, submit a Pull Request.

Need Help?

Feel free to create an issue, as it makes your request visible to other users and contributors.

If an interactive discussion would be better, or if you just want to hang out and chat about the Presto Python client, you can join us on the #presto-python-client channel on Slack.
