Need help with presto-ethereum?
Click the “chat” button below for chat support from the developer who created it, or find similar developers for support.

About the developer

xiaoyao1991
442 Stars 51 Forks Apache License 2.0 38 Commits 15 Opened issues

Description

Presto Ethereum Connector -- SQL on Ethereum

Services available

!
?

Need anything else?

Contributors list

Presto Ethereum Connector

Unleash the Power of Presto Interactive SQL Querying on Ethereum Blockchain

Introduction

Presto is a powerful interactive querying engine that enables running SQL queries on anything -- be it MySQL, HDFS, local file, Kafka -- as long as there exist a connector to the source.

This is a Presto connector to the Ethereum blockchain data. With this connector, one can get hands on with Ethereum blockchain analytics work without having to know how to play with the nitty gritty Javascript API.

Prerequisites

Have an Ethereum client that you can connect to. There are 2 options: 1. Run Geth or Parity locally. 1. Use Infura, a hosted Ethereum client in the cloud.

Note

Specify a block range where you can (e.g.

WHERE block.block_number > x AND block.block_number < y
, or
WHERE transaction.tx_blocknumber > x AND transaction.tx_blocknumber < y
, or
WHERE erc20.erc20_blocknumber > x AND erc20.erc20_blocknumber < y
). Block number is the default and only predicate that can push down to narrow down data scan range. Queries without block ranges will cause presto to retrieve blocks all the way from the first block, which takes forever.

Usage

  1. Install Presto. Follow the instructions on that page to create relevant config files.
    By the end of this step, your presto installation folder structure should look like:
    ├── bin
    ├── lib
    ├── etc
    │   ├── config.properties
    │   ├── jvm.config
    │   └── node.properties
    ├── plugin
    
  2. Install Presto CLI
  3. Clone this repo and run
    mvn clean package
    to build the plugin. You will find the built plugin in the
    target
    folder.
  4. Load the plugin to Presto
    a. Create the ethereum connector config inside of

    etc
    .
    $ mkdir -p etc/catalog && touch etc/catalog/ethereum.properties

    Paste the following to the ethereum.properties: ``` connector.name=ethereum

    You can connect through Ethereum HTTP JSON RPC endpoint

    IMPORTANT - for local testing start geth with rpcport

    geth --rpc --rpcaddr "127.0.0.1" --rpcport "8545"

    ethereum.jsonrpc=http://localhost:8545/

    Or you can connect through IPC socket

    ethereum.ipc=/path/to/ipc_socketfile

    Or you can connect to Infura

    ethereum.infura=https://mainnet.infura.io/

    b. Copy and extract the built plugin to your presto plugin folder  
    

    $ mkdir -p plugin/ethereum \ && cp /target/presto-ethereum--plugin.tar.gz . \ && tar xfz presto-ethereum--plugin.tar.gz -C plugin/ethereum --strip-components=1 ```

    By the end of this step, your presto installation folder structure should look like:

      ├── bin
      ├── lib
      ├── etc
      │   ├── catalog
      │   │   └── ethereum.properties
      │   ├── config.properties
      │   ├── jvm.config
      │   └── node.properties
      ├── plugin
      │   ├── ethereum
      │   │   └── 
    
  5. There you go. You can now start the presto server, and query through presto-cli:

    $ bin/launcher start
    $ presto-cli --server localhost:8080 --catalog ethereum --schema default
    

Use Cases

Inspired by An Analysis of the First 100000 Blocks, the following SQL queries capture partially what was depicted in that post.

  • The first 50 block times (in seconds)
    sql
    SELECT b.bn, (b.block_timestamp - a.block_timestamp) AS delta
    FROM
    (SELECT block_number AS bn, block_timestamp
    FROM block
    WHERE block_number>=1 AND block_number<=50) AS a
    JOIN
    (SELECT (block_number-1) AS bn, block_timestamp
    FROM block
    WHERE block_number>=2 AND block_number<=51) AS b
    ON a.bn=b.bn
    ORDER BY b.bn;
    
  • Average block time (every 200th block from genesis to block 10000)
    sql
    WITH
    X AS (SELECT b.bn, (b.block_timestamp - a.block_timestamp) AS delta
        FROM
            (SELECT block_number AS bn, block_timestamp
            FROM block
            WHERE block_number>=1 AND block_number<=10000) AS a
        JOIN
            (SELECT (block_number-1) AS bn, block_timestamp
            FROM block
            WHERE block_number>=2 AND block_number<=10001) AS b
        ON a.bn=b.bn
        ORDER BY b.bn)
    SELECT min(bn) AS chunkStart, avg(delta)
    FROM
    (SELECT ntile(10000/200) OVER (ORDER BY bn) AS chunk, * FROM X) AS T
    GROUP BY chunk
    ORDER BY chunkStart;
    
  • Biggest miners in first 100k blocks (address, blocks, %)
    sql
    SELECT block_miner, count(*) AS num, count(*)/100000.0 AS PERCENT
    FROM block
    WHERE block_number<=100000
    GROUP BY block_miner
    ORDER BY num DESC
    LIMIT 15;
    
  • ERC20 Token Movement in the last 100 blocks
    sql
    SELECT erc20_token, SUM(erc20_value) FROM erc20
    WHERE erc20_blocknumber >= 4147340 AND erc20_blocknumber<=4147350
    GROUP BY erc20_token;
    
  • Describe the database structure ```sql SHOW TABLES; Table ------------- block erc20 transaction

DESCRIBE block;

Column | Type | Extra | Comment

blocknumber | bigint | | blockhash | varchar(66) | | blockparenthash | varchar(66) | | blocknonce | varchar(18) | | blocksha3uncles | varchar(66) | | blocklogsbloom | varchar(514) | | blocktransactionsroot | varchar(66) | | blockstateroot | varchar(66) | | blockminer | varchar(42) | | blockdifficulty | bigint | | blocktotaldifficulty | bigint | | blocksize | integer | | blockextradata | varchar | | blockgaslimit | double | | blockgasused | double | | blocktimestamp | bigint | | blocktransactions | array(varchar(66)) | | blockuncles | array(varchar(66)) | |

DESCRIBE transaction;

Column | Type | Extra | Comment

txhash | varchar(66) | | txnonce | bigint | | txblockhash | varchar(66) | | txblocknumber | bigint | | txtransactionindex | integer | | txfrom | varchar(42) | | txto | varchar(42) | | txvalue | double | | txgas | double | | txgasprice | double | | tx_input | varchar | |

DESCRIBE erc20; Column | Type | Extra | Comment -------------------+-------------+-------+--------- erc20token | varchar | | erc20from | varchar(42) | | erc20to | varchar(42) | | erc20value | double | | erc20txhash | varchar(66) | | erc20blocknumber | bigint | | ```

Web3 Functions

In addition to the various built-in Presto functions, some web3 functions are ported so that they can be called inline with SQL statements directly. Currently, the supported web3 functions are 1. fromWei 1. toWei 1. eth_gasPrice 1. eth_blockNumber 1. eth_getBalance 1. eth_getTransactionCount

Troubleshooting

  • You must use python2. You will get invalid syntax errors if you use Python3.

    -> bin/launcher start
    File "/your_path/presto-server-0.196/bin/launcher.py", line 38
    except OSError, e:
                  ^
    SyntaxError: invalid syntax
    
  • Use Java 8 only. You might get the following errors if you use the wrong Java version.

Unrecognized VM option 'ExitOnOutOfMemoryError'
Did you mean 'OnOutOfMemoryError='?
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

We use cookies. If you continue to browse the site, you agree to the use of cookies. For more information on our use of cookies please see our Privacy Policy.