clickhouse-bulk

by nikepan

Collects many small inserts to ClickHouse and sends them as big inserts

222 Stars 42 Forks Last release: 6 months ago (v1.2.6) Apache License 2.0 99 Commits 26 Releases

ClickHouse-Bulk

A simple Yandex ClickHouse insert collector. It collects incoming requests and sends them to ClickHouse servers in batches.

Installation

Download the binary for your platform

or

Use the Docker image

or build from source (Go 1.13+):

git clone https://github.com/nikepan/clickhouse-bulk
cd clickhouse-bulk
go build

Features

  • Groups n requests and sends them to any one of the ClickHouse servers
  • Sends collected data at a fixed interval
  • Tested with the VALUES and TabSeparated formats
  • Supports sending to multiple servers
  • Supports the query in query parameters and in the body
  • Supports other query parameters such as username, password, and database
  • Supports basic authentication

For example:

INSERT INTO table3 (c1, c2, c3) VALUES ('v1', 'v2', 'v3')
INSERT INTO table3 (c1, c2, c3) VALUES ('v4', 'v5', 'v6')

is sent as

INSERT INTO table3 (c1, c2, c3) VALUES ('v1', 'v2', 'v3')('v4', 'v5', 'v6')
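
The merge shown above can be sketched in Go as follows. This is a minimal illustration, not the project's actual implementation: the helper names `splitInsert` and `mergeInserts` are hypothetical, and the sketch assumes every statement contains an uppercase ` VALUES ` keyword and that statements with an identical prefix may be combined.

```go
package main

import (
	"fmt"
	"strings"
)

// splitInsert splits an INSERT statement into its prefix (everything up to
// and including the VALUES keyword) and the row data that follows.
// Assumption: only an uppercase " VALUES " keyword is recognized.
func splitInsert(query string) (prefix, rows string, ok bool) {
	const kw = " VALUES "
	i := strings.Index(query, kw)
	if i < 0 {
		return "", "", false
	}
	// Keep " VALUES" in the prefix but drop the trailing space.
	return query[:i+len(kw)-1], strings.TrimSpace(query[i+len(kw):]), true
}

// mergeInserts groups row data by identical prefix and emits one statement
// per prefix, preserving the order in which prefixes were first seen.
func mergeInserts(queries []string) []string {
	rowsByPrefix := make(map[string][]string)
	var order []string
	for _, q := range queries {
		prefix, rows, ok := splitInsert(q)
		if !ok {
			continue // a real collector would have to handle this case
		}
		if _, seen := rowsByPrefix[prefix]; !seen {
			order = append(order, prefix)
		}
		rowsByPrefix[prefix] = append(rowsByPrefix[prefix], rows)
	}
	merged := make([]string, 0, len(order))
	for _, prefix := range order {
		merged = append(merged, prefix+" "+strings.Join(rowsByPrefix[prefix], ""))
	}
	return merged
}

func main() {
	for _, q := range mergeInserts([]string{
		"INSERT INTO table3 (c1, c2, c3) VALUES ('v1', 'v2', 'v3')",
		"INSERT INTO table3 (c1, c2, c3) VALUES ('v4', 'v5', 'v6')",
	}) {
		fmt.Println(q)
	}
}
```

Grouping by the full prefix keeps inserts into different tables or column lists in separate statements.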

Options

  • -config - config file (json); default config.json

Configuration file

{
  "listen": ":8124",
  "flush_count": 10000, // number of rows, counted by \n characters
  "flush_interval": 1000, // milliseconds
  "dump_check_interval": 300, // interval between attempts to resend dumps (seconds); -1 to disable
  "debug": false, // log incoming requests
  "dump_dir": "dumps", // directory for dumps of unsent data (on ClickHouse errors)
  "clickhouse": {
    "down_timeout": 60, // wait time if a server is down (seconds)
    "connect_timeout": 10, // server connect timeout (seconds)
    "servers": [
      "http://127.0.0.1:8123"
    ]
  }
}
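
As an illustration of the file's shape, the keys above map onto Go structs as in the sketch below. The type names `config` and `clickhouseConfig` and the function `parseConfig` are hypothetical and not taken from the project. Note that Go's standard `encoding/json` rejects `//` comments, so the sketch parses a comment-free copy of the example; how the project itself handles comments is not stated here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clickhouseConfig mirrors the nested "clickhouse" object.
type clickhouseConfig struct {
	DownTimeout    int      `json:"down_timeout"`    // seconds
	ConnectTimeout int      `json:"connect_timeout"` // seconds
	Servers        []string `json:"servers"`
}

// config mirrors the top-level keys of the example file.
type config struct {
	Listen            string           `json:"listen"`
	FlushCount        int              `json:"flush_count"`
	FlushInterval     int              `json:"flush_interval"`      // milliseconds
	DumpCheckInterval int              `json:"dump_check_interval"` // seconds; -1 disables
	Debug             bool             `json:"debug"`
	DumpDir           string           `json:"dump_dir"`
	Clickhouse        clickhouseConfig `json:"clickhouse"`
}

// sampleConfig is the example file with the explanatory comments removed.
const sampleConfig = `{
  "listen": ":8124",
  "flush_count": 10000,
  "flush_interval": 1000,
  "dump_check_interval": 300,
  "debug": false,
  "dump_dir": "dumps",
  "clickhouse": {
    "down_timeout": 60,
    "connect_timeout": 10,
    "servers": ["http://127.0.0.1:8123"]
  }
}`

// parseConfig decodes a comment-free JSON document into a config value.
func parseConfig(data []byte) (config, error) {
	var cfg config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseConfig([]byte(sampleConfig))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(cfg.Listen, cfg.FlushCount, cfg.Clickhouse.Servers)
}
```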

Environment variables (used for docker image)

  • CLICKHOUSE_SERVERS - comma-separated list of servers
  • CLICKHOUSE_FLUSH_COUNT - number of rows per insert
  • CLICKHOUSE_FLUSH_INTERVAL - insert interval
  • DUMP_CHECK_INTERVAL - interval for resending dumps
  • CLICKHOUSE_DOWN_TIMEOUT - wait time if a server is down
  • CLICKHOUSE_CONNECT_TIMEOUT - ClickHouse server connect timeout
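
A comma-separated value such as `CLICKHOUSE_SERVERS` can be split into a server list as sketched below. The helper `parseServers` is hypothetical, and trimming whitespace and skipping empty entries are assumptions of this sketch rather than documented behavior of the project.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseServers splits a comma-separated list into individual server URLs,
// trimming surrounding whitespace and dropping empty entries.
func parseServers(value string) []string {
	var servers []string
	for _, part := range strings.Split(value, ",") {
		if s := strings.TrimSpace(part); s != "" {
			servers = append(servers, s)
		}
	}
	return servers
}

func main() {
	// Example value: "http://10.0.0.1:8123,http://10.0.0.2:8123"
	fmt.Println(parseServers(os.Getenv("CLICKHOUSE_SERVERS")))
}
```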

Quickstart

./clickhouse-bulk
and send queries to :8124
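
A client can then post inserts to the collector over HTTP. The sketch below assumes clickhouse-bulk is listening on 127.0.0.1:8124 (the default `listen` value above) and sends the whole statement in the request body; the helper names `newInsertRequest` and `sendInsert` are hypothetical, and treating only HTTP 200 as success is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// newInsertRequest builds a POST request that carries the INSERT statement
// in the request body.
func newInsertRequest(baseURL, query string) (*http.Request, error) {
	return http.NewRequest(http.MethodPost, baseURL, strings.NewReader(query))
}

// sendInsert posts one INSERT statement and reports any non-200 response.
func sendInsert(client *http.Client, baseURL, query string) error {
	req, err := newInsertRequest(baseURL, query)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s: %s", resp.Status, strings.TrimSpace(string(body)))
	}
	return nil
}

func main() {
	err := sendInsert(http.DefaultClient, "http://127.0.0.1:8124",
		"INSERT INTO table3 (c1, c2, c3) VALUES ('v1', 'v2', 'v3')")
	if err != nil {
		fmt.Println("insert failed:", err)
	}
}
```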

Metrics

Manually check the main metrics:

curl -s http://127.0.0.1:8124/metrics | grep "^ch_"

  • ch_bad_servers 0 - current count of bad servers
  • ch_dump_count 0 - dumps saved since launch
  • ch_queued_dumps 0 - current number of dump files in the directory
  • ch_good_servers 1 - current count of good servers
  • ch_received_count 40 - requests received since launch
  • ch_sent_count 1 - requests sent since launch
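
The same check can be done programmatically. The sketch below parses plain `name value` lines like those shown above and keeps only names with a given prefix; the function `parseMetrics` is hypothetical, and the sketch assumes the metrics carry no labels, as in the sample output.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMetrics reads "name value" lines and returns the metrics whose name
// starts with prefix. Blank lines and lines starting with '#' are skipped.
func parseMetrics(text, prefix string) (map[string]float64, error) {
	metrics := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || !strings.HasPrefix(line, prefix) {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // no value on this line
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return nil, fmt.Errorf("metric %s: %w", fields[0], err)
		}
		metrics[fields[0]] = v
	}
	return metrics, sc.Err()
}

func main() {
	sample := "ch_bad_servers 0\nch_good_servers 1\nch_received_count 40\nch_sent_count 1\n"
	m, err := parseMetrics(sample, "ch_")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if m["ch_bad_servers"] > 0 {
		fmt.Println("warning: some servers are marked bad")
	}
	fmt.Println("received:", m["ch_received_count"], "sent:", m["ch_sent_count"])
}
```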

Tips

For better performance, the keywords FORMAT and VALUES must be uppercase.
