Ruby extension to parse, deparse and normalize SQL queries using the PostgreSQL query parser
This Ruby extension uses the actual PostgreSQL server source to parse SQL queries and return the internal PostgreSQL parsetree.
In addition the extension allows you to normalize queries (replacing constant values with ?) and parse these normalized queries into a parsetree again.
When you build this extension, it builds parts of the PostgreSQL server source (see libpg_query), and then statically links it into this extension.
This is slightly crazy, but is the only reliable way of parsing all valid PostgreSQL queries.
You can find further examples and a longer rationale here: https://pganalyze.com/blog/parse-postgresql-queries-in-ruby.html
gem install pg_query
Due to compiling parts of PostgreSQL, installation might take a while on slower systems. Expect up to 5 minutes.
=> #<:parserresult:0x00007fb69a958820 version: stmts: stmt: select_stmt: distinct_clause: target_list: res_target: name: indirection: val: a_const: integer: ival:> >, location: 7 > >, location: 7 > > ], from_clause: , group_clause: , window_clause: , values_lists: , sort_clause: , limit_option: :LIMIT_OPTION_DEFAULT, locking_clause: , op: :SETOP_NONE, all: false > >, stmt_location: 0, stmt_len: 0 > ] >, @warnings=> </:parserresult:0x00007fb69a958820>
parsed_query = PgQuery.parse("SELECT * FROM users")
Modify the parse tree in some way
parsed_query.tree.stmts.stmt.select_stmt.from_clause.range_var.relname = 'other_users'
Turn it into SQL again
parsed_query.deparse => "SELECT * FROM other_users"
# Normalizing a query (like pg_stat_statements in Postgres 10+) PgQuery.normalize("SELECT 1 FROM x WHERE y = 'foo'")
=> "SELECT $1 FROM x WHERE y = $2"
Parsing a normalized query (pre-Postgres 10 style)
PgQuery.parse("SELECT ? FROM x WHERE y = ?")
=> #<:parserresult:0x00007fb69a97a5d8 from x where y='?",' ...>, @warnings=> </:parserresult:0x00007fb69a97a5d8>
PgQuery.parse("SELECT ? FROM x JOIN y USING (id) WHERE z = ?").tables
=> ["x", "y"]
PgQuery.parse("SELECT ? FROM x WHERE x.y = ? AND z = ?").filter_columns
=> [["x", "y"], [nil, "z"]]
PgQuery.parse("SELECT 2; --- comment").fingerprint
Faster fingerprint method that is implemented inside the native C library
This gem is based on libpg_query, which uses the latest stable PostgreSQL version, but with a patch applied to support parsing normalized queries containing
Currently tested and officially supported Ruby versions:
pg_queryrelies on a C extension, which is discouraged / not properly supported for JRuby
pg_queryuses a copy of the Postgres parser & error handling code)
In order to update to a newer Postgres parser, first update libpg_query to the new Postgres version and tag a release.
Once that is done, follow the following steps:
rake update_sourceto update the source code
Rakefileand the modified files in
ext/pg_queryto this source tree and make a PR
See libpg_query for pgquery in other languages, as well as products/tools built on pgquery.
Copyright (c) 2015, pganalyze Team [email protected]
pg_query is licensed under the 3-clause BSD license, see LICENSE file for details.
Query normalization code:
Copyright (c) 2008-2015, PostgreSQL Global Development Group