TPC-DS benchmark kit with some modifications/fixes
The official TPC-DS tools can be found at tpc.org.
This version is based on v2.10.0 and has been modified to:
Rename the s_web_returns column wret_web_site_id to wret_web_page_id to match the specification. See #22 & #42.
To see all modifications, diff the files in the master branch against the version branch, e.g. master vs. v2.10.0.
On Linux, make sure the required development tools are installed:
Ubuntu:
sudo apt-get install gcc make flex bison byacc git
CentOS/RHEL:
sudo yum install gcc make flex bison byacc git
Then run the following commands to clone the repo and build the tools:
git clone https://github.com/gregrahn/tpcds-kit.git
cd tpcds-kit/tools
make OS=LINUX
On macOS, make sure the required development tools are installed:
xcode-select --install
Then run the following commands to clone the repo and build the tools:
git clone https://github.com/gregrahn/tpcds-kit.git
cd tpcds-kit/tools
make OS=MACOS
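Once make finishes, it can help to confirm that the two binaries used in the rest of this README were actually produced. The following is a minimal sketch, assuming the build leaves dsdgen and dsqgen in the tools/ directory; the helper name check_build, the TPCDS_TOOLS variable, and the default path are illustrative and not part of the kit.

```shell
#!/bin/sh
# check_build is a hypothetical helper, not part of tpcds-kit.
# It reports whether dsdgen and dsqgen exist and are executable in a directory.
check_build() {
    dir="$1"
    status=0
    for tool in dsdgen dsqgen; do
        if [ -x "$dir/$tool" ]; then
            echo "ok: $dir/$tool"
        else
            echo "missing: $dir/$tool" >&2
            status=1
        fi
    done
    return "$status"
}

# Assumed checkout location; adjust to where you cloned the repo.
check_build "${TPCDS_TOOLS:-./tpcds-kit/tools}" || echo "build incomplete; re-run make"
```

If either binary is reported missing, re-check that the development tools listed above are installed and run make again with the OS value for your platform.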
Data generation is done via dsdgen. See dsdgen -help for all options. If you do not run dsdgen from the tools/ directory then you will need to use the option -DISTRIBUTIONS /.../tpcds-kit/tools/tpcds.idx. The output directory (specified via the -DIR option) must exist prior to running dsdgen.
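The two requirements above (an existing output directory and an explicit -DISTRIBUTIONS path when running outside tools/) can be combined in a small wrapper. This is a sketch only: the TOOLS_DIR and DATA_DIR variables and their default paths are assumptions for illustration, and -SCALE 1 is an assumed small scale factor for a trial run. Check dsdgen -help for the options your build supports.

```shell
#!/bin/sh
# Hypothetical locations; adjust to your checkout and target directory.
TOOLS_DIR="${TOOLS_DIR:-$HOME/tpcds-kit/tools}"
DATA_DIR="${DATA_DIR:-/tmp/tpcds-data}"

# dsdgen expects the output directory to exist already, so create it first.
mkdir -p "$DATA_DIR"

# -SCALE 1 is an assumed value for a quick trial run.
# -DISTRIBUTIONS is passed explicitly because this script may run outside tools/.
set -- -SCALE 1 -DIR "$DATA_DIR" -DISTRIBUTIONS "$TOOLS_DIR/tpcds.idx"

if [ -x "$TOOLS_DIR/dsdgen" ]; then
    "$TOOLS_DIR/dsdgen" "$@"
else
    # Dry run: show the command instead of failing when the binary is absent.
    echo "dsdgen not found in $TOOLS_DIR; would run: dsdgen $*"
fi
```

Passing the index file by absolute path means the wrapper behaves the same regardless of the current working directory.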
Query generation is done via dsqgen. See dsqgen -help for all options.
The following command can be used to generate all 99 queries in numerical order (-QUALIFY) for the 10TB scale factor (-SCALE) using the Netezza dialect template (-DIALECT) with the output going to /tmp/query_0.sql (-OUTPUT_DIR).
dsqgen \
-DIRECTORY ../query_templates \
-INPUT ../query_templates/templates.lst \
-VERBOSE Y \
-QUALIFY Y \
-SCALE 10000 \
-DIALECT netezza \
-OUTPUT_DIR /tmp
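Because all queries land in a single file, it is often convenient to split /tmp/query_0.sql into one file per query. The sketch below assumes each query in the generated file is bracketed by comment lines beginning with "-- start query" and "-- end query"; that marker format is an assumption here, so check your own output before relying on it. The helper name split_queries and the /tmp/queries output directory are illustrative and not part of the kit.

```shell
#!/bin/sh
# split_queries is a hypothetical helper, not part of tpcds-kit.
# Usage: split_queries <combined.sql> <output_dir>
split_queries() {
    mkdir -p "$2"
    awk -v dir="$2" '
        # Assumed marker: a comment line opening each query.
        /^-- start query/ { n++; out = sprintf("%s/query_%02d.sql", dir, n) }
        # Copy lines only while inside a query block.
        out != "" { print > out }
        # Assumed marker: a comment line closing each query.
        /^-- end query/ { close(out); out = "" }
    ' "$1"
}

# Only split when the combined file from the command above is present.
if [ -f /tmp/query_0.sql ]; then
    split_queries /tmp/query_0.sql /tmp/queries
fi
```

Files are numbered by their position in the combined file, so the names line up with the query numbers only when the queries were generated in numerical order with -QUALIFY Y.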