Importing Test Data to Test Performance
This section describes how to use the TPC-H test set to measure Doris performance after the instruction optimization.
- Download and install the TPC-H tool package.
- Copy the tpch-tools folder from the downloaded Doris source code to the /opt/tools/installed directory.
cp -r /opt/tools/installed/doris-2.1.2-rc04/tools/tpch-tools /opt/tools/installed
- Go to the tpch-tools folder.
cd /opt/tools/installed/tpch-tools
- Manually download the TPC-H dependency tool package, rename it to TPC-H_Tools_v3.0.0new.zip, and move it to the bin directory of the tpch-tools folder.
mv Downloaded_package TPC-H_Tools_v3.0.0new.zip
mv TPC-H_Tools_v3.0.0new.zip /opt/tools/installed/tpch-tools/bin
- Modify the build-tpch-dbgen.sh file and comment out the wget download command so that the script uses the package you saved locally.
- Open the file.
vi bin/build-tpch-dbgen.sh
- Press i to enter the insert mode and modify the file as follows:
#wget "https://doris-build-1308700295.cos.ap-beijing.myqcloud.com/tools/TPC-H_Tools_v3.0.0new.zip"
- Press Esc, type :wq!, and press Enter to save the file and exit.
- Generate the dbgen binary file in the TPC-H_Tools_v3.0.0/ directory.
sh bin/build-tpch-dbgen.sh
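If you prefer not to edit the script interactively in vi, the wget line can also be commented out with sed. The following is a minimal sketch, not part of the Doris tooling: the function name comment_out_wget is hypothetical, and it assumes the download command starts a line with "wget " (optionally indented), as in the line shown above.

```shell
#!/bin/sh
# Hypothetical helper: comment out every line that starts with "wget "
# in the given script. Lines that are already commented are left alone,
# so the function can be run more than once.
comment_out_wget() {
    file="$1"
    tmp="${file}.tmp.$$"
    # Write the edited copy to a temporary file, then copy the content back
    # with cat so that the original file keeps its permissions.
    sed 's/^\([[:space:]]*\)wget /\1#wget /' "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

From the tpch-tools folder, it would be called as comment_out_wget bin/build-tpch-dbgen.sh before the build script is run.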
- Modify the configuration file conf/doris-cluster.conf of the test tool.
- Open the configuration file.
vi conf/doris-cluster.conf
- Press i to enter the insert mode and modify the following content of the file:
# Any of FE host
export FE_HOST='xx.xx.xx.xx'
# http_port in fe.conf
export FE_HTTP_PORT=8030
# query_port in fe.conf
export FE_QUERY_PORT=9030
# Doris username
export USER='root'
# Doris password
export PASSWORD=''
# The database where TPC-H tables located
export DB='tpch100G'
- FE_HOST indicates the FE IP address, which is usually the IP address of the local physical machine, for example, 172.18.0.11/21.
- FE_HTTP_PORT indicates the value of the http_port parameter of the FE. The value must be the same as that in fe.conf.
- FE_QUERY_PORT indicates the value of the query_port parameter of the FE. The value must be the same as that in fe.conf.
- USER indicates the username.
- PASSWORD indicates the password. If it is not configured, leave it blank.
- DB indicates the name of the database corresponding to TPC-H.
- Press Esc, type :wq!, and press Enter to save the file and exit.
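A quick sanity check of the edited configuration can catch typos before the longer data generation and load steps start. The sketch below is an illustration only: check_doris_conf is a hypothetical helper, not something shipped with tpch-tools, and it verifies nothing beyond what the parameter descriptions above state (required values are non-empty, the two ports are numeric, and FE_HOST is no longer the placeholder).

```shell
#!/bin/sh
# Hypothetical helper: load the configuration in a subshell (so the exported
# variables do not leak into the calling shell) and check the basic fields.
check_doris_conf() (
    conf="$1"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        exit 1
    fi
    . "$conf"
    # PASSWORD is deliberately not checked: it may be left blank.
    for name in FE_HOST FE_HTTP_PORT FE_QUERY_PORT USER DB; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is empty after loading $conf" >&2
            exit 1
        fi
    done
    for port in "$FE_HTTP_PORT" "$FE_QUERY_PORT"; do
        case "$port" in
            *[!0-9]*) echo "port '$port' is not numeric" >&2; exit 1 ;;
        esac
    done
    if [ "$FE_HOST" = "xx.xx.xx.xx" ]; then
        echo "FE_HOST still holds the placeholder value" >&2
        exit 1
    fi
)
```

Run it from the tpch-tools folder as check_doris_conf conf/doris-cluster.conf; a non-zero exit status means one of the checks failed. It does not confirm that the ports match fe.conf or that the FE is reachable.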
- Generate a TPC-H dataset.
sh bin/gen-tpch-data.sh -s 100 -c 40
- -s specifies the size of the dataset in GB, for example, 10, 100, 500, or 1000. The command above generates a 100 GB dataset.
- -c specifies the number of threads used to generate data in parallel.
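Because -s is the dataset size in GB, it can be worth confirming that the target file system has room before generating the data. The following sketch is an assumption-laden illustration: the function names are hypothetical, it treats a value of N for -s as roughly N GB of raw files, and it does not estimate the additional space that Doris needs once the data is loaded.

```shell
#!/bin/sh
# Hypothetical helpers for a rough free-space check before data generation.

# Convert a dataset size in GB (the -s value) to KB.
required_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

# Free space, in KB, on the file system that holds the given directory.
available_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds if the directory has at least as much free space as the dataset size.
has_space_for_scale() {
    [ "$(available_kb "$1")" -ge "$(required_kb "$2")" ]
}
```

For example, has_space_for_scale /opt/tools/installed/tpch-tools 100 returns a non-zero status if less than about 100 GB is free there.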
- Generate TPC-H data tables.
sh bin/create-tpch-tables.sh
- Import data.
sh bin/load-tpch-data.sh -c 40
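The generate, create, and load steps above can be chained so that a failure in one step stops the sequence. This is a sketch under stated assumptions rather than a supported workflow: SCALE, THREADS, DRY_RUN, run_step, and prepare_tpch are names introduced here for illustration, and the commands themselves are exactly the ones shown in the steps above.

```shell
#!/bin/sh
# Hypothetical wrapper around the three data-preparation commands.
# SCALE and THREADS mirror the -s and -c values used above.
# DRY_RUN=1 (the default in this sketch) only prints the commands.
SCALE="${SCALE:-100}"
THREADS="${THREADS:-40}"
DRY_RUN="${DRY_RUN:-1}"

# Print a command and, unless DRY_RUN is 1, execute it.
run_step() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Run the steps in order; "&&" stops at the first command that fails.
prepare_tpch() {
    run_step sh bin/gen-tpch-data.sh -s "$SCALE" -c "$THREADS" &&
    run_step sh bin/create-tpch-tables.sh &&
    run_step sh bin/load-tpch-data.sh -c "$THREADS"
}
```

To execute the commands for real, set DRY_RUN to 0 and call prepare_tpch from the /opt/tools/installed/tpch-tools folder, because the script paths are relative to it.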
- Run the test SQL statements to compare the performance of the open source doris_be with that of the newly compiled doris_be.
sh bin/run-tpch-queries.sh -s 100
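After running the queries once against each doris_be build, the two total times can be reduced to a single improvement figure. The helper below is a hypothetical sketch: speedup_pct is not part of tpch-tools, and it assumes you read the two totals from the script output yourself and pass them in the same unit, since the output format is not described in this section.

```shell
#!/bin/sh
# Hypothetical helper: percentage reduction in total query time.
#   $1 = total time with the open source doris_be (baseline)
#   $2 = total time with the newly compiled doris_be
# Both values must use the same unit. A positive result means the new build is faster.
speedup_pct() {
    awk -v base="$1" -v opt="$2" 'BEGIN {
        if (base <= 0) exit 1
        printf "%.2f\n", (base - opt) / base * 100
    }'
}
```

For example, speedup_pct 200 150 prints 25.00, meaning the total time dropped by 25 percent relative to the baseline.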