鲲鹏社区首页
中文
注册
我要评分
文档获取效率
文档正确性
内容完整性
文档易理解
在线提单
论坛求助

执行openLooKeng引擎业务

openLooKeng使用CLI来执行SQL任务。

本次任务示例使用tpch的1T数据的非分区表作为测试表,测试SQL为tpch-sql6。

相关的表信息如表1所示。

表1 相关表信息

表名

表格式

总数

占用空间

lineitem

orc

5999989709

169.6 G

具体步骤如下。

  1. 运行openLooKeng CLI命令。
    1
    2
    3
    4
    cd bin/
    ./hetu-cli –server coordinatorIp:port –catalog omnidata
    set session omni_data_enabled=true;
    use hive.tpch_flat_orc_date_1000;
    

    通过explain可以看到执行计划中,已经将算子下推到Scan阶段。

    Output[revenue] 
    │ Layout: [sum:double] 
    │ Estimates: {rows: 4859991664 (40.74GB), cpu: 246.43G, memory: 86.00GB, network: 45.26GB} 
    │ revenue := sum 
    └─ Aggregate(FINAL) 
    │ Layout: [sum:double] 
    │ Estimates: {rows: 4859991664 (40.74GB), cpu: 246.43G, memory: 86.00GB, network: 45.26GB} 
    │ sum := sum(sum_4) 
    └─ LocalExchange[SINGLE] () 
    │ Layout: [sum_4:double] 
    │ Estimates: {rows: 5399990738 (45.26GB), cpu: 201.17G, memory: 45.26GB, network: 45.26GB} 
    └─ RemoteExchange[GATHER] 
    │ Layout: [sum_4:double] 
    │ Estimates: {rows: 5399990738 (45.26GB), cpu: 201.17G, memory: 45.26GB, network: 45.26GB} 
    └─ Aggregate(PARTIAL) 
    │ Layout: [sum_4:double] 
    │ Estimates: {rows: 5399990738 (45.26GB), cpu: 201.17G, memory: 45.26GB, network: 0B} 
    │ sum_4 := sum(expr) 
    └─ ScanProject[table = hive:tpch_flat_orc_date_1000:lineitem offload={ filter=[AND(AND(BETWEEN(l_discount, 0.05, 0.07), LESS_THAN(l_quantity, 25.0)), AND(GREATER_THAN_OR_EQUAL(l_shipdate, 8401), LESS_THAN(l_shipdate, 8766)))]} ] 
     Layout: [expr:double] 
     Estimates: {rows: 5999989709 (50.29GB), cpu: 100.58G, memory: 0B, network: 0B}/{rows: 5999989709 (50.29GB), cpu: 150.87G, memory: 0B, network: 0B} 
     expr := (l_extendedprice) * (l_discount) 
     l_extendedprice := l_extendedprice:double:5:REGULAR 
     l_discount := l_discount:double:6:REGULAR
  2. 执行sql6。
    1
    2
    3
    4
    5
    6
    7
    8
    9
    select
    sum(l_extendedprice * l_discount) as revenue
    from
    tpch_flat_orc_1000.lineitem
    where
    l_shipdate >= '1993-01-01'
    and l_shipdate < '1994-01-01'
    and l_discount between 0.06 - 0.01 and 0.06 + 0.01
    and l_quantity < 25;
    
    • 下推特性关闭时,执行时间13s。

    • 下推特性开启时,执行时间6s。

    执行任务时会打印下推的信息,如下所示,包含了下推的选择率以及算子信息。

    2021-09-04T20:39:50.803+0800 INFO Query-20210904_123950_00015_ibj42-1839 io.prestosql.plugin.hive.rule.HiveFilterPushdown Offloading: table lineitem, size[5999989709], predicate[AND(AND(BETWEEN(l_discount, 0.05, 0.07), LESS_THAN(l_quantity, 25.0)), AND(GREATER_THAN_OR_EQUAL(l_shipdate, 8401), LESS_THAN(l_shipdate, 8766)))], filter factor[90.00%].
     2021-09-04T20:39:53.415+0800 INFO dispatcher-query-17 io.prestosql.event.QueryMonitor TIMELINE: Query 20210904_123950_00015_ibj42 :: Transaction:[9b817dfa-xxxx-xxxx-xxxx-xxxxxbb805ea] :: elapsed 2636ms :: planning 43ms :: waiting 3ms :: scheduling 27ms :: running 2436ms :: finishing 130ms :: begin 2021-09-04T20:39:50.768+08:00 :: end 2021-09-04T20:39:53.404+08:00