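Option -fsplit-ldp-stp
Description
Identifies certain ldp/stp instructions that perform poorly and splits each of them into two ldr or two str instructions.
Usage
Use the -fsplit-ldp-stp option to enable the optimization.
- Use the parameter --param=param-ldp-dependency-search-range=[1,32] to control the search range. The value must be between 1 and 32; the default is 16.
Note: This option requires optimization level -O1 or higher.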
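As a minimal sketch, the option and the search-range parameter can be combined on one command line. The value 8 is an arbitrary choice within [1,32], and test.c is only a placeholder file name; the script prints the resulting command rather than running it, since the option is only accepted by a GCC build that provides it.

```shell
# Assumption: 8 is an arbitrary search range within the documented [1,32] interval.
SEARCH_RANGE=8

# -O1 (or higher) is required for -fsplit-ldp-stp to take effect.
CFLAGS="-O1 -fsplit-ldp-stp --param=param-ldp-dependency-search-range=${SEARCH_RANGE}"

# Print the full command; run it with a GCC build that supports the option.
echo "gcc ${CFLAGS} -S test.c -o test.s"
```

Omitting the --param argument leaves the search range at its default of 16.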
Results
The test case is as follows:
int __RTL (startwith ("split_complex_instructions")) simple_ldp_after_store ()
{
(function "simple_ldp_after_store"
  (insn-chain
    (block 2
      (edge-from entry (flags "FALLTHRU"))
      (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
      (cinsn 228 (set (reg/i:DI sp)
                      (reg/i:DI x0)))
      (cinsn 238 (set (reg/i:DI x1)
                      (reg/i:DI x0)))

      (cinsn 101 (set (mem/c:DI
                        (plus:DI (reg/f:DI sp)
                                 (const_int 8))[1 S4 A32])(reg:DI x0)))
      (cinsn 10 (parallel [
        (set (reg:DI x29)
             (mem:DI (plus:DI (reg/f:DI sp) (const_int 8)) [1 S4 A32]))
        (set (reg:DI x30)
             (mem:DI (plus:DI (reg/f:DI sp) (const_int 16)) [1 S4 A32]))]))

      (cinsn 102 (set (mem/c:DI (plus:DI (reg/f:DI x1)
                                         (const_int -16)) [1 S4 A32])
                      (reg:DI x0)))
      (cinsn 11 (parallel [
        (set (reg:DI x3)
             (mem:DI (plus:DI (reg/f:DI x1) (const_int -16)) [1 S4 A32]))
        (set (reg:DI x4)
             (mem:DI (plus:DI (reg/f:DI x1) (const_int -8)) [1 S4 A32]))
      ]))

      (cinsn 103 (set (mem/c:DI (reg/f:DI x1) [1 S4 A32])
                      (reg:DI x0)))
      (cinsn 12 (parallel [
        (set (reg:DI x5)
             (mem:DI (reg/f:DI x1) [1 S4 A32]))
        (set (reg:DI x6)
             (mem:DI (plus:DI (reg/f:DI x1) (const_int 8)) [1 S4 A32]))
      ]))

      (cinsn 13 (use (reg/i:DI sp)))
      (cinsn 14 (use (reg/i:DI cc)))
      (cinsn 15 (use (reg/i:DI x29)))
      (cinsn 16 (use (reg/i:DI x30)))
      (cinsn 17 (use (reg/i:DI x0)))
      (cinsn 18 (use (reg/i:DI x3)))
      (cinsn 19 (use (reg/i:DI x4)))
      (cinsn 20 (use (reg/i:DI x5)))
      (cinsn 21 (use (reg/i:DI x6)))
      (edge-to exit (flags "FALLTHRU"))
    ) ;; block 2
  ) ;; insn-chain
) ;; function "simple_ldp_after_store"
}
Test command:
gcc -O1 -fsplit-ldp-stp -S test.c -o test.s
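Figure 1 Option disabled

Figure 2 Option enabled

With the option enabled, the generated assembly no longer contains ldp instructions; each ldp is split into two ldr instructions instead.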
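One way to check this result without reading the assembly by eye is to count the mnemonics in the generated file. The helper below is a hypothetical sketch (count_mnemonic is not part of GCC or of this documentation); it assumes test.s was produced by the test command above and that each instruction sits on its own line.

```shell
# count_mnemonic FILE MNEMONIC
# Hypothetical helper: prints how many lines of FILE have MNEMONIC as the instruction.
count_mnemonic() {
  # Match the mnemonic as the first token on the line; "|| true" keeps a
  # zero count from being reported as a failing exit status by grep.
  grep -cE "^[[:space:]]*${2}([[:space:]]|\$)" "$1" || true
}

# Only inspect test.s if it exists in the current directory.
if [ -f test.s ]; then
  echo "ldp: $(count_mnemonic test.s ldp)"
  echo "ldr: $(count_mnemonic test.s ldr)"
fi
```

With the option enabled, the ldp count for this test case is expected to be 0, in line with the comparison described above.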