Basic Branch Judgment
A branch structure is one of the most basic and frequently used building blocks of a program: it selects a different execution path according to the result of a condition, so that different function logic can be implemented.
- B.cond series
In Arm64 (AArch64), the processor state holds four condition flags: N, Z, C, and V. They can be read and written through the NZCV system register and correspond to the flag bits of the CPSR program status register in 32-bit Arm. Arm assembly implements branch structures by setting these flags and then testing them. The basic usage is as follows:
    CMP/CCMP/ADDS/SUBS/ANDS/TST… …    // Instructions that update the condition flags
    B.cond label                      // Test the condition. If it is met, go to the label address.
B is a jump instruction, and cond is a condition code suffix of the jump instruction. Each condition code tests a different combination of flags, as shown in Table 1. If B is not followed by a cond suffix, it jumps to the label address unconditionally.
Table 1 Status flags

| ID | Mnemonic | Description | Flags |
|------|-------|--------------------------------------|------------------|
| 0000 | EQ | Equal | Z==1 |
| 0001 | NE | Not equal | Z==0 |
| 0010 | HS/CS | Unsigned higher or same; carry set | C==1 |
| 0011 | LO/CC | Unsigned lower; carry clear (borrow) | C==0 |
| 0100 | MI | Negative | N==1 |
| 0101 | PL | Positive or zero | N==0 |
| 0110 | VS | Signed overflow | V==1 |
| 0111 | VC | No signed overflow | V==0 |
| 1000 | HI | Unsigned > | C==1 && Z==0 |
| 1001 | LS | Unsigned <= | !(C==1 && Z==0) |
| 1010 | GE | Signed >= | N==V |
| 1011 | LT | Signed < | N!=V |
| 1100 | GT | Signed > | Z==0 && N==V |
| 1101 | LE | Signed <= | !(Z==0 && N==V) |
| 1110 | AL | Always executed | Any |
| 1111 | NV | Always executed (behaves the same as AL) | Any |
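To make the table concrete, the following C sketch models how CMP sets the four flags and how a 4-bit condition ID is evaluated against them. The type `Flags` and the functions `flags_after_cmp` and `cond_holds` are hypothetical names chosen for this illustration. They model the documented behavior and are not part of any Arm library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four condition flags. */
typedef struct {
    bool n, z, c, v;
} Flags;

/* Flags produced by CMP a, b (a subtraction whose result is discarded). */
static Flags flags_after_cmp(uint64_t a, uint64_t b)
{
    uint64_t diff = a - b;
    Flags f;
    f.n = (diff >> 63) & 1;                   /* result is negative when read as signed */
    f.z = (diff == 0);                        /* operands are equal */
    f.c = (a >= b);                           /* carry set means no unsigned borrow */
    f.v = (((a ^ b) & (a ^ diff)) >> 63) & 1; /* signed overflow of a - b */
    return f;
}

/* Evaluate a 4-bit condition ID from Table 1 against the flags. */
static bool cond_holds(unsigned cond, Flags f)
{
    bool r;
    switch ((cond & 0xF) >> 1) {
    case 0: r = f.z; break;                    /* EQ / NE */
    case 1: r = f.c; break;                    /* HS / LO */
    case 2: r = f.n; break;                    /* MI / PL */
    case 3: r = f.v; break;                    /* VS / VC */
    case 4: r = f.c && !f.z; break;            /* HI / LS */
    case 5: r = (f.n == f.v); break;           /* GE / LT */
    case 6: r = !f.z && (f.n == f.v); break;   /* GT / LE */
    default: return true;                      /* AL / NV */
    }
    /* The lowest bit of the ID inverts the base condition (except AL / NV). */
    return (cond & 1) ? !r : r;
}
```

For example, after comparing the bit pattern of -1 with 1, LT holds because the signed view is -1 < 1, while HI also holds because the unsigned view is the largest 64-bit value.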
The following examples show how Arm64 assembly implements the common branch structures of a C program.

Single-condition judgment

C program implementation:

    /* Assumption: Cond A is a > b */
    if (Cond A) {
        Part 1
    } else {
        Part 2
    }
    End

Assembly implementation:

    /* Assume the values of a and b are stored in registers x1 and x2. Note that the value of b can be a variable loaded from its address or a constant. */
    ldr x1, [a_addr]
    ldr x2, [b_addr]
    cmp x1, x2
    b.le Part2            /* signed a <= b: Cond A is false, skip Part 1 */
    (Part1:)
    ... ...
    b End
    Part2:
    ... ...
    End:
    ... ...

Dual-condition judgment

C program implementation:

    /* Assumptions: Cond A is a > b
                    Cond B is a > c */
    if (Cond A && Cond B) {
        Part 1
    } else {
        Part 2
    }
    End

Assembly implementation:

    /* Assume the values of a, b and c are stored in registers x1, x2 and x3. */
    ldr x1, [a_addr]
    ldr x2, [b_addr]
    ldr x3, [c_addr]
    cmp x1, x2
    ccmp x1, x3, #0, hi   /* if a > b, compare a with c; otherwise set NZCV to 0000 so that hi fails */
    b.hi Part1            /* taken only if a > b and a > c */
    Part2:
    ... ...
    b End
    Part1:
    ... ...
    End:
    /* hi is an unsigned comparison. For signed operands, use "ccmp x1, x3, #4, gt" followed by "b.gt Part1". */
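The dual-condition sequence relies on CCMP performing the second comparison only when the first condition holds, and loading an immediate flag value otherwise. The C sketch below models that behavior for the hi condition. Only the Z and C flags are modeled because hi reads no others, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits in the order used by the CCMP #nzcv immediate. */
enum { FLAG_Z = 4, FLAG_C = 2 };

/* Z and C after CMP a, b. N and V are omitted because hi does not read them. */
static unsigned cmp_flags(uint64_t a, uint64_t b)
{
    unsigned f = 0;
    if (a == b) f |= FLAG_Z;   /* equal operands set Z */
    if (a >= b) f |= FLAG_C;   /* no unsigned borrow sets C */
    return f;
}

/* hi: unsigned higher, that is C == 1 and Z == 0. */
static bool is_hi(unsigned flags)
{
    return (flags & FLAG_C) && !(flags & FLAG_Z);
}

/* CCMP a, c, #nzcv, hi: compare only if hi already holds, else load nzcv. */
static unsigned ccmp_hi(unsigned flags, uint64_t a, uint64_t c, unsigned nzcv)
{
    return is_hi(flags) ? cmp_flags(a, c) : nzcv;
}

/* Returns true when "b.hi Part1" would be taken. */
static bool takes_part1(uint64_t a, uint64_t b, uint64_t c)
{
    unsigned flags = cmp_flags(a, b);   /* cmp  x1, x2         */
    flags = ccmp_hi(flags, a, c, 0);    /* ccmp x1, x3, #0, hi */
    return is_hi(flags);                /* b.hi Part1          */
}
```

With the immediate 0, a failed first comparison leaves C clear, so hi cannot hold and execution falls through to Part2. This is why the choice of the immediate matters as much as the condition.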
- Special judgment instructions
B.cond can express every kind of condition. For some simple or special tests, however, Arm64 provides dedicated instructions that test a register and branch in a single instruction, as shown in the following table.

| Instruction | Example | Description |
|------|-------------------|-----------------------------------------------------------|
| CBZ | CBZ X1, label | If x1 == 0, go to label. |
| CBNZ | CBNZ X1, label | If x1 != 0, go to label. |
| TBZ | TBZ X1, #3, label | If bit 3 of the x1 register (counting from bit 0) is 0, go to label. |
| TBNZ | TBNZ X1, #3, label | If bit 3 of the x1 register (counting from bit 0) is not 0, go to label. |
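As a minimal sketch, the branch decisions of these four instructions can be written as C predicates. The function names are hypothetical, and each returns true when the corresponding instruction would jump to label.

```c
#include <stdbool.h>
#include <stdint.h>

/* CBZ Xn, label: jump when the whole register is zero. */
static bool cbz_taken(uint64_t xn)
{
    return xn == 0;
}

/* CBNZ Xn, label: jump when the register is not zero. */
static bool cbnz_taken(uint64_t xn)
{
    return xn != 0;
}

/* TBZ Xn, #bit, label: jump when the selected bit (0 to 63) is clear. */
static bool tbz_taken(uint64_t xn, unsigned bit)
{
    return ((xn >> bit) & 1) == 0;
}

/* TBNZ Xn, #bit, label: jump when the selected bit (0 to 63) is set. */
static bool tbnz_taken(uint64_t xn, unsigned bit)
{
    return ((xn >> bit) & 1) != 0;
}
```

Bit numbers start at 0, so `#3` selects the bit with value 8: `tbnz_taken(8, 3)` is true, while `tbnz_taken(4, 3)` is false.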
Let's look at the application scenarios of TBZ and TBNZ. By definition, these instructions only test whether one bit of a value is 0. Because a computer stores values in binary, however, a single bit test can also be used to determine how the value relates to 2^n.
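Example 1: TBNZ

C program implementation:

    if (x1 >= 64) {
        part1
    } else if (x1 >= 32) {
        part2
    } else if (x1 >= 16) {
        part3
    } else {
        part4
    }

Assembly implementation:

    /* A taken tbnz needs no range assumption: a set bit n always means x1 >= 2^n (unsigned).
       The bits are tested from high to low. The fall-through paths match the C code only while x1 < 128. */
    tbnz x1, #6, Part1 /* bit 6 set: x1 >= 64 */
    tbnz x1, #5, Part2 /* bit 5 set: x1 >= 32 */
    tbnz x1, #4, Part3 /* bit 4 set: x1 >= 16 */
    ... ...            /* part4 */
    b end
    Part3:
    ... ...
    b end
    Part2:
    ... ...
    b end
    Part1:
    ... ...
    end:
    ... ...

Example 2: TBZ

C program implementation:

    /* Assumption: x1 < 128 */
    if (x1 >= 64) {
        part1
    } else if (x1 >= 32) {
        part2
    } else if (x1 >= 16) {
        part3
    } else {
        part4
    }

Assembly implementation:

    /* Ensure that x1 is less than 128, which means that every bit above bit 6 is 0. */
    cmp x1, #128
    b.hs end           /* unsigned >=, so values with the top bit set are rejected too */
    tbz x1, #6, Part2  /* bit 6 clear: x1 < 64 */
    ... ...            /* part1 */
    b end
    Part2:
    tbz x1, #5, Part3  /* bit 5 clear: x1 < 32 */
    ... ...            /* part2 */
    b end
    Part3:
    tbz x1, #4, Part4  /* bit 4 clear: x1 < 16 */
    ... ...            /* part3 */
    b end
    Part4:
    ... ...
    end:
    ... ...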
As shown in the foregoing code, a chain of TBNZ instructions can select branches whose entry condition is that the value is greater than or equal to 2^n, and a chain of TBZ instructions can select branches whose entry condition is that the value is less than 2^n. Note that a 0 in bit n does not by itself mean that the value is less than 2^n: a higher-order bit may still be 1, which makes the whole value greater. The same applies to the fall-through path of TBNZ, because a branch that is not taken only shows that bit n is 0. Therefore, when TBZ is used for this kind of judgment, the range of the value must first be limited to less than 2^(n+1).
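The range restriction can be checked with a small C sketch that compares a ladder of threshold comparisons with a ladder of single-bit tests. The function names are hypothetical, and the return value is simply the number of the part that would run.

```c
#include <stdint.h>

/* Reference ladder using comparisons, as in the C program. */
static int part_by_compare(uint64_t x)
{
    if (x >= 64) return 1;
    if (x >= 32) return 2;
    if (x >= 16) return 3;
    return 4;
}

/* The same ladder using only single-bit tests, highest bit first. */
static int part_by_bits(uint64_t x)
{
    if ((x >> 6) & 1) return 1;   /* bit 6 set */
    if ((x >> 5) & 1) return 2;   /* bit 5 set */
    if ((x >> 4) & 1) return 3;   /* bit 4 set */
    return 4;
}

/* Counts the inputs in [0, limit) on which the two ladders disagree. */
static unsigned count_mismatches(uint64_t limit)
{
    unsigned bad = 0;
    for (uint64_t x = 0; x < limit; x++) {
        if (part_by_compare(x) != part_by_bits(x)) {
            bad++;
        }
    }
    return bad;
}
```

Under these assumptions the two ladders agree on every value below 128, and they start to disagree at 128, where bit 6 is 0 although the value is greater than 64.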
- csel: branch result selection instruction
A branch structure is needed because different cases require different processing. It normally introduces a judgment instruction and a jump instruction, and the jump may disturb the instruction pipeline. However, when the processing is relatively simple and ends by assigning values to a few registers, the csel instruction can be used to avoid the jump. The simplest case is similar to a ternary operator, as shown below.
C program implementation:

    int a = 3;
    int b = 4;
    int c = (a > b) ? a : b;

Assembly implementation:

    /* Assume the values of a and b are stored at addr_a and addr_b, and the value of c will be stored at addr_c. */
    ... ...
    ldr vala, [addr_a]
    ldr valb, [addr_b]
    cmp vala, valb
    csel valc, vala, valb, gt   /* valc = (vala > valb) ? vala : valb */
    str valc, [addr_c]
    ... ...
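The idea behind csel, computing both candidates and then selecting one without a jump, can be sketched in portable C with a bit mask. This only models the semantics: `csel_model`, `max_i64` and `clamp_i64` are hypothetical helpers, and whether a compiler actually emits csel for such code depends on the compiler and its options.

```c
#include <stdint.h>

/* Model of "csel dst, if_true, if_false, cond": select without a jump. */
static int64_t csel_model(int cond, int64_t if_true, int64_t if_false)
{
    /* All ones when cond holds, all zeros otherwise. */
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (int64_t)(((uint64_t)if_true & mask) | ((uint64_t)if_false & ~mask));
}

/* c = (a > b) ? a : b, as in the example above. */
static int64_t max_i64(int64_t a, int64_t b)
{
    return csel_model(a > b, a, b);
}

/* Two selections in a row clamp x to the range [lo, hi]. */
static int64_t clamp_i64(int64_t x, int64_t lo, int64_t hi)
{
    int64_t t = csel_model(x > lo, x, lo);   /* max(x, lo) */
    return csel_model(t < hi, t, hi);        /* min(t, hi) */
}
```

With the values from the example, `max_i64(3, 4)` yields 4. The clamp shows that short chains of selections can also replace small if/else ladders, as long as each step only assigns a value.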