Basic Branch Judgment

A branch structure is one of the most basic and frequently used constructs in a program: it selects one of several execution paths according to the result of a condition, so that different logic runs in different cases.

  • B.cond series

    In Arm64 (AArch64), the four condition flags N, Z, C, and V are held in the processor state (PSTATE) and can be accessed through the NZCV register. In 32-bit Arm they are part of the CPSR program status register. Arm assembly implements branch structures by setting these flags and then testing them. The basic usage is as follows:

    CMP/CCMP/ADDS/SUBS/ANDS/TST… … // Instructions that update the condition flags

    B.cond label // Test the condition. If it is met, jump to the label address.

    B is the jump instruction and cond is a condition code suffix. Each condition code tests a particular combination of flags, as shown in Table 1. A B instruction without a cond suffix jumps to the label address unconditionally.

    Table 1 Condition codes and the flags they test

    ID    Mnemonic  Description                                Flags tested
    0000  EQ        Equal                                      Z==1
    0001  NE        Not equal                                  Z==0
    0010  HS/CS     Unsigned higher or same (carry set)        C==1
    0011  LO/CC     Unsigned lower (carry clear)               C==0
    0100  MI        Negative                                   N==1
    0101  PL        Positive or zero                           N==0
    0110  VS        Signed overflow                            V==1
    0111  VC        No signed overflow                         V==0
    1000  HI        Unsigned >                                 C==1 && Z==0
    1001  LS        Unsigned <=                                !(C==1 && Z==0)
    1010  GE        Signed >=                                  N==V
    1011  LT        Signed <                                   N!=V
    1100  GT        Signed >                                   Z==0 && N==V
    1101  LE        Signed <=                                  !(Z==0 && N==V)
    1110  AL        Always executed                            Any
    1111  NV        Always executed (behaves the same as AL)   Any
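
    The flag conditions in Table 1 can be checked with a small C model. The sketch below is an illustration only: the names Flags, cmp_flags, and cond_holds are hypothetical, and cmp_flags assumes that CMP behaves as a 64-bit subtraction whose result is discarded.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four condition flags (hypothetical helper type for this sketch). */
typedef struct {
    bool n, z, c, v;
} Flags;

/* Model of "cmp a, b" on 64-bit registers: flags of a - b. */
Flags cmp_flags(uint64_t a, uint64_t b)
{
    uint64_t diff = a - b;
    Flags f;
    f.n = (diff >> 63) != 0;   /* result is negative when read as signed */
    f.z = (diff == 0);         /* operands are equal */
    f.c = (a >= b);            /* no borrow, that is, unsigned a >= b */
    /* Signed overflow: a and b differ in sign and the result's sign differs from a's. */
    f.v = (((a ^ b) & (a ^ diff)) >> 63) != 0;
    return f;
}

/* Evaluate the 4-bit ID from Table 1 against the flags. */
bool cond_holds(unsigned cond, Flags f)
{
    bool r;
    switch (cond >> 1) {
    case 0: r = f.z; break;                   /* EQ / NE */
    case 1: r = f.c; break;                   /* HS / LO */
    case 2: r = f.n; break;                   /* MI / PL */
    case 3: r = f.v; break;                   /* VS / VC */
    case 4: r = f.c && !f.z; break;           /* HI / LS */
    case 5: r = (f.n == f.v); break;          /* GE / LT */
    case 6: r = !f.z && (f.n == f.v); break;  /* GT / LE */
    default: return true;                     /* AL / NV */
    }
    /* The lowest ID bit selects the inverse of each pair. */
    return (cond & 1) ? !r : r;
}
```

    For example, comparing the bit pattern of -1 with 1 satisfies both LT (signed) and HI (unsigned), which is why the signed and unsigned condition codes must not be mixed up.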


    The following examples show how Arm64 assembly can implement the common branch structures of a C program.

    Single-condition judgment

    C program implementation:

        /* Assumption: Cond A is a > b */
        if (Cond A) {
            Part 1
        } else {
            Part 2
        }
        End

    Assembly implementation:

        /* Assume the values of a and b are stored in registers x1 and x2. Note that the value of b can be a variable loaded from its address or a constant. */
        ldr x1, [a_addr]
        ldr x2, [b_addr]
        cmp x1, x2
        b.le Part2          // signed a <= b: skip Part 1
        (Part1:)
        ... ...
        b End
        Part2:
        ... ...
        End:
        ... ...

    Dual-condition judgment

    C program implementation:

        /* Assumptions: Cond A is a > b, Cond B is a > c */
        if (Cond A && Cond B) {
            Part 1
        } else {
            Part 2
        }
        End

    Assembly implementation:

        /* Assume the values of a, b and c are stored in registers x1, x2 and x3. This example uses the unsigned condition hi. */
        ldr x1, [a_addr]
        ldr x2, [b_addr]
        ldr x3, [c_addr]
        cmp x1, x2
        ccmp x1, x3, #0, hi // if a > b, compare a with c; otherwise set NZCV to 0000 so that hi fails
        b.hi Part1
        Part2:
        ... ...
        b End
        Part1:
        ... ...
        End:
        ... ...
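
    To see why the immediate #0 works with hi in the dual-condition sequence, the behavior of cmp followed by ccmp can be modeled in C. This is a sketch under stated assumptions: all function names are hypothetical, and the signed variant (ccmp with #4 and gt) is not from the original example. It is added only to show that the immediate has to be chosen so that the final condition fails.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits packed the same way as the #nzcv immediate of CCMP. */
enum { FLAG_N = 8, FLAG_Z = 4, FLAG_C = 2, FLAG_V = 1 };

/* Flags produced by "cmp a, b" on 64-bit registers. */
unsigned cmp_nzcv(uint64_t a, uint64_t b)
{
    uint64_t diff = a - b;
    unsigned f = 0;
    if (diff >> 63) f |= FLAG_N;
    if (diff == 0) f |= FLAG_Z;
    if (a >= b) f |= FLAG_C;
    if (((a ^ b) & (a ^ diff)) >> 63) f |= FLAG_V;
    return f;
}

bool is_hi(unsigned f) { return (f & FLAG_C) && !(f & FLAG_Z); }

bool is_gt(unsigned f)
{
    return !(f & FLAG_Z) && (((f & FLAG_N) != 0) == ((f & FLAG_V) != 0));
}

/* cmp a, b ; ccmp a, c, #0, hi ; b.hi Part1  (returns true when Part1 runs) */
bool both_greater_unsigned(uint64_t a, uint64_t b, uint64_t c)
{
    unsigned f = cmp_nzcv(a, b);
    f = is_hi(f) ? cmp_nzcv(a, c) : 0u;  /* CCMP: compare again, or load #0 */
    return is_hi(f);                     /* NZCV = 0000 has C == 0, so hi fails */
}

/* Assumed signed variant: cmp a, b ; ccmp a, c, #4, gt ; b.gt Part1.
   #0 would not work here, because NZCV = 0000 satisfies gt. */
bool both_greater_signed(int64_t a, int64_t b, int64_t c)
{
    unsigned f = cmp_nzcv((uint64_t)a, (uint64_t)b);
    f = is_gt(f) ? cmp_nzcv((uint64_t)a, (uint64_t)c) : (unsigned)FLAG_Z;
    return is_gt(f);                     /* Z == 1 makes gt fail */
}
```

    In both variants the immediate plays the role of the "false" result of the first comparison, so a single conditional branch after ccmp covers the whole && expression.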

  • Special judgment instructions

    B.cond can express any branch condition. For some simple or special tests, however, Arm64 provides dedicated instructions that test a register and branch in a single step, as shown in the following table.

    Instruction  Example             Description
    CBZ          CBZ X1, label       If x1 == 0, go to label.
    CBNZ         CBNZ X1, label      If x1 != 0, go to label.
    TBZ          TBZ X1, #3, label   If bit 3 of the x1 register (counting from bit 0) is 0, go to label.
    TBNZ         TBNZ X1, #3, label  If bit 3 of the x1 register (counting from bit 0) is not 0, go to label.

    Let's look at the application scenarios of TBZ and TBNZ. By definition, these instructions only test whether a single bit of a value is 0. Because values are stored in binary, however, the state of bit n can also be used to determine how the value compares with 2^n.

    Example using TBNZ

    C program implementation:

        /* Assumption: x1 is unsigned and x1 < 128 */
        if (x1 >= 64) {
            part1
        } else if (x1 >= 32) {
            part2
        } else if (x1 >= 16) {
            part3
        } else {
            part4
        }

    Assembly implementation:

        /* Test from the highest bit down. A set bit n proves that x1 >= 2^n. Falling through to the next test is valid only because x1 < 128, so that no bit above bit 6 is set. */
        tbnz x1, #6, Part1 /* over64 */
        tbnz x1, #5, Part2 /* over32 */
        tbnz x1, #4, Part3 /* over16 */
        ... ...            /* part4 */
        b end
        Part3:
        ... ...
        b end
        Part2:
        ... ...
        b end
        Part1:
        ... ...
        end:
        ... ...

    Example using TBZ

    C program implementation:

        /* Assumption: x1 is unsigned and x1 < 128 */
        if (x1 >= 64) {
            part1
        } else if (x1 >= 32) {
            part2
        } else if (x1 >= 16) {
            part3
        } else {
            part4
        }

    Assembly implementation:

        /* Ensure that x1 is less than 128, which means that all bits above bit 6 are 0. */
        cmp x1, #128
        b.hs end           /* unsigned x1 >= 128 */
        tbz x1, #6, Part2  /* 64less */
        ... ...            /* part1 */
        b end
        Part2:
        tbz x1, #5, Part3  /* 32less */
        ... ...
        b end
        Part3:
        tbz x1, #4, Part4  /* 16less */
        ... ...
        b end
        Part4:
        ... ...
        end:
        ... ...

    As shown in the foregoing code, TBNZ can be used in a chain to select branches whose entry condition is a value greater than or equal to 2^n, and TBZ can be used in a chain to select branches whose entry condition is a value less than 2^n. Note that a set bit n always means that the unsigned value is at least 2^n, but a clear bit n does not by itself mean that the value is less than 2^n: a higher-order bit may be 1, which makes the whole value greater. Therefore, whenever a clear bit is used to conclude that the value is below 2^n, whether through TBZ or by falling through a TBNZ, the value must first be limited to less than 2^(n+1).
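
    The range requirement can be checked exhaustively with a small C model. The sketch below is illustrative only, with hypothetical function names: classify_compare is the if/else-if chain from the C examples, classify_bits is the same chain built from single-bit tests, and mismatches_below counts the inputs on which they disagree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reference behavior: the comparison chain from the C examples. Returns 1 to 4. */
int classify_compare(uint64_t x)
{
    if (x >= 64) return 1;
    if (x >= 32) return 2;
    if (x >= 16) return 3;
    return 4;
}

/* What TBZ/TBNZ look at: a single bit of the register. */
bool bit_set(uint64_t x, unsigned bit)
{
    return ((x >> bit) & 1u) != 0;
}

/* The same chain using only single-bit tests, highest bit first. */
int classify_bits(uint64_t x)
{
    if (bit_set(x, 6)) return 1;  /* bit 6 set */
    if (bit_set(x, 5)) return 2;  /* bit 5 set */
    if (bit_set(x, 4)) return 3;  /* bit 4 set */
    return 4;
}

/* Count the inputs in [0, limit) on which the two versions disagree. */
unsigned mismatches_below(uint64_t limit)
{
    unsigned count = 0;
    for (uint64_t x = 0; x < limit; ++x) {
        if (classify_compare(x) != classify_bits(x)) {
            ++count;
        }
    }
    return count;
}
```

    Below 128 the two versions agree on every input. Once values of 128 or more are allowed, an input such as 160 (bit 7 and bit 5 set, bit 6 clear) is classified differently, which is the case the range check guards against.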

  • csel: branch result selection instruction

    Branch structures exist because different cases require different processing. They are usually implemented with a test instruction followed by a jump instruction, which may disturb the smooth flow of the pipeline. When the processing in each case is simple and ends with assigning a value to a register, however, the csel instruction can be considered as a way to avoid the jump. The simplest case is similar to a ternary operator, as shown below.

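    C program implementation:

        int a = 3;
        int b = 4;
        int c = (a > b) ? a : b;

    Assembly implementation:

        /* Assume that the values of a and b are stored at addr_a and addr_b, and that the value of c will be stored at addr_c. vala, valb and valc stand for registers. */
        ... ...
        ldr vala, [addr_a]
        ldr valb, [addr_b]
        cmp vala, valb
        csel valc, vala, valb, ge // valc = (vala >= valb) ? vala : valb; same result as gt, because both operands are equal when a == b
        str valc, [addr_c]
        ... ...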
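
    The idea behind csel, selecting one of two already computed values instead of jumping, can also be expressed in C without a branch. The following is a minimal sketch, not a description of how csel is implemented in hardware. The names select_u64 and max_u64 are hypothetical, and the values are treated as unsigned 64-bit registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return a when cond is true and b otherwise, using a mask instead of a jump. */
uint64_t select_u64(bool cond, uint64_t a, uint64_t b)
{
    uint64_t mask = (uint64_t)0 - (uint64_t)cond;  /* all ones if cond, else all zeros */
    return (a & mask) | (b & ~mask);
}

/* c = (a > b) ? a : b, written as compare followed by select. */
uint64_t max_u64(uint64_t a, uint64_t b)
{
    return select_u64(a > b, a, b);
}
```

    Both inputs are evaluated before the selection, so this pattern only pays off when computing each candidate value is cheap, which matches the simple-assignment case described above.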