Principles

In the traditional three-stage compiler design, the frontend parses the source code, reports errors in it, and builds an abstract syntax tree (AST), from which it generates an intermediate representation (IR). The optimizer restructures and optimizes the logic of the IR. The backend generates machine code for the target environment and performs link-time optimization. The resulting program is linked against static libraries at build time and loads shared objects (.so libraries) such as libc at run time.

Figure 1 Three-stage compiler architecture
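
The three stages can be illustrated with a toy pipeline. The following is a minimal sketch, not how GCC or LLVM are implemented: the function names (`frontend`, `optimize`, `backend`), the tuple-based IR, and the pseudo-assembly output are all assumptions made for illustration. Python's standard `ast` module stands in for a real parser, the optimizer performs only constant folding, and the backend targets a hypothetical register machine.

```python
import ast
import operator

# Supported operators: AST node type -> IR opcode (hypothetical opcode names).
OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}
# IR opcode -> Python function used by the optimizer to fold constants.
FOLD = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def frontend(source):
    """Parse an arithmetic expression and lower its AST to three-address IR.

    Each IR instruction is a tuple (op, dest, arg1, arg2).
    """
    tree = ast.parse(source, mode="eval")  # raises SyntaxError on bad input
    ir = []
    counter = 0

    def lower(node):
        nonlocal counter
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = lower(node.left)
            right = lower(node.right)
            dest = f"t{counter}"
            counter += 1
            ir.append((OPS[type(node.op)], dest, left, right))
            return dest
        raise SyntaxError("unsupported construct")

    ir.append(("ret", None, lower(tree.body), None))
    return ir


def optimize(ir):
    """Constant folding: evaluate instructions whose operands are both known."""
    known = {}  # temporary name -> folded integer value
    out = []
    for op, dest, a, b in ir:
        a = known.get(a, a)
        b = known.get(b, b)
        if op in FOLD and isinstance(a, int) and isinstance(b, int):
            known[dest] = FOLD[op](a, b)  # instruction disappears from the IR
        else:
            out.append((op, dest, a, b))
    return out


def backend(ir):
    """Emit pseudo-assembly text for a hypothetical register machine."""
    lines = []
    for op, dest, a, b in ir:
        if op == "ret":
            lines.append(f"ret {a}")
        else:
            lines.append(f"{op} {dest}, {a}, {b}")
    return lines


if __name__ == "__main__":
    for line in backend(optimize(frontend("x * (2 + 3)"))):
        print(line)
```

Each stage communicates with the next only through the IR list, so any one of the three functions could be replaced without touching the other two.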

GCC is designed as a monolithic compiler: there is no clear boundary between its frontend, IR, and backend. Its components are tightly coupled and difficult to develop independently, and much of the information produced during compilation cannot be reused by other programs. LLVM inherits the traditional three-stage design but standardizes the input and output interfaces and data of the optimizer. That is, the frontends of different languages parse their source code and generate IR that follows the same syntax rules. After optimization, the common IR is passed to different backends, which generate object code for their targets. Because the number of target platforms is limited, the set of backends is relatively stable, and the IR format that frontends must emit is fixed. As a result, developing a compiler for a new language on top of LLVM mostly requires writing a new frontend. In addition, the existing frontends serve as reference implementations, which has helped the LLVM ecosystem grow. Compared with GCC, LLVM generally offers faster compilation, better performance of the generated program, and friendlier compilation error messages.
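
The benefit of a shared IR can be sketched in a few lines. The example below is an illustration under stated assumptions rather than LLVM's actual API: the stack-style IR, the two frontends (`frontend_rpn`, `frontend_infix`), and the two backends (`backend_stack`, `backend_register`) are hypothetical names invented here. It shows that two different input syntaxes can lower to the same IR, and that either backend can consume that IR without knowing which frontend produced it.

```python
import ast

# Token -> IR opcode for the postfix (RPN) frontend.
SYMBOLS = {"+": "add", "*": "mul"}
# AST operator node -> IR opcode for the infix frontend.
AST_OPS = {ast.Add: "add", ast.Mult: "mul"}


def frontend_rpn(src):
    """Frontend 1: postfix notation such as '2 3 + 4 *'."""
    ir = []
    for tok in src.split():
        if tok in SYMBOLS:
            ir.append((SYMBOLS[tok],))
        else:
            ir.append(("push", int(tok)))
    return ir


def frontend_infix(src):
    """Frontend 2: infix notation such as '(2 + 3) * 4'."""
    ir = []

    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            ir.append(("push", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in AST_OPS:
            walk(node.left)
            walk(node.right)
            ir.append((AST_OPS[type(node.op)],))
        else:
            raise SyntaxError("unsupported construct")

    walk(ast.parse(src, mode="eval").body)
    return ir


def backend_stack(ir):
    """Backend 1: mnemonics for a hypothetical stack machine."""
    return [" ".join(str(part) for part in ins).upper() for ins in ir]


def backend_register(ir):
    """Backend 2: pseudo-assembly for a hypothetical register machine."""
    lines, stack, next_reg = [], [], 0
    for ins in ir:
        if ins[0] == "push":
            reg = f"r{next_reg}"
            next_reg += 1
            lines.append(f"mov {reg}, #{ins[1]}")
            stack.append(reg)
        else:
            rhs = stack.pop()
            lhs = stack.pop()
            lines.append(f"{ins[0]} {lhs}, {lhs}, {rhs}")
            stack.append(lhs)  # result stays in the left-hand register
    return lines


if __name__ == "__main__":
    ir = frontend_infix("(2 + 3) * 4")
    assert ir == frontend_rpn("2 3 + 4 *")  # both frontends agree on the IR
    print(backend_stack(ir))
    print(backend_register(ir))
```

With this structure, supporting a new language means adding one frontend, and supporting a new target means adding one backend, instead of writing a separate compiler for every language and target pair.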