Understanding Compiler Types: Compilation, Interpretation, JIT, and Hybrid Approaches
In the overview of the compiler design article, we explored why understanding compilers is essential for every developer. A good foundation in compiler design techniques helps build a deeper understanding of how programming languages work. In this article, we will discuss the different types of compilers.
Broadly, there are two ways a programming language can be translated:
- Compilation
- Interpretation
Over time, advancements in compiler design have led to approaches that combine both techniques:
- Just-In-Time (JIT) Compilation
- Hybrid Compilation
Compilers
In compilation, a compiler reads the entire source code and converts it into machine code. For example, when you write a C program, the C compiler generates an executable file as output. This executable can then be run directly on the machine where it was compiled, or on another machine with the same architecture and operating system. However, if you try to run it on a completely different system, it may not work because of differences in the underlying architecture (e.g., instruction sets, types of registers, register sizes, etc.).
In reality, compilation happens in several steps, as discussed earlier. At a high level, these steps are lexical analysis, syntax analysis (parsing), optimization, code generation, assembly, and linking.
flowchart LR
A[/Source Code/] --> B[Lexical Analysis]
B[Lexical Analysis] --> C[Syntax Analysis]
C[Syntax Analysis] --> D[Optimization]
D[Optimization] --> E[Code Generation]
E[Code Generation] --> F[Assembly, Linking]
F[Assembly, Linking] --> G([Machine Code])
These steps can be roughly divided into two parts: a frontend and a backend. The backend is tightly coupled to the machine architecture; it is the part where machine code is generated for the target hardware.
flowchart LR
A[/Source Code/] --> B[Frontend]
B[Frontend] --> C[Backend]
C[Backend] --> D([Machine Code])
The frontend is more loosely coupled: it does not have to deal with any particular hardware. Its main purpose is to produce intermediate artifacts that help the backend generate machine code quickly and efficiently. This is the phase that produces the token stream, the symbol table, the syntax tree, and so on.
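To make the frontend artifacts more concrete, here is a minimal sketch of a tokenizer and a recursive-descent parser for a tiny arithmetic language. The language, the function names (`tokenize`, `parse`), and the tuple-based tree shape are all assumptions made for illustration; real compiler frontends are far more elaborate.

```python
import re


def tokenize(source):
    """Lexical analysis: split raw text into (kind, value) tokens."""
    tokens = []
    # Match either a run of digits or any single non-whitespace character.
    for text in re.findall(r"\d+|\S", source):
        if text.isdigit():
            tokens.append(("NUM", int(text)))
        elif text in "+-*/()":
            tokens.append(("OP", text))
        else:
            raise SyntaxError(f"unexpected character: {text!r}")
    return tokens


def parse(tokens):
    """Syntax analysis: build a syntax tree out of nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expression():  # handles + and - (lowest precedence)
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():  # handles * and / (higher precedence)
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():  # handles numbers and parenthesized expressions
        kind, value = advance()
        if kind == "NUM":
            return ("num", value)
        if (kind, value) == ("OP", "("):
            node = expression()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token: {value!r}")

    tree = expression()
    if peek()[0] != "EOF":
        raise SyntaxError("unexpected trailing input")
    return tree


print(parse(tokenize("1 + 2 * 3")))
# ('+', ('num', 1), ('*', ('num', 2), ('num', 3)))
```

Nothing in this sketch depends on the target machine, which is what makes the frontend reusable across different backends.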
Interpreters
Interpreted languages, on the other hand, are not tied to a particular platform, but they generally run slower than compiled languages. An interpreter diverges from a compiler mainly after the optimization phase. By that point, the syntax tree has already been built and the syntactic correctness of the program has been established. From there, instead of generating machine code, the interpreter walks the syntax tree (or the optimized intermediate code) and executes its instructions one by one.
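As a rough illustration of this idea, the sketch below evaluates a syntax tree directly, without ever producing machine code. The tree shape (nested tuples such as `("num", 1)` and `("+", left, right)`) and the `evaluate` function are assumptions chosen for this example, not the layout of any real interpreter.

```python
import operator

# Hypothetical mapping from operator symbols in the tree to Python functions.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(node):
    """Walk the syntax tree and compute its value node by node."""
    if node[0] == "num":
        return node[1]
    op, left, right = node
    # Evaluate both children first, then apply the operator to their values.
    return OPERATIONS[op](evaluate(left), evaluate(right))


# Syntax tree for the expression 1 + 2 * 3, written out by hand.
tree = ("+", ("num", 1), ("*", ("num", 2), ("num", 3)))
print(evaluate(tree))  # 7
```

Every call to `evaluate` repeats the tree walk and the operator lookup, which hints at why this style of execution tends to be slower than running precompiled machine code.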
Why Interpreters Are Slower
Since interpreters execute instructions on the fly instead of compiling them beforehand, they are generally slower than compiled programs. However, they come with key advantages:
- Platform independence
- Suitable for scripting environments
Because the program is not compiled ahead of time into machine code for one specific machine, the same source can run on any platform that has an interpreter for it. Most interpreted languages are also well-suited for scripting, meaning they can be executed within a runtime host environment.
For example, JavaScript runs in the browser with the help of a host environment.
Just-In-Time Compilation (JIT)
Just-in-Time (JIT) compilation, sometimes referred to as dynamic compilation or runtime compilation, improves interpreter performance by compiling parts of the program at runtime.
A runtime with a JIT compiler profiles the program as it executes, identifies frequently executed ("hot") code blocks, compiles them into machine code, and caches the result for future executions. This improves performance by avoiding repeated interpretation of the same code.
PHP 8.0, released in November 2020, introduced JIT compilation, which can noticeably speed up CPU-bound code. V8, the JavaScript engine used in Chrome and Node.js, also relies on JIT compilation. Its Ignition interpreter converts JavaScript code into bytecode and executes it. TurboFan, V8's optimizing compiler, uses information gathered during execution to compile frequently executed parts of the code into optimized machine code.
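The "count, compile, cache" idea can be sketched in a few lines. This is a toy model under loud assumptions: `TinyJit`, `HOT_THRESHOLD`, and the tuple-based tree are hypothetical names, and the "compiled" form here is a Python function built with the built-in `compile()`, standing in for the native machine code a real JIT would emit. It does not reflect how PHP or V8 are implemented.

```python
import operator

HOT_THRESHOLD = 3  # hypothetical: interpreted runs before we compile

OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def interpret(node, x):
    """Slow path: walk the tree on every call."""
    if node[0] == "num":
        return node[1]
    if node[0] == "var":
        return x
    op, left, right = node
    return OPERATIONS[op](interpret(left, x), interpret(right, x))


def to_source(node):
    """Translate the tree into Python source text for an expression."""
    if node[0] == "num":
        return repr(node[1])
    if node[0] == "var":
        return "x"
    op, left, right = node
    return f"({to_source(left)} {op} {to_source(right)})"


def compile_tree(tree):
    """Stand-in for native code generation: build a Python function once."""
    code = compile(f"lambda x: {to_source(tree)}", "<jit>", "eval")
    return eval(code)


class TinyJit:
    def __init__(self, tree):
        self.tree = tree
        self.calls = 0        # how many times the slow path has run
        self.compiled = None  # cached fast path, filled in once the code is hot

    def run(self, x):
        if self.compiled is not None:
            return self.compiled(x)  # fast path: reuse the cached function
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self.compiled = compile_tree(self.tree)  # code is hot: compile it
        return interpret(self.tree, x)


# Tree for the expression x * x + 1.
jit = TinyJit(("+", ("*", ("var",), ("var",)), ("num", 1)))
for _ in range(5):
    result = jit.run(4)
print(result, jit.compiled is not None)  # 17 True
```

The first few calls pay the interpretation cost; once the threshold is reached, later calls skip the tree walk entirely and reuse the cached function.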
Hybrid Compilation
High-level translators often combine compiler and interpreter features to achieve fast, efficient, and robust execution while maintaining platform independence.
An example of such a system is Java:
- The Java compiler (javac) compiles Java source code into an intermediate representation called bytecode.
- The Java Virtual Machine (JVM) then loads the bytecode and executes it at runtime, initially by interpreting it.
- The JVM also employs JIT compilation to optimize frequently executed bytecode for better performance.
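The two-stage model in the list above can be sketched as a compile step that emits bytecode and a small virtual machine that executes it. Everything here is an assumption made for illustration: the opcode names (`PUSH`, `ADD`, `SUB`, `MUL`), the tuple-based tree, and both functions are hypothetical and are not real JVM bytecode or the JVM's design.

```python
# Hypothetical opcode names for a tiny stack machine (not real JVM bytecode).
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def compile_to_bytecode(node):
    """Ahead-of-time step: flatten a syntax tree into stack-machine bytecode."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    op, left, right = node
    # Emit code for both operands, then the instruction that combines them.
    return compile_to_bytecode(left) + compile_to_bytecode(right) + [(OPCODES[op], None)]


def run_bytecode(program):
    """Runtime step: a minimal virtual machine that interprets the bytecode."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":
            stack.append(argument)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Syntax tree for 1 + 2 * 3, compiled once and then executed by the VM.
bytecode = compile_to_bytecode(("+", ("num", 1), ("*", ("num", 2), ("num", 3))))
print(bytecode)
print(run_bytecode(bytecode))  # 7
```

The bytecode list is the portable artifact: it could be produced once and then run by any machine that has an implementation of `run_bytecode`, which mirrors how the platform independence of this model is usually explained.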
In this article, we discussed the differences between compilers and interpreters. We began by looking at the traditional compilation process and then discussed the trade-offs between compiling and interpreting. We also covered Just-In-Time compilation, a technique that improves interpreter performance by compiling frequently executed bytecode into machine code, as well as the hybrid model of compilation. Understanding these compiler types helps developers make informed decisions when choosing tools and techniques for software development.