Understanding V8's Architecture and Optimizing for the V8 JavaScript Engine
The V8 engine is the open-source JavaScript engine that powers Google Chrome and Node.js. Written in C++, it essentially translates your JavaScript code into machine code, which your computer can then understand and execute rapidly. To optimize your code for V8, you’d generally want to write code that allows V8 to perform optimizations effectively.
This article will explore how V8 works and what considerations are important when writing JavaScript code for optimal V8 performance.
Two Distinct Parts of V8
V8 employs a two-tiered compilation strategy. Initially, the JavaScript source code is parsed into an Abstract Syntax Tree (AST). This AST is then fed into Ignition, V8’s interpreter, which generates bytecode. The bytecode is executed, and during execution, Ignition collects profiling data (e.g., function call counts, type feedback).
Concurrently, or shortly thereafter, TurboFan, V8’s optimizing compiler, leverages this profiling data to perform speculative optimization. Based on the observed types and control flow, TurboFan generates highly optimized machine code.
Ignition (The Interpreter):
Think of Ignition as a quick and reliable translator. It takes your JavaScript code and quickly converts it into an intermediate bytecode. This bytecode is then executed. Ignition gets the code running fast initially.
TurboFan (The Optimizing Compiler):
When TurboFan notices that certain parts of your code are being run frequently, it kicks in to analyze these “hot” sections. It then takes the bytecode and aggressively optimizes it, generating highly efficient machine code. This optimized code replaces the interpreter’s work for those hot sections, making your program run much faster over time. Some of the techniques used to optimize code are:
- Inline Caching: Caching the results of property lookups and method calls based on the receiver object’s Hidden Class (we’ll get to this next).
- Control Flow Graph (CFG) and Static Single Assignment (SSA) representation: Enabling advanced optimizations like dead code elimination, loop unrolling, and instruction scheduling.
- Type Feedback: Using runtime type information to specialize generated machine code for specific types, leading to faster operations.
If the assumptions made during TurboFan’s optimization become invalid (e.g., due to a change in object structure or data types), the optimized code is deoptimized, and execution falls back to the interpreter.
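The optimize/deoptimize cycle can be sketched with a small, hypothetical example (exact behavior varies by V8 version; in Node, the real `--trace-opt` and `--trace-deopt` V8 flags expose these decisions):

```javascript
// Hypothetical sketch: a hot function that V8 may optimize for numbers,
// then deoptimize when its type assumptions break.
function add(a, b) {
  return a + b;
}

// Warm up: many calls with the same types let Ignition gather type
// feedback and give TurboFan a reason to speculate that `add` adds numbers.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total += add(i, 1);
}

// A call with different types can invalidate that speculative machine
// code, forcing a fall back to bytecode (deoptimization).
const mixed = add("1", 2); // string + number → "12"

console.log(total, mixed);
```

Running this as `node --trace-opt --trace-deopt script.js` may print lines marking `add` as optimized and, after the string call, deoptimized; the exact output depends on the V8 version.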
Visualization of V8 Compilation Process.
```mermaid
flowchart TD
A[("JavaScript Code")] --> B["Parsing (Parser)"]
B --> C["Abstract Syntax Tree (AST)"]
C --> D["Interpreter (Ignition)"]
D --> E["Bytecode Execution"]
D --> F["Profiling (Runtime Data)"]
F --> G["Optimizing Compiler (TurboFan)"]
G --> H["Optimized Machine Code"]
H --> I["Execution (CPU)"]
E --> I
I --> J{Is Optimization Valid?}
J -- No --> K["Deoptimization (Fallback to Bytecode)"]
J -- Yes --> I
%% Styling
classDef default fill:#2a4365,stroke:#90cdf4,color:#fff,stroke-width:2px;
classDef parser fill:#4c51bf,stroke:#b2f5ea;
classDef interpreter fill:#9f7aea,stroke:#fbb6ce;
classDef turbofan fill:#dd6b20,stroke:#fbd38d;
classDef bytecode fill:#38b2ac,stroke:#b2f5ea;
classDef optimized fill:#c53030,stroke:#fed7d7;
class B parser;
class D interpreter;
class G turbofan;
class E bytecode;
class H optimized;
%% Annotations
note1["📜 **Parsing**: Converts JS code into an AST"]:::noteStyle
note2["⚡ **Ignition**: Generates & executes bytecode"]:::noteStyle
note3["🚀 **TurboFan**: Optimizes hot code into machine code"]:::noteStyle
note4["🔄 Deoptimizes if assumptions fail"]:::noteStyle
%% Notes styling
classDef noteStyle fill:#1a365d,stroke:#4299e1,color:#fff,stroke-width:1px;
%% Link notes to nodes
note1 -.- B
note2 -.- D
note3 -.- G
note4 -.- K
```
Key Phases within V8
To elaborate on the compilation process, we can think of the following key phases within V8:
Parsing:
The JavaScript source code is read and parsed into an Abstract Syntax Tree (AST). This tree represents the structure of your code.
Compilation to Bytecode (Ignition):
The AST is then translated into bytecode, a more compact and machine-readable intermediate representation. This is done by the Ignition interpreter.
Execution (Ignition):
The bytecode is executed by the Ignition interpreter. During this phase, V8 collects runtime information, such as the types of variables and the flow of execution.
Optimization (TurboFan - Triggered):
Based on the profiling data gathered by Ignition, V8 identifies “hot” functions (code that is executed frequently). These hot functions are then targeted for optimization by the TurboFan compiler.
Compilation to Machine Code (TurboFan):
TurboFan takes the bytecode (and the collected type feedback) and generates highly optimized machine code specific to the architecture of your computer.
Machine Code Execution:
Subsequent executions of the optimized functions use the generated machine code, leading to significant performance improvements.
Deoptimization (Occasional):
If the assumptions made by TurboFan during optimization turn out to be incorrect (e.g., a variable’s type changes unexpectedly), V8 can “deoptimize” the code, discarding the optimized machine code and falling back to executing the bytecode with Ignition.
Hidden Classes - Core for Optimization
In dynamically typed languages like JavaScript, the structure and types of an object can change during runtime. This is a challenge for optimization: because types, memory layouts, and property locations can change, previously generated machine code can be invalidated. How can the engine efficiently access properties if it doesn’t know the object’s layout beforehand?
V8 tackles this with the concept of Hidden Classes (sometimes also referred to as “Shapes” or “Maps”). When you create an object with the same set of properties in the same order, V8 internally associates these objects with the same hidden class.
V8 tracks the structure of your objects. When objects share the same hidden class, V8 can make assumptions about the memory location (memory offset) of their properties. This allows for much faster property access because the engine doesn’t have to perform a dynamic lookup every time. It knows exactly where to go in memory to find a specific property.
To illustrate V8’s hidden classes, let’s take an example constructor of Point:
function Point(x, y) {
this.x = x;
this.y = y;
}
const p1 = new Point(1, 2);
Let’s understand the above code and the creation of hidden classes.
- When control reaches const p1 = new Point(1, 2);, V8 creates an initial hidden class for p1, let’s call it HC0.
- When this.x = x; is executed, V8 transitions p1 to a new hidden class, say HC1. HC1 describes an object with property x at a specific memory offset (e.g., offset 0). HC1 also records that it transitioned from HC0 by adding property x.
- When this.y = y; is executed, p1 transitions again to a new hidden class, HC2. HC2 describes an object with property x at offset 0 and property y at offset 4 (assuming a 4-byte value for x). HC2 also records that it transitioned from HC1 by adding property y.
Now, when the second object p2 is created:
const p2 = new Point(3, 4);
The same sequence of operations occurs. Since the properties x and y are added in the same order, p2 will also transition through HC0 → HC1 → HC2, ending up with the same hidden class HC2 as p1.
Because p1 and p2 share the same hidden class (HC2), V8 knows the exact memory layout of these objects. When you access p1.x or p2.x, the engine can directly go to the memory offset associated with x in HC2 without having to perform a costly dynamic lookup.
Now, if the properties are added in a different order or changed later, the hidden classes will not match and the optimization is lost. For example:
const p3 = {};
p3.x = 4;
p3.y = 5;
const p4 = {};
p4.y = 5;
p4.x = 4;
The transition process for p3 would be HC0 (empty) -> HC1 (has x) -> HC2 (has x and y).
The transition process for p4 would be HC0 (empty) -> HC1 (has y) -> HC2 (has y and x).
Even though they end up with the same properties, the different transition paths and potentially different final hidden classes prevent the same level of optimization for property access.
This concept of hidden classes and the transitions between them is fundamental to how V8 optimizes dynamic property access.
Inline Cache
Inline Caching is an optimization technique that V8 uses to speed up repeated calls to the same operations on objects of the same hidden class. It leverages the information learned about object structure through hidden classes.
In V8, when a piece of code accesses a property of an object for the first time, the generated machine code not only performs the access but also “remembers” the hidden class of the object and the memory offset of the property. This “remembering” is the inline cache.
The next time the same code tries to access a property on an object with the same hidden class, V8 can skip the property lookup process and directly access the value at the cached memory offset. This significantly speeds up subsequent property accesses.
However, if the hidden class of the object changes (e.g., a new property is added or the order of properties changes), the inline cache becomes invalid, and V8 has to go through the property lookup process again. This can lead to performance degradation.
function getX(point) {
return point.x;
}
const p1 = { x: 10, y: 20 };
const p2 = { x: 30, z: 40 };
getX(p1); // V8 learns the hidden class of p1 and the offset of 'x'
getX(p1); // Subsequent calls to getX with p1 are much faster due to inline caching
getX(p2); // p2 has a different hidden class (different properties).
// This might lead to a cache miss and V8 having to learn a new structure.
In the getX function, after the first call with p1, V8’s optimized code for getX will likely have an inline cache that assumes the input object has the hidden class of p1 and that the property x is at a specific memory location. When getX is called again with p1 (which has the same hidden class), this cached information can be used directly. However, when getX is called with p2 (which has a different hidden class because it has z instead of y), the inline cache might miss, and V8 might need to adapt or create a new inline cache.
Understanding hidden classes is crucial because inline caching relies heavily on the consistency of these hidden classes.
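To illustrate how an inline cache degrades as a call site sees more shapes, here is a small sketch (the specific thresholds are V8 implementation details and may change between versions):

```javascript
function getX(point) {
  return point.x;
}

// Monomorphic: one hidden class at this call site, the fastest path.
getX({ x: 1, y: 2 });
getX({ x: 3, y: 4 });

// Polymorphic: a handful of different shapes; the inline cache stores
// several hidden-class/offset pairs and checks them in turn.
getX({ x: 5, z: 6 });
getX({ x: 7, w: 8 });

// Megamorphic: once the call site has seen many distinct shapes
// (historically more than four in V8), the inline cache gives up on
// caching and falls back to a slower, generic property lookup.
for (let i = 0; i < 10; i++) {
  getX({ x: i, ["extra" + i]: i }); // a new shape on every iteration
}
```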
V8’s Garbage Collector: Young and Old Generation
V8 uses two garbage collection algorithms: the scavenging algorithm for the young generation and the mark-sweep-compact algorithm for the old generation:
Scavenging Algorithm (Young Generation):
Imagine the young generation space as a temporary playground for newly created toys (objects). This playground has two identical sections: the “from-space” where the toys are currently being used, and the “to-space” which is initially empty.
- Allocation: New toys are placed in the from-space as they are created. Allocation here is very fast, like just dropping a new toy in an open area.
- Collection Trigger: When the from-space becomes full, a garbage collection cycle is triggered. This is a stop-the-world event, meaning the JavaScript execution pauses briefly.
- Identifying Live Objects: The garbage collector looks at all the toys in the from-space and determines which ones are still being held or are reachable from the main toy storage (the roots). These are the “live” objects.
- Copying Live Objects: The live toys are then carefully moved from the from-space to the empty to-space. During this process, they might also be “promoted” to the old generation if they have survived a certain number of these scavenging cycles.
- Space Flipping: Once all the live objects have been moved, the from-space is now considered completely empty. The roles of the two spaces are then flipped: the “to-space” becomes the new from-space for the next round of allocations, and the old from-space becomes the new to-space.
This process is very efficient for collecting a large number of short-lived objects because it only involves copying the live objects, and the dead objects are simply left behind in the from-space.
Mark-Sweep-Compact Algorithm (Old Generation):
The old generation space is where toys that have lasted a long time reside. Since these objects are more persistent, a different strategy is needed for garbage collection.
- Mark Phase: The garbage collector traverses the entire object graph starting from the roots and “marks” all the objects that are still reachable. This is like putting a sticker on every toy that is still being played with or is connected to a toy that is being played with.
- Sweep Phase: After marking, the collector goes through the entire old generation space. All the unmarked objects are considered “dead,” and the memory they occupy is freed up. This is like identifying all the toys without stickers and putting them in a discard pile. This phase can lead to memory fragmentation, where there are many small, unusable gaps of free memory.
- Compact Phase (Optional but often performed): To address fragmentation, a compaction phase might be performed. This involves moving the live (marked) objects together in memory, effectively defragmenting the space and creating larger contiguous blocks of free memory. This is like rearranging the remaining toys to make more space for new ones. However, this phase can be more time-consuming as it involves moving objects and updating pointers to them.
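The generational behavior can be illustrated with a sketch: objects that die young are cheap for the scavenger, while objects that stay reachable survive collections and are eventually promoted (`process.memoryUsage()` is a real Node API; the numbers it reports vary per run):

```javascript
// Sketch: short-lived objects are cheap for the scavenger, while
// anything still referenced survives collections and is eventually
// promoted to the old generation.

const retained = []; // long-lived: these objects stay reachable

function churn() {
  for (let i = 0; i < 100000; i++) {
    // Short-lived: dead by the end of each iteration, so a scavenge
    // simply leaves them behind in from-space.
    const tmp = { i, doubled: i * 2 };
    if (tmp.i % 10000 === 0) {
      retained.push(tmp); // only 10 objects escape and stay reachable
    }
  }
}

churn();
console.log(retained.length); // 10
// In Node, process.memoryUsage().heapUsed gives a rough view of the heap.
console.log(typeof process.memoryUsage().heapUsed); // "number"
```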
Optimizing JavaScript for V8
Optimizing JavaScript for V8 (the engine behind Chrome, Node.js, and Edge) requires understanding how V8 compiles and executes code. Here are key optimization tips to make your code run faster:
- Favor Monomorphic Code (Single Shape or Hidden Class): As we studied above, V8 optimizes objects with consistent hidden classes. Adding/deleting properties dynamically creates new hidden classes, forcing V8 to deoptimize and recompile code. Avoid dynamically adding/deleting properties:
Good example:
function Point(x, y) {
this.x = x; // Consistent shape
this.y = y;
}
const p1 = new Point(1, 2);
Bad example:
const obj = {};  // hidden class HC0
obj.x = 1;       // transitions to HC1
obj.y = 2;       // transitions to HC2
obj.z = 3;       // transitions to HC3 (every addition changes the shape)
- Use Typed Arrays for Numeric Work: Favor typed arrays when you need to do numeric work; for example, Int32Array is faster than a regular array. Regular JavaScript arrays can hold any type (numbers, strings, objects), forcing V8 to handle them as polymorphic. Typed Arrays (Int32Array, Float64Array) enforce a single element type, allowing denser memory storage and SIMD (Single Instruction, Multiple Data) optimizations.
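As a sketch of the difference, a Float64Array can never become polymorphic or holey, unlike a regular array (the performance benefit is an assumption about typical V8 behavior, not a guarantee):

```javascript
const n = 1000;

// Regular array: may hold anything, so V8 must stay general.
const mixed = [1, "two", { three: 3 }];

// Typed array: fixed element type, dense contiguous storage, no holes.
const samples = new Float64Array(n);
for (let i = 0; i < n; i++) {
  samples[i] = Math.sin(i);
}

// Writes are converted to the element type rather than changing the
// array's representation: typed arrays can never become polymorphic.
samples[0] = "42"; // coerced to the number 42
console.log(samples[0]); // 42
```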
- Avoid Array Holes (Sparse Arrays): Sparse arrays force V8 to deoptimize; prefer contiguous arrays. V8 stores arrays in one of two modes: 1. fast elements and 2. dictionary mode. Holes in an array force V8 to switch to dictionary mode, which can be 10-100x slower to iterate.
Good example:
const arr = new Array(3).fill(0);
Bad example:
const arr = [4, , 6];
- Optimize Functions for Inlining: Small, simple functions are inlined by TurboFan. Avoid very large functions and functions containing try/catch in hot paths: a function whose AST grows beyond a node-count limit (historically around 600 nodes) is not inlined, and try/catch requires complex control flow.
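One common pattern, sketched below, is to keep the hot helpers small and move the try/catch into a separate wrapper so the hot code itself stays inlinable:

```javascript
// Small and simple: a good inlining candidate.
function square(x) {
  return x * x;
}

// Hot loop calls only small functions, with no try/catch in sight.
function sumOfSquares(nums) {
  let total = 0;
  for (let i = 0; i < nums.length; i++) {
    total += square(nums[i]);
  }
  return total;
}

// The try/catch lives in a separate, cold wrapper instead.
function safeSumOfSquares(nums) {
  try {
    return sumOfSquares(nums);
  } catch (e) {
    return 0;
  }
}

console.log(safeSumOfSquares([1, 2, 3])); // 14
```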
- Avoid arguments and delete: Using arguments and delete forces V8 to disable optimizations; favor rest parameters instead of arguments. arguments is a magical object that breaks scope optimizations, while delete changes an object’s shape (invalidating inline caches) and creates holes in arrays, forcing a switch to dictionary mode.
Good example:
function sum(...nums) {
// Optimizable
return nums.reduce((a, b) => a + b);
}
Bad example:
function sum() {
return Array.from(arguments).reduce((a, b) => a + b); // Slower
}
- Prefer for-of or for over forEach: V8 optimizes classic loops better than functional-style iteration. for and for-of are lower-level looping constructs and easier for TurboFan to optimize. forEach invokes a callback on every iteration, creating a call frame and scope per element, which may prevent TurboFan from inlining.
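A minimal comparison of the three styles (all compute the same result; the performance difference is V8-dependent and usually only matters in hot loops):

```javascript
const nums = [1, 2, 3, 4];

// forEach: one callback invocation (and call frame) per element.
let sumForEach = 0;
nums.forEach((n) => {
  sumForEach += n;
});

// Classic for: no per-iteration calls, the easiest for TurboFan to optimize.
let sumFor = 0;
for (let i = 0; i < nums.length; i++) {
  sumFor += nums[i];
}

// for-of: still loop-shaped and generally optimized well.
let sumForOf = 0;
for (const n of nums) {
  sumForOf += n;
}

console.log(sumForEach, sumFor, sumForOf); // 10 10 10
```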
- Use Constants for Object Keys: Reusing the same key strings improves hidden class sharing. V8 caches lookups based on string references and reusing the same key string shares hidden classes.
const KEY = "id"; // Reused
obj1[KEY] = 1; // hidden class C0
obj2[KEY] = 2; // still C0 (optimized)
- Avoid Polymorphic Operations: Functions handling multiple types (e.g., numbers + strings) get deoptimized. TurboFan generates specialized machine code per type, and mixing types forces generic, deoptimized code. For example, avoid using the same function for addition and string concatenation depending on the argument types; it’s better to use a separate function per type.
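A sketch of splitting by type, using hypothetical addNumbers/concatStrings helpers:

```javascript
// Monomorphic: always numbers, so TurboFan can specialize the `+`.
function addNumbers(a, b) {
  return a + b;
}

// Monomorphic: always strings.
function concatStrings(a, b) {
  return a + b;
}

// Polymorphic (avoid): the same `+` site sees numbers on some calls
// and strings on others, forcing generic code.
function combine(a, b) {
  return a + b;
}

console.log(addNumbers(2, 3));        // 5
console.log(concatStrings("a", "b")); // "ab"
```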
- Prefer === over ==: Avoid the type-coercing == operator; use === for equality comparisons. == performs type coercion and requires extra checks, while === skips coercion and is always at least as fast; it can reportedly be 2-5x faster in tight loops.
- Avoid eval and with: These disable scoping optimizations (with is even forbidden in strict mode). Dynamically scoped code prevents lexical scope optimizations.
- Leverage V8’s Built-in Methods: Native methods (e.g., Math.max, Array.prototype.map) are heavily optimized: written in C++ without JavaScript overhead and pre-optimized by V8.
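For example, the built-ins below replace hand-rolled loops:

```javascript
const nums = [3, 1, 4, 1, 5];

// Built-in max over spread, instead of a manual comparison loop.
const biggest = Math.max(...nums); // 5

// Built-in map, instead of pushing into an array by hand.
const doubled = nums.map((n) => n * 2); // [6, 2, 8, 2, 10]

console.log(biggest, doubled);
```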