How Node.js handles concurrency

Node.js is a JavaScript runtime environment that lets you run JavaScript on the server, that is, outside the browser. JavaScript execution is single-threaded, so to understand how Node.js handles concurrency you need to understand a few key components and how they work together, as explained below.

Key components

[Diagram: key components of Node.js]

As shown in the diagram above, the four main components in Node.js are the call stack, the event loop, the callback queue (there are several types of callback queues, which are explained later), and the Node.js core APIs.

CALL STACK

The call stack, as the name suggests, is a stack data structure that works on the LIFO (last in, first out) principle. A function is pushed onto the call stack when it starts executing and is removed from the stack once its execution completes. Let's understand this using the example below.

function one() { two(); console.log(1); }
function two() { three(); console.log(2); }
function three() { console.log(3); }

one();

To work out what the above code prints, it helps to visualize the call stack as shown in the diagram below.

[Diagram: call stack while one(), two() and three() execute]

A function is pushed onto the call stack when it starts executing. The first function pushed is one(). Its first line is a call to two(), so two() is pushed next, and two() immediately calls three(), which lands on top of the stack. The function at the top of the stack is always the one currently executing, so 3 is printed first. three() has nothing else to do, so it completes and is popped off the stack. Now two() is at the top: it prints 2 and is popped. Finally one() prints 1 and is popped, leaving the stack empty. The final output is 3 2 1.
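One way to watch the stack grow and shrink is to record when each function is entered and when it returns. The sketch below is only an illustration: the traced helper and the events array are hypothetical names made up for this example, not part of Node.js.

```javascript
// Hypothetical helper: wraps a function so that entering and leaving it is recorded.
const events = [];

function traced(name, fn) {
  return function (...args) {
    events.push(`push ${name}`);   // the function goes on top of the call stack
    try {
      return fn(...args);
    } finally {
      events.push(`pop ${name}`);  // the function has finished and leaves the stack
    }
  };
}

// Same shape as one(), two() and three() above, with logging recorded instead of printed.
const three = traced('three', () => { events.push('log 3'); });
const two = traced('two', () => { three(); events.push('log 2'); });
const one = traced('one', () => { two(); events.push('log 1'); });

one();
console.log(events.join('\n'));
```

The recorded events show three pushes in a row, then each function logging and popping in reverse order, which matches the 3 2 1 output.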

The call stack illustrates the single-threaded nature of JavaScript: only the function at the top of the stack is executing, and any other code has to wait until it finishes. To handle concurrency, Node.js therefore relies on the other components.
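A small experiment can make the waiting visible. The sketch below uses setTimeout() (covered in the next section) to request a callback after 10 ms and then keeps the call stack busy for about 100 ms. The variable names and the two durations are arbitrary choices for this illustration.

```javascript
// A timer is requested for 10 ms, but a synchronous loop keeps the call stack busy for ~100 ms.
const start = Date.now();
let firedAfterMs = null;

setTimeout(() => {
  firedAfterMs = Date.now() - start;
  console.log(`timer callback ran after ${firedAfterMs} ms`);
}, 10);

// Busy-wait: nothing else can run on the JavaScript thread during this loop.
while (Date.now() - start < 100) { /* spin */ }
console.log('synchronous work finished');
```

The callback cannot run until the loop ends, so the reported delay is at least 100 ms rather than the requested 10 ms.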

CONCURRENCY

Let's understand how concurrency is handled using the example below.

//asynchronous function
setTimeout(() => {
    console.log(0)
}, 1000);
function one() { console.log(1); }
console.log(2);
one();

In the first line, setTimeout() is called, so it is pushed onto the call stack. It is an asynchronous function whose work is handled by the timers module of the Node.js core API, so it registers the timer there and is popped off the call stack immediately. Execution of our code continues and prints 2; since console.log(2) is not wrapped in any function, it runs in the global context. Next, one() is called: it is pushed onto the call stack, prints 1, and is popped off.

Meanwhile the timer waits for 1 second (1000 ms). Once the delay has elapsed, the callback containing console.log(0) is placed in a callback queue, in this case the timer queue. For the callback to run it must get back onto the call stack, and this is where the event loop comes into the picture. The event loop moves a waiting callback from the queue to the call stack only when the call stack is empty. Once the callback is on the call stack it executes, prints 0, and is popped off. So the final output is 2 1 0.
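The same ordering holds even when the delay is 0 ms, because the callback still has to go through the timer queue and wait for the call stack to empty. A minimal sketch, where the order array is just a name chosen for this illustration:

```javascript
const order = [];

setTimeout(() => order.push('timer callback'), 0); // 0 ms delay, still asynchronous
order.push('global code');

function one() { order.push('one()'); }
one();

// Print the recorded order once the first timer has had a chance to run.
setTimeout(() => console.log(order.join(' -> ')), 10);
// prints "global code -> one() -> timer callback"
```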

In the above example the only asynchronous function is setTimeout(), but Node.js has various core APIs that handle different types of asynchronous operations. To understand them, we need to understand the different queues and phases of the Node.js event loop.

EVENT LOOP PHASES

[Diagram: event loop phases]

Each iteration of the event loop runs these phases in the order listed below and then starts over. (Node.js also has an idle/prepare phase between pending callbacks and poll, but it is only used internally.)

Timers

Callbacks scheduled by setTimeout() and setInterval() whose delay has elapsed get executed in this phase.

Pending callbacks

Callbacks for some system operations that were deferred to the next loop iteration get executed in this phase, such as certain TCP errors. For example, ECONNREFUSED when a TCP socket connection fails.

Poll

New I/O events are retrieved and I/O callbacks get executed in this phase.

Check

setImmediate() callbacks get executed in this phase.

Close callbacks

Callbacks for close events, such as socket.on('close', ...), get executed in this phase.
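The phase order has a visible consequence: because the check phase comes right after the poll phase, a setImmediate() callback scheduled from inside an I/O callback runs before a 0 ms timer scheduled at the same place. The sketch below assumes CommonJS and uses fs.stat('.') only as a convenient asynchronous I/O call; the order array is a name chosen for this illustration.

```javascript
const fs = require('fs');

const order = [];

// fs.stat is asynchronous; its callback runs in the poll phase.
fs.stat('.', () => {
  setTimeout(() => {
    order.push('timeout');
    console.log(order.join(' -> ')); // prints "immediate -> timeout"
  }, 0);
  setImmediate(() => order.push('immediate'));
});
```

When both calls are made from the main script instead of an I/O callback, their relative order is not guaranteed, which is why the example schedules them from inside one.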

In addition to the above phases, there are two special microtask queues, which take priority over all of the phases: they are drained as soon as the currently running JavaScript finishes, before the event loop moves on. The first microtask queue holds callbacks registered with process.nextTick(), and the second holds promise callbacks. Of the two, the process.nextTick() queue has the higher priority.
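To see these two queues being drained ahead of the regular phases, the sketch below schedules two setImmediate() callbacks and has the first one queue a promise callback and a process.nextTick() callback. It assumes Node.js 11 or later running as CommonJS; the order array is a name chosen for this illustration.

```javascript
const order = [];

setImmediate(() => {
  order.push('immediate 1');
  // Both of these are queued while the first check-phase callback is running.
  Promise.resolve().then(() => order.push('promise'));
  process.nextTick(() => order.push('nextTick'));
});

setImmediate(() => {
  order.push('immediate 2');
  console.log(order.join(' -> '));
  // prints "immediate 1 -> nextTick -> promise -> immediate 2"
});
```

Even though the promise callback is registered first, the process.nextTick() callback runs before it, and both run before the second setImmediate() callback.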

Let's consider the example below to understand the order in which the queues are processed.

const fs = require('fs');
function one() { console.log(1); }
Promise.resolve().then(() => console.log(2));
fs.readFile(__filename, () => { console.log(3) });
setTimeout(() => {
    console.log(4)
}, 1000);
process.nextTick(() => console.log(5));
setImmediate(() => console.log(6));
console.log(7);
one();

First 7 is printed, as it is synchronous code in the global context. Next, one() is called and prints 1. Once the synchronous code has finished, the microtask queues are drained: as explained above, the process.nextTick() queue has the highest priority, so its callback runs and prints 5, followed by the promise queue, which prints 2.

The event loop then starts cycling through its phases. Although the fs.readFile() callback belongs to the poll phase, it is not queued yet because reading the file takes some time, so it will run in a later poll phase. The setTimeout() callback is not due either, as its 1000 ms delay has not elapsed. There are no pending or close callbacks, so the only callback that is ready is the setImmediate() one, which runs in the check phase and prints 6.

In a later cycle, once the file has been read, the poll phase runs the readFile callback and prints 3. Finally, after 1000 ms have passed, the timers phase runs the setTimeout() callback and prints 4. So the final output is 7 1 5 2 6 3 4. (The relative order of 6 and 3 depends on the file read taking longer than the first loop iteration, which is normally the case.)

NODE.JS ARCHITECTURE

[Diagram: Node.js architecture]

In the above architecture diagram, the Node.js core API is written in JavaScript, while the remaining internal components are written in C and C++. Although the JavaScript engine is single-threaded, Node.js is multi-threaded internally. libuv provides the event loop and maintains a thread pool for operations that cannot be done asynchronously at the OS level, such as file system I/O, and for CPU-intensive operations such as some crypto functions. The thread pool size defaults to 4 and can be configured with the UV_THREADPOOL_SIZE environment variable. V8 is Chrome's JavaScript engine, which executes the JavaScript code.
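The effect of the thread pool can be observed from JavaScript. In the sketch below, crypto.pbkdf2() does its CPU-heavy work on a thread pool worker while the event loop keeps running setImmediate() callbacks on the main thread. The password, salt, iteration count, and the tick counter are arbitrary values and names chosen for this illustration.

```javascript
const crypto = require('crypto');

let ticks = 0;
let finished = false;

// Keeps rescheduling itself on the event loop until the key derivation is done.
function tick() {
  if (finished) return;
  ticks += 1;
  setImmediate(tick);
}

const done = new Promise((resolve, reject) => {
  // The asynchronous pbkdf2 runs on a libuv thread pool worker, not on the JavaScript thread.
  crypto.pbkdf2('example-password', 'example-salt', 200000, 64, 'sha512', (err, key) => {
    finished = true;
    if (err) return reject(err);
    console.log(`derived ${key.length} bytes; event loop ticked ${ticks} times meanwhile`);
    resolve({ ticks, keyLength: key.length });
  });
});

setImmediate(tick);
```

A non-zero tick count shows that the JavaScript thread stayed free while the key was being derived. Starting node with UV_THREADPOOL_SIZE set to a different number changes how many such operations can run in parallel.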

REFERENCES

nodejs.org/en/docs/guides