A practical overview of how the Node.js single-threaded event loop model actually works
I have read many articles on how Node.js works internally, but I didn’t find a satisfactory answer, so I decided to write one myself with practical examples that you can actually run and play with.
I’ll primarily cover the main components of this architecture, the ‘Event Loop’ and the ‘Event Queue’, along with the concepts of ‘blocking’ and ‘non-blocking’ code.
I’ll take the example of HTTP API endpoints written in Express (the most popular Node.js framework). Towards the end, I’ll also try to explain why it is said that Node.js should not be used for CPU-intensive applications (it is directly related to what we’re going to discuss).
Event Loop: The event loop is an endless loop that waits for tasks, executes them, and then waits for more tasks. It executes tasks from the event queue only when the call stack is empty, i.e. when there is no ongoing task.
Event Queue: The event queue is the list of pending callback functions waiting to be executed. Once an asynchronous operation completes, its callback is added to the event queue, from where it is eventually picked up by the event loop.
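To make these two definitions concrete, here is a tiny sketch (my own addition, not part of the experiment below): a callback scheduled with setTimeout(..., 0) goes to the event queue, so the event loop runs it only after the synchronous code has emptied the call stack.
setTimeout(function () {
  console.log('2: callback picked from the event queue')
}, 0)
console.log('1: synchronous code still on the call stack')
// Output:
// 1: synchronous code still on the call stack
// 2: callback picked from the event queue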
Node.js single-threaded event loop architecture: As there is only a single thread, only one task can be executed at a time. A task could be anything: listening for new requests via the event loop, or executing statements. Let’s dive deep into the example.
A function that keeps looping for the given number of milliseconds (i.e. it can keep the event loop busy):
function sleep(milliseconds) {
  const date = Date.now();
  let currentDate = null;
  do {
    currentDate = Date.now();
  } while (currentDate - date < milliseconds);
}
API 1: it keeps the event loop busy for the ‘timeout’ milliseconds given with the API request:
router.get('/api1', function(req, res, next) {
  let __start_time = new Date().valueOf()
  sleep(req.query.timeout) // blocking: busy-waits for `timeout` ms
  let __end_time = new Date().valueOf()
  return res.jsonp({
    "name": "api1", 'start_time': __start_time, 'end_time': __end_time,
    'execution_time': __end_time - __start_time,
    'start_time_readable': new Date(__start_time).toISOString(),
    'end_time_readable': new Date(__end_time).toISOString(),
    'timeout': req.query.timeout
  })
})
API 2: a basic API which responds after the ‘timeout’ milliseconds given with the API request:
router.get('/api2', function(req, res, next) {
  let __start_time = new Date().valueOf()
  setTimeout(function() { // non-blocking: callback is queued after `timeout` ms
    let __end_time = new Date().valueOf()
    return res.jsonp({
      "name": "api2", 'start_time': __start_time, 'end_time': __end_time,
      'execution_time': __end_time - __start_time,
      'start_time_readable': new Date(__start_time).toISOString(),
      'end_time_readable': new Date(__end_time).toISOString(),
      'timeout': req.query.timeout
    })
  }, req.query.timeout)
})
Script to call these APIs programmatically
var request = require('request');
let async = require('async');

// Wraps the callback-style request() in a Promise
let request_p = function(options) {
  return new Promise(function(resolve, reject) {
    try {
      request(options, function(error, response) {
        if (error) return reject(error); // reject instead of throwing inside the callback
        if (typeof response.body == 'string') {
          response.body = JSON.parse(response.body)
        }
        return resolve(response.body)
      });
    } catch (e) {
      return reject(e)
    }
  })
}

// Fires both requests in parallel and logs when each one completes
let __combination = async function(options) {
  async.parallel([function(cb) {
    console.log("section 1 start")
    let options1 = {
      'method': 'GET',
      'url': options.req1
    };
    request_p(options1).then(function(__res1) {
      console.log("section 1 end", __res1)
      return cb()
    })
  }, function(cb) {
    console.log("section 2 start")
    let options2 = {
      'method': 'GET',
      'url': options.req2
    };
    request_p(options2).then(function(__res2) {
      console.log("section 2 end", __res2)
      return cb()
    })
  }], function(err, res) {
    console.log("section final")
    process.exit(0)
  })
}
Note: so many parameters are returned in the responses only to support the explanation below.
Experiment 1:
__combination({"req1":"http://localhost:3000/api1?timeout=5000","req2":"http://localhost:3000/api2?timeout=2000"})
This will call API1 first with a timeout of 5000 ms and, in parallel, API2 with a timeout of 2000 ms.
The output of Experiment 1:
section 1 start
section 2 start
section 1 end { name: 'api1',
  start_time: 1617686726344,
  end_time: 1617686731344,
  execution_time: 5000,
  start_time_readable: '2021-04-06T05:25:26.344Z',
  end_time_readable: '2021-04-06T05:25:31.344Z',
  timeout: '5000' }
section 2 end { name: 'api2',
  start_time: 1617686731382,
  end_time: 1617686733399,
  execution_time: 2017,
  start_time_readable: '2021-04-06T05:25:31.382Z',
  end_time_readable: '2021-04-06T05:25:33.399Z',
  timeout: '2000' }
section final
“section 1 start”, “section 2 start” is fine, but wait!! After that -> “section 1 end”
How? API1 was invoked with 5 s and API2 with 2 s, so ideally API2 should respond first. But it doesn’t, and that is exactly where the secret of how Node.js works internally on a single thread lies. Let’s understand this in detail:
- The invocation started with API1, so it reached the event loop first; the event loop was available and immediately started executing it.
- As the nature of this API is blocking (it is a loop running for the given number of milliseconds), the event loop stays busy executing it and is not available to take new requests for the given timeout value (5 s in this example).
- After these 5 s, the next statement is to return the response; the event loop executes it and the response is returned.
- Now the event loop is available to take new requests, and you can see that the start time of API2 (2021–04–06T05:25:31.382Z) is after the end time of API1 (2021–04–06T05:25:31.344Z).
- API2 now starts executing. As this is a waiting call, after the given timeout (2 s in this case) its callback function is placed on the event queue, and the event loop (which by then is free to take items from the event queue) dequeues this function, executes it, and returns the response (as written in the code).
- From the timeline perspective, if API1 and API2 are invoked at the 0th second in this sequence (first API1 and then API2 in parallel), then API1 responds at the 5th second and API2 responds at the 7th second. The standalone sketch below reproduces this timeline in a single script.
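As an analogy for Experiment 1 (my own sketch, reusing the sleep() helper defined above; not the original two-request setup): a blocking loop runs first and a 2 s timer is scheduled only after it finishes, so the timer callback fires at roughly the 7th second.
const start = Date.now()
sleep(5000) // “API1”: blocks the single thread for ~5 s
console.log('blocking work done after', Date.now() - start, 'ms') // ~5000
setTimeout(function () { // “API2”: the timer is only scheduled after the blocking work
  console.log('timer callback after', Date.now() - start, 'ms') // ~7000, i.e. 5 s + 2 s
}, 2000)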
Now let’s do another experiment and reverse the order of APIs
Experiment 2:
__combination({"req1":"http://localhost:3000/api2?timeout=2000","req2":"http://localhost:3000/api1?timeout=5000"})
This will call API2 first with a timeout of 2000 ms and, in parallel, API1 with a timeout of 5000 ms.
The output of Experiment 2:
section 1 start
section 2 start
section 2 end { name: 'api1',
  start_time: 1617686771983,
  end_time: 1617686776985,
  execution_time: 5002,
  start_time_readable: '2021-04-06T05:26:11.983Z',
  end_time_readable: '2021-04-06T05:26:16.985Z',
  timeout: '5000' }
section 1 end { name: 'api2',
  start_time: 1617686771983,
  end_time: 1617686776991,
  execution_time: 5008,
  start_time_readable: '2021-04-06T05:26:11.983Z',
  end_time_readable: '2021-04-06T05:26:16.991Z',
  timeout: '2000' }
section final
“section 1 start”, “section 2 start” is fine, but wait!! After that -> “section 2 end”
How? Section 1 was invoked with 2 s and section 2 with 5 s, so ideally section 1 should respond first. Let’s understand this in detail:
- The invocation started with API2, so it reached the event loop first; the event loop was available and immediately started executing it.
- As the nature of this API (API2) is non-blocking, the event loop starts its execution (scheduling the timer), is quickly free again, and is ready to take the next request.
- At this point API1 reaches the event loop and starts executing.
- As the nature of this API (API1) is blocking (it is a loop running for the given number of milliseconds), the event loop stays busy executing it and is not available to pick callbacks from the event queue for the given timeout value (5 s in this example).
- After these 5 s, the next statement is to return the response; the event loop executes it and the response for API1 is returned.
- Now the event loop is free to take items from the event queue, where the timeout callback of API2 has already been waiting for the last 3 s; it dequeues this function, executes it, and returns the response (as written in the code).
- From the timeline perspective, if API2 and API1 are invoked at the 0th second in this sequence (first API2 and then API1 in parallel), then API1 responds at the 5th second and API2 also responds at the 5th second (immediately after API1). The standalone sketch below reproduces this timeline in a single script.
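And as an analogy for Experiment 2 (again my own sketch, reusing the sleep() helper from above): the 2 s timer is scheduled first, then a blocking loop holds the event loop for 5 s, so the timer callback can only run once the blocking work is done, at roughly the 5th second.
const start = Date.now()
setTimeout(function () { // “API2”: the callback is queued after ~2 s, but has to wait
  console.log('timer callback after', Date.now() - start, 'ms') // ~5000, not 2000
}, 2000)
sleep(5000) // “API1”: keeps the event loop busy for ~5 s
console.log('blocking work done after', Date.now() - start, 'ms') // ~5000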
I hope the internal working of Node.js is now clear, and also why Node.js should not be used for CPU-intensive applications: a single long-running computation keeps the one thread busy and stalls every other request.
I’m attaching the code snippet.
That’s all for today. Thank you so much for reading!
Stay home, stay safe!