How Nginx Handles Thousands of Concurrent Requests
Understanding Event-driven Non-blocking Architecture of Nginx
Traditional Approach - One Request Per Process
In traditional web servers, each incoming request is assigned its own thread (or process) so that requests can be handled concurrently. These threads waste computational resources such as memory and CPU by blocking (waiting) during network or disk I/O until the request completes. Once a request finishes, its thread is destroyed, and a new one is created when the next request comes in. This constant destruction and creation of threads costs additional CPU and memory, resulting in poor performance.
Request Flow
The server listens for new connection requests.
When a new request comes in, the server accepts it, creates a new dedicated process, and assigns the request for processing.
The process continues to wait (block) for external operations like disk or network I/O to complete. This may happen multiple times during the request's processing.
Once the request is completed, the process waits to see whether the client will send another request on the same connection (keep-alive). If the client closes the connection or a timeout occurs, the server destroys the process and goes back to listening for new connection requests. The sketch after this list shows the same flow in code.
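To make this flow concrete, below is a minimal sketch of a process-per-connection server in C, assuming a Unix-like system. The port, buffer size, and canned response are placeholders, and error handling is omitted for brevity; this illustrates the model, not production code.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);                  /* placeholder port */

    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);

    for (;;) {
        /* 1. Block until a new connection arrives. */
        int conn_fd = accept(listen_fd, NULL, NULL);

        /* 2. Create a dedicated process for this one request. */
        if (fork() == 0) {
            char buf[4096];
            /* 3. read() blocks this entire process during network I/O. */
            if (read(conn_fd, buf, sizeof(buf)) > 0) {
                const char *resp =
                    "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
                write(conn_fd, resp, strlen(resp));
            }
            /* 4. Request done: the process exits and is destroyed. */
            close(conn_fd);
            _exit(0);
        }
        close(conn_fd);                            /* parent drops its copy */
        while (waitpid(-1, NULL, WNOHANG) > 0) {}  /* reap finished children */
    }
}
```

Every concurrent client costs one process here, and each process spends most of its life blocked inside accept(), read(), or write().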
Nginx Architecture
Nginx (Engine-X) is designed with four major components:
Master Process — Responsible for loading the server configuration and creating child processes (the next three types).
Cache Loader — Loads the disk cache to memory and then exits.
Cache Manager — Runs at intervals and removes unneeded (or expired) entries from the disk cache to keep its size within the configured limit.
Worker Process — Handles network connections and performs network and disk I/O. It does all the work! Upon server start, the master process spawns multiple worker processes based on the server configuration, as the sketch after this list illustrates.
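The master/worker split can be pictured with the rough C sketch below, which assumes only POSIX fork(). The real master process also parses the configuration, opens the listening sockets, and respawns workers that die; none of that is shown here.

```c
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a worker: in Nginx, each worker runs an event loop
 * over shared listening sockets (see the epoll sketch further below). */
static void worker_loop(void) {
    for (;;) pause();
}

int main(void) {
    long workers = sysconf(_SC_NPROCESSORS_ONLN);  /* one worker per core */

    for (long i = 0; i < workers; i++) {
        if (fork() == 0) {          /* child becomes a worker */
            worker_loop();
            _exit(0);
        }
    }

    /* Master supervises: wait on (and, in real Nginx, respawn) workers. */
    for (;;) {
        if (wait(NULL) < 0) break;
    }
    return 0;
}
```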
Nginx Approach — Event-Driven
Unlike traditional servers, Nginx doesn’t create a separate process or thread for each incoming request. Instead, each worker process listens for the events generated by new incoming requests. Workers accept the connection requests and start processing.
However, instead of blocking (waiting) for network and disk I/O to finish, the worker immediately accepts another connection from the incoming pool or resumes processing requests whose disk and network I/O have completed and that are waiting for further execution.
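On Linux, this readiness-based multiplexing is typically built on epoll (Nginx also supports kqueue on BSD, among other mechanisms). The sketch below is a minimal, hypothetical epoll loop in C showing the pattern just described, not Nginx's actual code; the port, buffer size, and canned response are placeholders, and error handling is trimmed.

```c
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

static void set_nonblocking(int fd) {
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
}

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);                /* placeholder port */
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);
    set_nonblocking(listen_fd);

    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event ready[64];
    for (;;) {
        /* One process multiplexes many connections: sleep until
         * *something* is ready instead of blocking on one socket. */
        int n = epoll_wait(ep, ready, 64, -1);
        for (int i = 0; i < n; i++) {
            int fd = ready[i].data.fd;
            if (fd == listen_fd) {
                /* New connection: accept it and watch it for readability. */
                int conn = accept(listen_fd, NULL, NULL);
                if (conn < 0) continue;
                set_nonblocking(conn);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = conn };
                epoll_ctl(ep, EPOLL_CTL_ADD, conn, &cev);
            } else {
                /* A connection's I/O has completed: handle it now,
                 * then move on to the next ready event. */
                char buf[4096];
                if (read(fd, buf, sizeof(buf)) <= 0) { close(fd); continue; }
                const char *resp =
                    "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
                write(fd, resp, strlen(resp));
                close(fd);
            }
        }
    }
}
```

The worker never sits idle waiting on a single slow client; it only touches connections the kernel has already reported as ready.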
Why is this Faster?
In traditional servers, where a process is created per connection request, every process needs CPU time, and the operating system shares the CPU among them through context switches. A context switch saves the state of the currently running process so it can be restored later, then loads the state of another process to run it. This switching, combined with the overhead and resources required to create a process for each connection request, drags down the server's overall performance.
This isn’t the case with Nginx, as a fixed number of worker processes (typically one per CPU core) handles all the incoming requests. This reduces context switches, since there are only as many workers as cores and each worker can be pinned to its own core. As previously mentioned, this also avoids the additional resource consumption and overhead of per-request process creation, because the worker processes are created once, at server start. This allows Nginx to handle hundreds of thousands of connections per worker process.
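To put rough, illustrative numbers on it: with a thread per connection and a typical 1 MB default stack, 10,000 concurrent connections would reserve on the order of 10 GB of stack address space and give the scheduler 10,000 contexts to juggle. An event-driven server on an 8-core machine keeps those same 10,000 connections inside 8 long-lived workers, so memory stays roughly flat and switching is largely confined to the pinned workers. These figures are back-of-the-envelope assumptions, not measurements.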
References
Inside NGINX: How we designed for performance & scale. (2015, June 10). NGINX Community Blog. https://blog.nginx.org/blog/inside-nginx-how-we-designed-for-performance-scale
The architecture of open source applications (Volume 2): nginx. (n.d.). https://aosabook.org/en/v2/nginx.html