Josh Choo's blog

DNS load balancing

12 September 2021

I've been learning how to implement load balancing using NGINX and Docker Compose. At first, I opted to deploy my backend as several separate services and point NGINX at all of them using the upstream directive. It looked something like this:

# docker-compose.yml
version: "3"
services:
  load_balancer:
    image: "nginx:1"
    restart: "always"
    ports:
      - "8080:80"
    volumes:
      - ./load_balancer/nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - "server1"
      - "server2"
      - "server3"
  server1:
    build: "./server"
    expose:
      - "9000"
  server2:
    build: "./server"
    expose:
      - "9000"
  server3:
    build: "./server"
    expose:
      - "9000"
events {}

http {
  upstream app-server {
    # Map to running containers
    server server1:9000;
    server server2:9000;
    server server3:9000;
  }

  server {
    listen 80;

    location / {
      proxy_set_header Host $host;
      proxy_pass http://app-server;
    }
  }
}
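To try this out (a quick sketch; it assumes the ./server image serves HTTP on port 9000, which isn't shown here), we can bring everything up and hit the load balancer on its published port:

# build the backend image and start the load balancer plus the three servers
$ docker-compose up --build -d

# requests to port 8080 are proxied to one of the three backends
$ curl -i localhost:8080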

Replicas

Scaling up and down would be laborious because we would need to modify both the docker-compose.yml and nginx.conf files to add and remove servers.

It turns out that Docker Compose provides a more convenient way to deploy multiple instances of a single service and scale them dynamically. We can instead define the number of replicas for a service, and simplify our NGINX config to reverse-proxy requests to a single endpoint:

# docker-compose.yml
version: "3"
services:
  load_balancer:
    image: "nginx:1"
    restart: "always"
    ports:
      - "8080:80"
    volumes:
      - ./load_balancer/nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - "server"
  server:
    build: "./server"
    expose:
      - "9000"
    deploy:
      # 'deploy.replicas' is honoured by `docker compose up` (Compose v2);
      # older docker-compose releases may need the --compatibility flag
      replicas: 3

# load_balancer/nginx.conf
events {}

http {
  server {
    listen 80;

    location / {
      proxy_set_header Host $host;
      proxy_pass http://server:9000;
    }
  }
}

Resolving Domains

How does NGINX determine which server instance to reverse-proxy a request to?

When we use the upstream directive, NGINX defaults to round-robin load balancing, which distributes requests evenly between the servers.
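As an illustrative sketch (not part of the original setup), that even distribution can also be skewed with per-server weight parameters in the upstream block; a server with weight=2 receives roughly twice as many requests as the others:

upstream app-server {
  # round robin is the default; weights skew the distribution
  server server1:9000 weight=2;
  server server2:9000;
  server server3:9000;
}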

But in the replicas example above, we only configured NGINX to reverse-proxy requests to http://server:9000. How does NGINX know how many server instances we have deployed, where they are located, and how to send requests to them?

When NGINX wants to forward a request to http://server:9000, it needs the server's IP address, so it queries the default DNS server specified in /etc/resolv.conf. In this example the DNS server is 127.0.0.11, which is Docker's embedded DNS server. Since Docker knows the IP addresses of all the server instances, it provides them to NGINX as a list of A Records (assuming we use IPv4). We can inspect these records by running tools like nslookup or dig inside any of the Docker containers:

$ nslookup server
Server:		127.0.0.11
Address:	127.0.0.11:53

Non-authoritative answer:
*** Can't find server: No answer

Non-authoritative answer:
Name:	server
Address: 172.19.0.2
Name:	server
Address: 172.19.0.4
Name:	server
Address: 172.19.0.3
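
The same records can be fetched with dig (assuming it is installed in the container); it automatically uses the 127.0.0.11 nameserver from /etc/resolv.conf, and the order of the addresses may vary:

$ dig +short server
172.19.0.2
172.19.0.4
172.19.0.3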

Using the above A Records, NGINX can reverse-proxy requests to 172.19.0.2:9000, 172.19.0.3:9000 and 172.19.0.4:9000 using round-robin load balancing.

Scaling and Service Discovery

Using the replicas approach, we can scale the number of server instances with the following command (the --scale flag overrides the replicas value in docker-compose.yml). No need to modify docker-compose.yml or nginx.conf!

$ docker-compose up --scale server=10

However, an important gotcha is that NGINX resolves and caches the A Records returned by Docker's DNS server when it first starts up, and (with the configuration above) never queries the DNS server again. Consequently, NGINX wouldn't know if we scaled the number of server instances up or down, and it would keep using stale IP addresses. This is the problem of service discovery.

One solution is to reload the NGINX configuration after scaling so that NGINX will re-fetch the latest DNS A Records:

$ docker-compose exec load_balancer nginx -s reload

Not very convenient though. Imagine having to reload configuration manually whenever we scale up or down :S

We could instead configure NGINX using the resolver directive to re-resolve the domain name when the Time-To-Live (TTL) expires:

events {}

http {
  server {
    listen 80;

    location / {
      # use Docker's embedded DNS server for name resolution
      resolver 127.0.0.11;
      # no URI part on the variable, so the original request URI is passed upstream
      set $backends http://server:9000;

      proxy_set_header Host $host;
      # using a variable with proxy_pass will instruct NGINX to re-resolve the domain name when the Time-To-Live (TTL) expires.
      proxy_pass $backends;
    }
  }
}
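As a rough check (assuming, as before, that the backend answers HTTP on port 9000), we can now scale without touching the NGINX configuration at all:

# scale the backend; NGINX picks up the new A Records once the TTL expires
$ docker-compose up -d --scale server=5

# requests keep flowing through the load balancer, no reload required
$ curl -s -o /dev/null -w "%{http_code}\n" localhost:8080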

Much more convenient! I wonder why this behaviour isn't the default.