I’m running a three-node Docker Swarm cluster with one manager and two workers, and I’ve been gradually migrating my services over to it. I’m now at the point where I want to set up a Traefik reverse proxy instance inside the swarm. I currently run two standalone Traefik instances on separate Docker hosts: one for internal services, and a second that handles external services coming in via a Cloudflare Tunnel in my DMZ network. It’s the internal instance I’m migrating to the swarm cluster, which means deciding how to handle cluster ingress. There are a few options available:

  1. The built-in swarm ingress mesh
  2. External Reverse Proxy / Load Balancer
  3. Reverse Proxy Swarm Service

Swarm Ingress Options

The built-in ingress mesh is the default publishing mode. Ports are published on every node and Docker routes traffic to the correct container, even if it is running on a different node. I’m running a few services this way by pointing their DNS records at the manager node.

In an external reverse proxy configuration, the reverse proxy runs outside of the swarm with backends that point to the published ports. You can configure multiple backends for each service, one per swarm node, so once your container is up and running somewhere in the swarm there is a path for traffic to reach it. My external reverse proxy instances also manage my SSL certificates, so any service that requires SSL gets added to one of them.

Running the reverse proxy as a swarm service is my preferred method. Traefik, my reverse proxy of choice, is swarm aware and supports using labels for routing: adding Docker labels to a swarm service is all that is required for it to show up in the Traefik configuration.
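To give an idea of what that looks like, here is a minimal sketch of a stack file, assuming Traefik v3’s swarm provider; the service name, router name, and hostname are placeholders rather than anything from my actual setup:

```yaml
# Hypothetical stack file: the service is exposed through Traefik purely via labels.
# Service name, router name, and hostname are placeholders.
services:
  whoami:
    image: traefik/whoami
    networks:
      - proxy
    deploy:
      labels:   # swarm services read Traefik labels from deploy.labels, not container labels
        - "traefik.enable=true"
        - "traefik.http.routers.whoami.rule=Host(`whoami.home.example`)"
        - "traefik.http.services.whoami.loadbalancer.server.port=80"

networks:
  proxy:
    external: true   # overlay network that the Traefik service is also attached to
```

Deploy it with docker stack deploy and Traefik picks up the router from the labels, provided Traefik shares that overlay network.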

One of my planned upgrades is promoting the two workers to managers, which will result in a three-manager swarm cluster. That raises the question of how to route traffic between swarm nodes. Currently swarm1 takes all the traffic, and if I needed to perform maintenance on swarm1, all my services would be unavailable unless I repointed DNS to another swarm node. There are a couple of ways to handle this.

  1. External load balancer
  2. Floating Virtual IP address
  3. Reverse Proxy with BGP/ARP-based external IP announcement

An external load balancer lets you point at all the swarm nodes; when one becomes unhealthy it is removed from the pool and your services keep working via the remaining healthy nodes. I have done this in the past with HAProxy running on my pfSense router, and Traefik can do this as well.

You can also deploy a floating virtual IP with Keepalived. You point your DNS records at the VIP, and if the preferred node becomes unhealthy the VIP moves to a secondary swarm node with minimal disruption. I have done this in the past with two Traefik instances running on separate Docker hosts: I could reboot the preferred host and the VIP would move over to the secondary instance. It worked really well for me.

I also ran Kubernetes with MetalLB in my homelab for a time. MetalLB ran inside the cluster with a pool of IP addresses that sat inside my network address space but outside of my DHCP pool. You could assign an IP address from that pool to a service, point the service’s DNS record at it, and the service could run from any node in the cluster. That was my favorite thing about Kubernetes, but I ditched it because I found it overly complicated for my homelab needs. I was asking an AI whether there is an equivalent to MetalLB for swarm, and it made the following recommendation.
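For reference, the Keepalived side of that VIP setup boils down to something like this on the preferred node; the interface, router ID, password, and addresses here are illustrative, not my old config:

```
# Hypothetical /etc/keepalived/keepalived.conf on the preferred node.
# Interface, VRID, password, and addresses are examples only.
vrrp_instance TRAEFIK_VIP {
    state MASTER            # the secondary node uses BACKUP and a lower priority
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme
    }
    virtual_ipaddress {
        192.168.1.50/24
    }
}
```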

ARP BGP

FRR will run on each swarm manager with host networking. The host networking part is important, since the BGP sessions and the announced loopback addresses need to live in the host’s network namespace rather than inside a container’s. Each swarm manager is configured as a BGP peer of my OPNsense router.
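To give an idea of the FRR side, here is a minimal sketch of the BGP portion of a manager’s frr.conf. The AS numbers, addresses, and prefix list are placeholders, and filtering a `redistribute connected` with a route-map is just one way to limit what gets announced, not necessarily what my config does:

```
! Hypothetical frr.conf fragment on one swarm manager.
! AS numbers, addresses, and the prefix list are examples only.
router bgp 65001
 bgp router-id 192.168.1.11
 ! peer with the OPNsense router
 neighbor 192.168.1.1 remote-as 65000
 address-family ipv4 unicast
  ! announce only the loopback /32s the controller adds
  redistribute connected route-map LB-IPS
 exit-address-family
!
! the pool reserved for service IPs, outside the DHCP range
ip prefix-list LB-POOL seq 10 permit 192.168.50.0/24 le 32
!
route-map LB-IPS permit 10
 match ip address prefix-list LB-POOL
```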

There is a controller script that talks to the Docker socket and monitors swarm services (a rough sketch of such a script follows below).

  • For any service with the label lb.ip that has a task running on that manager
    • the IP is announced as a route
    • the address is added to the lo interface
    • the route is pushed to OPNsense with the swarm manager as the gateway

Router

  • When the service is removed from that swarm manager
    • the address is withdrawn from the BGP announcement
    • the address is removed from the manager’s loopback interface
    • the route is removed from OPNsense

Router
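The actual controller script isn’t reproduced here (ChatGPT wrote it), but the core logic boils down to something like the sketch below. The Docker SDK for Python, the simple poll loop, and relying on FRR’s redistribute connected to pick up the loopback /32s are simplifying assumptions, not the real implementation:

```python
#!/usr/bin/env python3
"""Rough sketch of an 'lb.ip' controller (illustrative, not the real script).

Assumptions: the Docker SDK for Python is installed, the script runs as root
on the manager itself (host networking), and FRR announces whatever /32s end
up on the loopback via `redistribute connected`.
"""
import socket
import subprocess
import time

import docker  # pip install docker

LABEL = "lb.ip"
HOSTNAME = socket.gethostname()  # assumed to match the swarm node hostname


def desired_ips(client: docker.DockerClient) -> set[str]:
    """Return the lb.ip addresses of services with a running task on this node."""
    ips: set[str] = set()
    for service in client.services.list():
        ip = service.attrs["Spec"].get("Labels", {}).get(LABEL)
        if not ip:
            continue
        for task in service.tasks(filters={"desired-state": "running"}):
            node_id = task.get("NodeID")
            if not node_id:
                continue
            node = client.nodes.get(node_id)
            if node.attrs["Description"]["Hostname"] == HOSTNAME:
                ips.add(ip)
                break
    return ips


def sync(client: docker.DockerClient, announced: set[str]) -> set[str]:
    """Add or remove loopback /32s so they match the services running here."""
    wanted = desired_ips(client)
    for ip in wanted - announced:
        subprocess.run(["ip", "addr", "add", f"{ip}/32", "dev", "lo"], check=False)
    for ip in announced - wanted:
        subprocess.run(["ip", "addr", "del", f"{ip}/32", "dev", "lo"], check=False)
    return wanted


if __name__ == "__main__":
    client = docker.from_env()
    announced: set[str] = set()
    while True:
        announced = sync(client, announced)
        time.sleep(10)
```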

I’ve been POCing the setup on my one manager node with a single instance of Traefik, and it is working nicely. The best part is that ChatGPT wrote the controller script from the context I gave it. Apart from some issues with mismatched FRR versions on OPNsense and the swarm node that prevented BGP peering, and with getting FRR to use Docker host networking, the entire setup was completed in two evenings.
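A sketch of how the Traefik service itself can be deployed for this, with the lb.ip label attached: the image tag, address, entrypoints, and the choice of host-mode published ports are illustrative choices (again assuming Traefik v3), not the exact POC config:

```yaml
# Hypothetical stack file for the Traefik POC service.
# The lb.ip value, network name, and entrypoints are examples only.
services:
  traefik:
    image: traefik:v3.1
    command:
      - "--providers.swarm.endpoint=unix:///var/run/docker.sock"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
    ports:
      - target: 80
        published: 80
        mode: host          # bypass the ingress mesh; traffic to the announced IP lands on this node
      - target: 443
        published: 443
        mode: host
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - proxy
    deploy:
      placement:
        constraints:
          - node.role == manager   # keep it on a manager, where the socket and FRR peering live
      labels:
        - "lb.ip=192.168.50.10"    # the controller announces this address via BGP

networks:
  proxy:
    external: true
```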

Traceroute

The setup was a lot of fun for me, especially as someone who finds BGP intimidating. The router was running FRR version 10.4.1 while the swarm node was on FRR version 8. When I upgraded the swarm node’s FRR to 10.4.1, I asked ChatGPT to update the config for the new version and it explained all the changes.

Generative AI is game changing.

The next step is to promote the workers to managers.