Horizontal scaling refers to adding more instances of the application to distribute the workload across multiple servers or containers.

The following points describe the key considerations for horizontal scaling:
  • Stateless design—Multiple replicas can run safely. Ensure that all replicas use consistent configuration and the same public key.
  • No sticky sessions—Token exchange caches, if present, are small and ephemeral. This eliminates the need for session affinity.
  • Load balancer readiness—Place the application behind an L4 or L7 load balancer. Ensure that the TLS termination strategy is consistent, whether using end-to-end encryption or edge termination.