Advanced Server Access gateway high availability
Gateways use several mechanisms to stay highly available, which makes them more reliable than traditional SSH bastions. This page covers the following topics:
- Gateway load balancing
- Gateway status checks
- Configure a temporary gateway bypass
- Troubleshoot a gateway
Gateway load balancing
For projects with multiple gateways, Advanced Server Access routes each connection through a single gateway, selected at random when a client connects to a server. Random selection lets Advanced Server Access balance requests across all available gateways.
If a specific gateway becomes unavailable, Advanced Server Access may continue to route requests to that gateway for up to five minutes. During this time, requests routed to the unavailable gateway fail. After five minutes, Advanced Server Access removes the gateway from the pool and routes connections to the other available gateways.
To ensure that requests are handled properly, configure gateway load balancing the same way you would configure a web application server behind a load balancer. This generally involves the following tasks:
- Ensure that clients can connect to the load balancer (port 7234 by default).
- Ensure that the load balancer can connect to gateways (port 7234 by default).
- Ensure that gateways can connect to your target servers (port 22 by default).
- Configure every gateway to use the same AccessAddress. This must match the domain name or static IP used to access the load balancer.
Clients use the gateway address both to connect to a gateway and to validate its identity, and the same host on a different port is treated as a different address. Host certificates for multiple gateways are considered valid if the gateways share the same default address. The default address is detected using cloud instance metadata for servers hosted on Amazon Web Services (AWS) or Google Cloud Platform (GCP).
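Because every gateway behind the load balancer must advertise the same address and port, it can help to compare the configured values before putting the gateways into service. The following is a minimal sketch, not part of Advanced Server Access: the function names, gateway names, and example addresses are hypothetical, and it assumes you have already collected each gateway's AccessAddress and port yourself. It doesn't read any gateway configuration file or call any Advanced Server Access API.

```python
from collections import Counter

DEFAULT_PORT = 7234  # default gateway port named in this document


def access_endpoint(address, port=DEFAULT_PORT):
    """Normalize an address and port into one comparable endpoint string."""
    return f"{address.strip().lower()}:{port}"


def find_mismatched_gateways(gateways):
    """Return the names of gateways whose endpoint differs from the most common one.

    `gateways` maps a (hypothetical) gateway name to an (address, port) tuple.
    The same host on a different port counts as a different endpoint.
    """
    endpoints = {
        name: access_endpoint(address, port)
        for name, (address, port) in gateways.items()
    }
    if not endpoints:
        return []
    expected, _count = Counter(endpoints.values()).most_common(1)[0]
    return sorted(name for name, ep in endpoints.items() if ep != expected)


if __name__ == "__main__":
    example = {
        "gw-a": ("asa.example.com", 7234),
        "gw-b": ("asa.example.com", 7234),
        "gw-c": ("asa.example.com", 7235),  # same host, different port
    }
    print(find_mismatched_gateways(example))
```

In this example, gw-c is reported because its port differs from the port used by the other two gateways, even though the host name matches.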
After configuring gateways behind a load balancer, Okta recommends adding health checks so that the load balancer removes gateways from the pool when they become unhealthy or unreachable. For example:
- Ensure that gateways are listening on the correct port (7234 by default).
- Ensure that gateways have sufficient space to store logs.
- Ensure that gateway CPU and memory usage are within normal ranges.
See the documentation for your specific cloud provider or load balancer for additional information on how to best implement health checks for your platform and tools.
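The health checks listed above can be sketched as a small script that a load balancer or monitoring agent runs on the gateway host. This is an illustration under stated assumptions, not an Okta-provided tool: the function names, the log directory, and the thresholds are hypothetical, and the load check uses the Unix load average as a rough stand-in for CPU usage. A memory check is omitted because the Python standard library has no portable call for it.

```python
import os
import shutil
import socket


def port_is_listening(host, port=7234, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has at least min_free_bytes free."""
    return shutil.disk_usage(path).free >= min_free_bytes


def load_is_normal(max_load_per_cpu=1.0):
    """Return True if the 1-minute load average per CPU is at or under the threshold.

    Uses os.getloadavg(), which is available on Unix-like systems only.
    """
    load_1min, _load_5min, _load_15min = os.getloadavg()
    return load_1min / (os.cpu_count() or 1) <= max_load_per_cpu


def gateway_is_healthy(host="127.0.0.1", port=7234,
                       log_dir="/var/log", min_free_bytes=1 << 30):
    """Combine the checks. The log directory and 1 GiB threshold are assumptions."""
    return (
        port_is_listening(host, port)
        and has_free_space(log_dir, min_free_bytes)
        and load_is_normal()
    )


if __name__ == "__main__":
    # Exit code 0 means healthy, 1 means unhealthy.
    raise SystemExit(0 if gateway_is_healthy() else 1)
```

Most load balancers can treat a nonzero exit code or a failed TCP probe as an unhealthy signal. Adjust the path and thresholds to match where your gateways actually write logs.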
Gateway status checks
Gateways automatically report their health status to Advanced Server Access every two minutes. Advanced Server Access uses this status to decide where to route traffic when users attempt to connect to a server.
If Advanced Server Access doesn't receive a status report from a specific gateway for more than five minutes, the gateway is removed from the pool and connections are sent to other available gateways. If Advanced Server Access doesn't receive a status report from any gateway for five minutes, connections are sent to the gateway that reported most recently. If Advanced Server Access doesn't receive a status report from any gateway for 24 hours, all connection attempts fail and you may need to perform additional configuration or troubleshooting.
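The routing rules above can be modeled as a single selection function. This is a simplified sketch of the behavior as this page describes it, not the actual Advanced Server Access implementation: the function name, the input format (a mapping of gateway name to last report time in seconds), and the exact boundary handling are assumptions.

```python
import random

REMOVE_AFTER = 5 * 60       # seconds without a report before a gateway leaves the pool
FAIL_AFTER = 24 * 60 * 60   # seconds without any report before all connections fail


def select_gateway(last_report, now, rng=random):
    """Pick a gateway name from each gateway's last report time.

    Returns None when no gateway qualifies, which models a failed connection.
    """
    if not last_report:
        return None
    ages = {name: now - reported_at for name, reported_at in last_report.items()}

    # Rule 1: choose at random among gateways that reported within five minutes.
    healthy = sorted(name for name, age in ages.items() if age <= REMOVE_AFTER)
    if healthy:
        return rng.choice(healthy)

    # Rule 2: no recent reports, so fall back to the most recent reporter.
    most_recent = min(ages, key=ages.get)
    if ages[most_recent] < FAIL_AFTER:
        return most_recent

    # Rule 3: nothing has reported for 24 hours, so the connection fails.
    return None


if __name__ == "__main__":
    now = 1_000_000
    reports = {"gw-a": now - 900, "gw-b": now - 600}
    print(select_gateway(reports, now))
```

In the usage example, neither gateway has reported within five minutes, so the function falls back to gw-b, which reported most recently.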
You can find the most recent health data by viewing the gateway details from the Advanced Server Access dashboard.
Configure a temporary gateway bypass
In the unlikely event that every gateway is inaccessible, you can configure server access that bypasses gateways but still allows restricted and audited connections. Complete the following steps in order:
- Create a temporary Advanced Server Access group containing only users who need temporary access.
- Identify the project where the server is enrolled and remove every group.
- Add the temporary group to the project.
The new group is synchronized to servers. This process may take several minutes, after which time members of the temporary group can access the server. After the incident is resolved, you must manually restore the original groups.
Session capture isn't possible for connections that bypass the gateway. However, user and connection time information is still recorded to the Advanced Server Access audit log.
Troubleshoot a gateway
Teams can troubleshoot gateways by installing the Advanced Server Access server agent and enrolling the gateway in a project that doesn't require a gateway. Provided the server is active and accessible, users who belong to the associated project can connect to the gateway through SSH.