RADIUS session persistence
Use Session Persistence where possible
When deploying Okta RADIUS Server Agent with a load balancer, Okta recommends using session persistence (aka sticky sessions) based on the end-user’s VPN client or IP to optimize performance, especially in situations where waiting for user input to 2FA challenge is done off-band (e.g. Okta Verify w/ Push). The Okta RADIUS Server Agent handles de-duplication of requests from the originating RADIUS client. However, if the requests are spread between multiple agents, they are only de-duplicated at Okta service side resulting in unnecessary load for both the RADIUS Server Agents and the Okta Service; this extra load would also count against your rate limits on the Okta Service. Recommended configuration for stickiness is generally using the Calling-Station-ID combined with the Framed-IP. Calling-Station-ID for many VPNs will be the client IP address of the originating client. If a different RADIUS attribute is storing the client IP address, then configure the load balancer to use that attribute instead.
While we recommend a load balancer as it provides high availability and horizontal scale, it is possible to deploy the RADIUS Server Agent behind a load balancer without persistence, and this is still preferable to not using a load balancer at all, but readers should be aware that this model will forfeit most of the benefits of request de-duplication Okta RADIUS Server Agent performs at the agent level.
RADIUS uses connectionless UDP protocol and most clients will automatically resend requests on a periodic interval until they've received a response from the RADIUS Server Agent. If these "retries" get load balanced to different RADIUS Server Agents, each agent is going to be simultaneously doing the same work (processing the same RADIUS request), and the first one to get a response from Okta and send a response back to the client will "win".
Normally the first RADIUS Server Agent to receive the original request will be the first to respond, because it will make the call to Okta and get the response back before a retry from the client is ever issued. However, when using Okta Verify with Push factor, the RADIUS Server Agent which receives the request will sit and poll Okta until the user confirms/denies the push request on their phone. During this time, the RADIUS client is likely to send retries of the same push MFA request. In this scenario, if the retries are sent to the same RADIUS Server Agent, then the agent is smart enough to recognize those as duplicate packets and drop them immediately. But if the retry is load balanced to a different RADIUS Server Agent, that agent will process the request as a net new request and initiate the push notification again. In order to minimize the effects of this behavior, Okta recommends that you set the RADIUS client retry interval to 30 seconds or higher if you deploy in a load-balanced environment that does not support stickiness. This generally gives the end-user enough time to receive the push notification and respond to it before the RADIUS client starts sending retries.
Another possibility for race conditions (in the absence of load balancer persistence) is if a particular RADIUS Server Agent becomes backlogged with a large queue of requests. This can happen if there are not enough worker threads configured on the agent, or if those threads are all consumed by long-running requests such as Okta Verify with Push or slow responses from the Okta service such as where the Okta service has to round-trip back into your on-prem Active Directory agent in order to authenticate the user, and then respond back to the RADIUS Server Agent which then has to respond back to the RADIUS client. In this case, retries again are a concern because if they are load-balanced to other agents, it depends on which agent gets around to processing the request first. Again this is generally safe no matter which agent "wins", but it will be harder to debug the system as a whole.