# Capacity planning and sizing

Determining the capacity that your Okta Access Gateway implementation requires is crucial to achieving the performance you expect.

For concepts on capacity planning and sizing, see About Access Gateway capacity planning and sizing.

## Estimating access rates

Average access rates represent a general lower bound on how many accesses a given instance of Access Gateway needs to support. We can estimate average access rates by looking at sets of users that access the system.

To estimate the average access rate, determine:

• Total users - How many users does this instance serve in total? Total users represents all users who might ever access the gateway.
• Estimated daily users - The percentage of users who actually use an application in a given day.
• Estimated daily accesses - The number of times a given user accesses an application in a given day.
• Page accesses per sign in - For a given set of authenticated users, how many page accesses are expected during a single session?

With these concepts in mind, we can estimate an average authentication rate as:

Average users = Total users * Estimated daily users

Average accesses = Average users * Estimated daily accesses

We can then extrapolate overall accesses as:

Overall accesses = Average accesses * Page accesses per sign in

For example, consider three groups of users, each accessing the system, but at different levels.

• Frequent users - Frequent users access the system regularly, typically multiple times per day.
• Infrequent users - Infrequent users access the system on occasion but with a much lower frequency.
• Rare users - Rare users access the system a maximum of 1-3 times a week.

For example:

Assume the total number of users as 10,000.

Frequent users = 10,000 * the percentage of frequent users

If we assume that 50% of users are frequent users, then we have a baseline of 5,000.

Frequent users typically access the system at least 5 times per day. We can calculate frequent accesses as:

Frequent users * accesses per day = 5,000 * 5, or 25,000.

Infrequent users are defined as those that access the system 2-3 times a day and represent another 25% of the user base. For this estimate, assume three accesses per day.

Infrequent users * accesses per day = 10,000 * 25% = 2,500 infrequent users, each of which accesses the system three times, for a total of 7,500 accesses per day.

Rare users represent the remaining 25% of the user base. These users access the system a maximum of once a day, but typically only access the system every several days.

Rare accesses = 2,500 * 1 * 0.5 (total rare users * accesses per day * rarity of access, assuming once every other day), for a total of 1,250 accesses per day.

We can then estimate total daily accesses as:

• Frequent accesses: 25,000
• Infrequent accesses: 7,500
• Rare accesses: 1,250
• Total accesses per day: 33,750
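The worked example above can be reproduced with a short script. This is a minimal sketch, not part of Access Gateway: the `UserGroup` type, its field names, and the `daily_accesses` helper are hypothetical names introduced for illustration, and the group shares and access counts are the assumed values from the example.

```python
from dataclasses import dataclass


@dataclass
class UserGroup:
    """A group of users that share an access pattern (illustrative type)."""
    name: str
    share_of_users: float              # fraction of total users in this group
    accesses_per_day: float            # accesses per user on an active day
    active_day_fraction: float = 1.0   # fraction of days the user is active


def daily_accesses(total_users: int, group: UserGroup) -> float:
    """Estimate the accesses per day contributed by one group."""
    users = total_users * group.share_of_users
    return users * group.accesses_per_day * group.active_day_fraction


TOTAL_USERS = 10_000

# Assumed values taken from the example in the text.
groups = [
    UserGroup("frequent", 0.50, 5),
    UserGroup("infrequent", 0.25, 3),
    UserGroup("rare", 0.25, 1, active_day_fraction=0.5),
]

per_group = {g.name: daily_accesses(TOTAL_USERS, g) for g in groups}
total = sum(per_group.values())

for name, accesses in per_group.items():
    print(f"{name}: {accesses:,.0f} accesses/day")
print(f"total: {total:,.0f} accesses/day")
```

With these inputs the script arrives at the same 33,750 accesses per day. Adjust the shares and access counts to match your own user population.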

## Sizing

When calculating sizing, consider the following areas:

• Cores - Total number of CPUs/cores.
• RAM - Typical memory requirements.
• Storage - How much disk space is required? Primarily for logging purposes.
• NIC - What are throughput requirements?

Terms and definitions:

| Term | Description |
| --- | --- |
| Application | A reference to an application as defined by the Access Gateway Admin UI console and listed on the Applications page. |
| Authentication | Authentication, or AuthN, is the process of establishing identity. Authentication occurs the first time a user accesses a given application within Access Gateway. Authentication may also occur at other times when accessing application resources. |
| Authorization | Authorization, or AuthZ, is the process of determining access rights to a given page or resource. Authorization occurs every time a user attempts to access an application resource such as a page. |
| Session | Access Gateway session, or simply session, refers to the information maintained and used by Access Gateway. Typically this includes all the information in a traditional HTTP(S) session as well as Access Gateway specific session data, such as attributes and possibly Kerberos tickets (when in use). |

### Memory Sizing

Access Gateway appliance memory use is divided into:

• OS, Access Gateway engine, and micro-services, with 1.5GB considered as the minimum for production environments.
• Cached Sessions: 128MB minimum.

Since OS, core Access Gateway, and micro-services memory is fixed, determining memory requirements is primarily focused on cache session sizing.

To determine memory size examine:

• Total sessions - The maximum number of in memory sessions at any given time.
• Average session size - The average expected size of any given session.

Total sessions are calculated using:

• Number of users
• Percentage of user sign-in events per day
• Applications accessed

Total sessions = Number of users * Percentage of sign-in events per day * Applications accessed

Session size is a function of:

• Application session and application attributes, with a default size of ~1024B
• Kerberos tickets (where applicable), ~1024B, but often larger based on the number of IIS applications accessed

Session cache then becomes:

• Session cache = Total sessions * (average session size * 2)

For example:

Web Application - Session Cache

| Users | Percentage sign-in events/day | Applications accessed | Total sessions (users * % sign-in events/day * applications accessed) | Session size | Session cache |
| --- | --- | --- | --- | --- | --- |
| 5,000 | 50% | 5 | 12,500 | 1024B*2 | ~25MB |
| 10,000 | 75% | 10 | 75,000 | 1024B*2 | ~150MB |
| 25,000 | 50% | 100 | 1,250,000 | 1024B*2 | ~2.5GB |

Kerberos Apps - Reserved

| Users | Percentage sign-in events/day | IIS applications accessed | Total sessions (users * % sign-in events/day * applications accessed) | Session size | Session cache |
| --- | --- | --- | --- | --- | --- |
| 10,000 | 50% | 5 | 25,000 | 1024B*2 | ~50MB |

Total appliance memory should then include at least 1.5GB for fixed requirements, plus the session cache and any Kerberos requirements.
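The memory calculation above can be sketched as follows. This is an illustrative estimate only: the function names are hypothetical, the 1024B default session size comes from the text, and treating the 128MB minimum as a floor on the combined cache is an assumption rather than documented behavior.

```python
FIXED_MEMORY_BYTES = int(1.5 * 1024**3)   # OS, engine, and micro-services (1.5GB)
MIN_SESSION_CACHE_BYTES = 128 * 1024**2   # 128MB minimum session cache


def total_sessions(users: int, sign_in_fraction: float, apps_accessed: int) -> int:
    """Total sessions = users * % sign-in events per day * applications accessed."""
    return round(users * sign_in_fraction * apps_accessed)


def session_cache_bytes(sessions: int, avg_session_bytes: int = 1024) -> int:
    """Session cache = total sessions * (average session size * 2)."""
    return sessions * avg_session_bytes * 2


def required_memory_bytes(web_cache: int, kerberos_cache: int = 0) -> int:
    """Fixed memory plus session caches.

    Assumption: the 128MB minimum acts as a floor on the combined cache.
    """
    cache = max(web_cache + kerberos_cache, MIN_SESSION_CACHE_BYTES)
    return FIXED_MEMORY_BYTES + cache


# Example: 10,000 users, 75% signing in per day, 10 applications each,
# plus a Kerberos workload of 10,000 users, 50%, 5 IIS applications.
web_sessions = total_sessions(10_000, 0.75, 10)
web_cache = session_cache_bytes(web_sessions)
kerberos_cache = session_cache_bytes(total_sessions(10_000, 0.50, 5))
total_memory = required_memory_bytes(web_cache, kerberos_cache)

print(f"web sessions: {web_sessions:,}")
print(f"web cache: {web_cache / 1e6:.0f}MB, Kerberos cache: {kerberos_cache / 1e6:.0f}MB")
print(f"estimated memory: {total_memory / 1024**3:.2f}GB")
```

Treat the result as a lower bound and round up to the next available memory size, because real session sizes often exceed the 1024B default.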

Session considerations:

• Sessions are cleared using a Least Recently Used (LRU) algorithm.
When the cache is full and a new session is created, the oldest idle session is removed.
• The Session Monitoring logger raises alerts for cache near-full and cache-full conditions.
You can find statistics in the management console.
Consider increasing appliance memory to reduce cache-full situations.
• Always size for peak session usage rather than average usage.
For example, consider periods of heavy load, such as the time of year when employees are enrolled, or the sign-in events that cluster in the morning and after lunch.

### Hard Disk Sizing

Overview

Access Gateway requires hard disk space for software, system logs and log archives, and backups.
Disk use comprises:

• Software, including operating system, and Access Gateway.
• Backups, performed nightly and retained for 30 days.
• System log output, spooled to local disk, and including Audit, Access and All Log files.
• Log archives, maintained for 30 days, rolled and compressed.

Software and backup size requirements are typically small, which makes log size the primary consideration.

Log Entries

Log entries primarily contain session information, Authentication (AuthN), and Authorization (AuthZ) content. Typically, there is one entry per HTTP(S) request.
To correctly size a disk, the number and size of these entries over a given time period must be determined.
To determine the volume of log entries, you must consider the number of system users, the number of times users access applications (AuthN entries), and the number of subsequent page views for each application (AuthZ events).

Overall, the composition of a log entry is based on:

• Session - Each log entry includes Access Gateway session information.
• AuthN - Authentication audit and logging information.
• AuthZ - Authorization log information, including resource accessed and policy rules.

From a disk use perspective, the other important question is the size of each of these components.

Examples

Let's look at an example.
Assume a user base of 10,000 users, of which 75% access the system per day, and each user accesses 10 applications on average. In this case, you can determine average daily accesses as:

• Active Users = total users * access percentage
10,000 * 75% = 7500 active users.

Each user accessing 10 applications:

• Accesses per day = Active Users * applications accessed
7500 * 10 = 75,000 accesses per day.

Assuming that each access requires a session, authentication, and authorization, you can then determine an estimated log entry size as:

• Log entry size = Access Gateway session size + AuthN size + AuthZ size + some small formatting overhead.

For a more concrete example, let's assume Access Gateway session, AuthN, and AuthZ sizes are roughly the same at ~1024B each. Then, each access would require approximately 3KB.

Assuming that access is more or less evenly distributed over the course of 24 hours, there are approximately 75,000/24 or 3,125 accesses per hour.

If each access measures 3KB in size, then the hourly growth of a log becomes 3KB * 3,125 or 9.6MB/hour, or approximately 230MB a day. Assuming a consistent access pattern, you would need ~7,000MB of disk per month for logs and log-related content.

A reasonable rule is to allocate twice expected consumption plus additional overhead space for software updates, configuration, and backups. In the given example, there would be a disk requirement of approximately 14GB plus 10-20% or roughly 17-20GB/month.
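The disk estimate above can be sketched as a small calculation. The function and parameter names here are hypothetical; the 3KB entry size, the 30-day window, the 2x rule, and the 20% overhead are the assumed values from the example.

```python
def monthly_log_bytes(
    total_users: int,
    daily_active_fraction: float,
    apps_per_user: int,
    entry_bytes: int = 3 * 1024,   # session + AuthN + AuthZ, ~1024B each
    days: int = 30,
) -> float:
    """Estimate log growth over a retention window, one entry per access."""
    accesses_per_day = total_users * daily_active_fraction * apps_per_user
    return accesses_per_day * entry_bytes * days


def recommended_disk_bytes(
    log_bytes: float,
    safety_factor: float = 2.0,       # allocate twice the expected consumption
    overhead_fraction: float = 0.20,  # software updates, configuration, backups
) -> float:
    """Apply the 2x rule plus overhead to an estimated log volume."""
    return log_bytes * safety_factor * (1 + overhead_fraction)


# Example from the text: 10,000 users, 75% active per day, 10 applications each.
monthly = monthly_log_bytes(10_000, 0.75, 10)
recommended = recommended_disk_bytes(monthly)

print(f"log growth: {monthly / 1e9:.1f}GB/month")
print(f"recommended disk for logs: {recommended / 1e9:.1f}GB")
```

The sketch counts one log entry per application access. If users view many pages per application, multiply by the expected page views, because every HTTP request produces log entries.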

Hard Disk considerations:

• Monitor logger alerts on low disk. To avoid low disk warnings, size for maximum or peak requests. The check runs hourly and raises a warning at 70% and an alert at 90%.
• Every HTTP request results in audit and access logs.
• Faster disk IO improves throughput.
• Session size affects audit logging, because authorization and audit logs contain session contents.
• Don't be conservative in hard disk sizing. Allocate 2x the estimated disk requirements so that bursts and large page requests do not result in low disk warnings.

### CPU Sizing

The Access Gateway engine autoscales across CPUs, which results in a worker per CPU. Each additional core results in an additional thread allowing for additional processing.

CPU considerations

• More CPU/Cores will improve capacity.

• Network throughput is typically the bottleneck.

### Throughput Sizing

Throughput is a direct function of AuthN, AuthZ, and return content.

• AuthNs = SAML assertions processed (from Okta to each application).
• AuthZs = Policy check per HTTP request (all HTTP requests).

Assuming the following values (see logs for actual values):

| AuthN bytes | AuthZ bytes | Returned data |
| --- | --- | --- |
| 1024B | 1024B | 2048B |

Network throughput becomes a function of:

• AuthZ requests/second
• Average returned data size

Assume that the dominant factor in network throughput is the amount of data returned per request, and that an average response is ~20KB. At 500 requests per second, the required throughput becomes 20KB * 500, or 10MB/s.

Simplifying:

• Average network bandwidth = Average response size * Average request arrival rate.

Network Requirements

| Requests/second | AuthN size | AuthZ size | Returned data | Total |
| --- | --- | --- | --- | --- |
| 500 | 1024B | 1024B | 2048B | ~2MB/s (~16Mbps) |

Exact timing information can be found in the AuthN and All logs. Total time to perform a request and return data is also tracked.
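The bandwidth estimate in this section can be sketched as follows. This is a rough illustration with hypothetical function names. It assumes decimal units (1KB = 1,000 bytes), the 20KB average response and 500 requests per second from the example, and a single 1 Gbps NIC.

```python
def bandwidth_bytes_per_second(requests_per_second: float, avg_response_bytes: float) -> float:
    """Average bandwidth = average response size * average request arrival rate."""
    return requests_per_second * avg_response_bytes


def to_megabits_per_second(bytes_per_second: float) -> float:
    """Convert bytes/second to megabits/second (decimal units)."""
    return bytes_per_second * 8 / 1_000_000


# Example from the text: 500 requests/second with ~20KB average responses.
required = bandwidth_bytes_per_second(500, 20_000)
required_mbps = to_megabits_per_second(required)

NIC_MBPS = 1_000  # assumed single 1 Gbps NIC
utilization = required_mbps / NIC_MBPS

print(f"required throughput: {required / 1e6:.0f}MB/s ({required_mbps:.0f}Mbps)")
print(f"NIC utilization: {utilization:.0%}")
```

Converting to megabits per second makes the result directly comparable with NIC ratings, which are normally quoted in bits rather than bytes.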

## Instance Sizing

Consider the following table when sizing instances:

| Use | Physical/virtual hardware | AWS equivalent |
| --- | --- | --- |
| Proof of concept | 1 instance with 2 cores, 2GB memory, 220GB (default) HD, and a single 1 Gbps NIC | t2.medium |
| Small | 2 instances, each with 2 cores, 4GB memory, 220GB HD, and a single 1 Gbps NIC | t2.medium |
| Medium | 3 instances, each with 2 cores, 8GB memory, 500GB HD, and a single 1 Gbps NIC | m4.large |
| Large | 3 instances, each with 4 cores, 16GB memory, 500GB HD, and a single 1 Gbps NIC | m4.xlarge |

See AWS Instance Types.

## Scaling

Scaling is the process of increasing or decreasing an Access Gateway cluster size.

Clusters can be:

• Scaled vertically - Adding or removing memory, disk or CPU from a given instance.
• Scaled horizontally - Adding or removing Access Gateway instances from a cluster.

Okta recommends defining all Access Gateway high availability cluster members similarly with the same CPU, memory, and disk configurations.

When examining cluster performance, consider the following:

• For a given instance, the best performance increases can be made by adding CPU or using solid state disk.
• To improve overall cluster throughput, Okta recommends horizontal scaling, that is, adding Access Gateway instances.
For example, for a two-node cluster that handles 1500 requests, you can double the capacity by adding two additional nodes with the same CPU, memory, and disk configuration.
• In general, horizontal scaling is linear due to Access Gateway's use of sticky sessions (session affinity). Access Gateway does not share sessions between nodes.

Capacity may be limited by other factors not related to Access Gateway, such as network throughput or the back-end application performance.
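Under the linear scaling described above, the number of nodes needed for a target load can be estimated with a simple calculation. This is a sketch under that assumption only: `nodes_required` is a hypothetical helper, and the per-node capacity is derived from the two-node, 1500-request example rather than from measured data.

```python
import math


def nodes_required(target_requests: float, requests_per_node: float) -> int:
    """Estimate cluster size, assuming capacity grows linearly with node count."""
    if requests_per_node <= 0:
        raise ValueError("requests_per_node must be positive")
    return math.ceil(target_requests / requests_per_node)


# Example from the text: a two-node cluster handles 1500 requests,
# so each identically sized node is assumed to handle 750.
per_node = 1500 / 2

doubled = nodes_required(3000, per_node)      # double the original capacity
with_margin = nodes_required(3100, per_node)  # slightly more than double

print(f"nodes for 3000 requests: {doubled}")
print(f"nodes for 3100 requests: {with_margin}")
```

The estimate rounds up to whole nodes and ignores limits outside Access Gateway, such as network throughput or back-end application performance, so validate it with load testing.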