
Performance and Stress Testing of aiScaler Setups

Many website and system admins are the type of people who only believe things when they see them for themselves. This is especially true when it comes to performance testing: one of the very first things folks want to do when evaluating anything that claims to improve website performance is to put those claims to the test. This appendix points out some basics of such stress and performance testing.

Enable Caching

Not much improvement can be seen unless you let aiScaler do what it is designed to do: cache content. Make sure you define some patterns with non-zero TTLs. A very simple pattern that enables caching of every single URL for 60 seconds could be:

/ simple 60

Be aware that the first request for any given URL to a cold aiScaler instance results in a cache miss, and that request has to go all the way to the origin server to be filled. Subsequent requests, for 60 seconds in our example, are served from cache.

Obviously, you wouldn’t want to go to production with a cache-all pattern, but it is useful for testing purposes.

Tools Used to Generate a Simple Load Test

You can use Apache Benchmark, or “ab” – it is normally installed along with the Apache Web server and can therefore be found, free of charge, on almost any Linux system. To test a URL with 10 concurrent load threads, for a total of 2000 requests, you’d run something like this:

ab -c 10 -n 2000 http://a.b.com/a_url.html

The report that ab produces is rather self-explanatory. You want to compare the before-and-after numbers for requests per second (RPS), time per request, and time to first byte. ab is also the easiest tool to use for quick testing.
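For example, a minimal before-and-after comparison might look like this, assuming the origin server also answers directly on port 8080 of the same hostname (the port and URL here are assumptions, not aiScaler defaults):

# baseline: hit the origin server directly, bypassing aiScaler
ab -c 10 -n 2000 http://a.b.com:8080/a_url.html
# the same test through aiScaler
ab -c 10 -n 2000 http://a.b.com/a_url.html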

A more evolved benchmarking tool is JMeter. It is capable of running in a master-slave configuration, with the main instance controlling a number of slave load generators. It has a flexible reporting API, with a number of plugins available that can graph all kinds of stats in real time. You can create fairly evolved test scripts – crafting custom URLs based on values read from files, etc. Refer to the JMeter manual for more information.
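As a minimal sketch, a headless (non-GUI) distributed run might look like this; the test plan file name and slave hostnames are placeholders:

# run test_plan.jmx without the GUI, driving two remote slave generators
jmeter -n -t test_plan.jmx -R slave1,slave2 -l results.jtl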

Beware of Network Limitations

Let’s say you have two servers: one running aiScaler and the other used to generate the load. Make sure you understand the network connectivity between the two. Are they on the same network/switch, or a thousand miles apart? What is the maximum throughput you can get between the two?

A best-case scenario places both systems on the same network, in the same datacenter, and ideally on the same switch, at 1Gbps. Let’s say the network switch is not oversubscribed and can deliver the full 1Gbps in both directions (using full-duplex links). This means you can expect to pass about 100MB/s in each direction.

Say the URL you’re testing returns a 20KB response. You fire up ab, point it at the aiScaler instance on the other server, and tweak the number of load-generating threads for a while, but quickly grow frustrated: aiScaler doesn’t seem to be able to handle more than 5000 RPS, no matter what you do. Both servers are barely registering any load during the test, so you suspect something must be seriously wrong somewhere. A smirk might cross your face: another overhyped product, clearly incapable of reaching its stated performance numbers; you’ve seen plenty of those before. Guess what: nothing is wrong. At 5000 RPS, each with a 20KB response payload, you have fully saturated your network! Re-read the previous sentence until you fully understand what is happening.
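A quick back-of-envelope check confirms it: with roughly 100MB/s of usable bandwidth and 20KB per response, the wire itself caps the test at about 5000 RPS.

# max RPS ≈ usable bandwidth / response size
# 100MB/s = 102400KB/s; 102400 / 20 ≈ 5120 RPS
echo $((100 * 1024 / 20))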

Now, if the two servers are in different datacenters, you can end up with much smaller RPS numbers. In more severe cases, you become limited not only by available bandwidth, but also by latency.

Beware of Load Generator Limitations

Let’s say you have two servers: one running aiScaler and the other used to generate the load. Both are on the same network/switch, and you have the full 1Gbps of bandwidth available. The response size is a mere 40 bytes, headers and all, so by simple math you expect to see around 250,000 RPS when performing the stress test.

In reality, however, you will likely discover that a single load client is incapable of generating enough load to saturate the aiScaler server. You’ll likely need four or more load clients to reach 250K RPS.
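One simple way to fan the load out is to drive several load clients in parallel over ssh (the hostnames and per-client settings below are hypothetical):

# start ab on four load clients at once, then wait for all to finish
for h in load1 load2 load3 load4; do
  ssh "$h" "ab -c 50 -n 1000000 http://a.b.com/a_url.html" &
done
wait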

General performance improvement suggestions:

Enable Caching and Compression. 
Assuming enough of your content is cacheable, you will discover that in most cases you are limited by the bandwidth available between the aiScaler servers and your clients. By all means, make sure you compress everything that is compressible; this way you can expect a 4-fold or better increase in RPS numbers.
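To verify that compression is actually being applied end to end, you can inspect the response headers; for example (same example URL as before):

# expect a "Content-Encoding: gzip" header if compression is working
curl -s -o /dev/null -D - -H "Accept-Encoding: gzip" http://a.b.com/a_url.html | grep -i content-encoding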
Increase available network bandwidth. 
Consider connecting aiScaler servers at 10Gbps. A less expensive proposition is Ethernet trunking (aka NIC teaming) in load-balanced mode: with four 1Gbps interfaces you can have 4Gbps of available bandwidth (assuming the switches/routers are not overloaded). Of course, this assumes the bandwidth is available all the way to the end users. For example, having aiScaler servers connected at 2Gbps won’t do you any good if you only have a 1Gbps Internet link and all of your users are on the Internet.
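As a rough sketch, link aggregation on Linux can be configured with iproute2 along these lines; the interface names and the 802.3ad mode are assumptions, and the switch must be configured to match (e.g. LACP):

# aggregate eth0 and eth1 into a single logical bond0 link
ip link add bond0 type bond mode 802.3ad
ip link set eth0 down && ip link set eth0 master bond0
ip link set eth1 down && ip link set eth1 master bond0
ip link set bond0 up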
Increase system-wide and per-process number of file descriptors. 
Make sure to bump up the number of open file descriptors available to aiScaler servers if you expect to serve any serious number of RPS; see the aiScaler User Guide for more information.

Another limit on the maximum number of open connections might come from a system-wide limit. To see what it is set to, and to change it, read from or write to the /proc/sys/fs/file-max file:

# cat /proc/sys/fs/file-max
# echo 512000 > /proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
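Writes to /proc do not survive a reboot; to make the setting persistent, the usual approach is a sysctl entry (the exact config file location can vary by distribution):

# persist the limit across reboots, then apply it
echo "fs.file-max = 512000" >> /etc/sysctl.conf
sysctl -p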



Increase the number of per-process open file descriptors via the ulimit command. 


ulimit -n 128000
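ulimit only affects the current shell and its children; to persist the limit for the account aiScaler runs under, an entry in /etc/security/limits.conf is the common approach (the "aiscaler" user name is hypothetical):

# /etc/security/limits.conf
aiscaler  soft  nofile  128000
aiscaler  hard  nofile  128000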


Breaking the 64K open connections limit. 
The operating system normally imposes a limit of roughly 64K open connections per server IP address. While you’re unlikely to see this become an issue under most traffic conditions, extreme traffic volume might require maintaining a higher number of open connections. If that’s the case, you’d need to set up multiple IPs on a single aiScaler server (and remember that a single instance of anything is never a good idea) or set up multiple aiScaler servers to handle the traffic. You can then direct the traffic at the multiple IP addresses via multiple DNS A-records.
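Adding an extra service IP to an interface is straightforward (the address and interface below are examples); each additional IP then gets its own DNS A-record:

# add a second service IP to the same interface
ip addr add 192.0.2.20/24 dev eth0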


Decrease TIME_WAIT timeout. 
When running such high-connection-rate websites, you will discover that at any point in time you might have thousands upon thousands of connections in the TIME_WAIT state. Such connections do not translate into any extra load on aiScaler, but should they create a problem for your setup, you can try reducing their number by turning on the client_linger and os_linger options.
With these options set, the TCP connection close takes a shortcut: instead of an orderly termination, a TCP reset is sent and the connection is disposed of immediately, without going through the TIME_WAIT state. You must test this before enabling it in a production setting, as some client browsers might not appreciate getting such TCP resets.
However, this should be much safer for origin server connections, as by the time aiScaler issues a reset, it has already obtained a complete response from the origin server. In addition, it is not only aiScaler that will show a reduced number of connections in TIME_WAIT; origin servers will see a similar reduction. Yet again, please test before enabling this in production.
Alternatively, and/or in addition, you can explore lowering the kernel’s tcp_fin_timeout value (some heavy-traffic sites set it as low as 1 second):

echo 5 > /proc/sys/net/ipv4/tcp_fin_timeout



Client-Side Keep-Alive connections. 
When client requests are such that a number of them are likely to be issued in rapid succession, we recommend enabling client-side HTTP Keep-Alive.
Normally, aiScaler does not force a connection close after serving a response to a client, allowing clients to send more requests over the already-established connection. This is known as connection persistence, or the Keep-Alive feature. It can speed up user access to your website by amortizing the TCP connection-establishment overhead over many client requests.
Instead of closing the client connection after serving a response, aiScaler indicates to clients, via the Keep-Alive HTTP header, its preference for how long the connection may be kept open by the client. However, not all clients (read: browsers) obey this hint. To prevent an abnormally high number of open, yet mostly idle, client connections having to be maintained by aiScaler and the server’s operating system, you can configure aiScaler to drop idle Keep-Alive connections when there has been no client input on a particular connection for more than maxkeepalivetime seconds. Default value: 10 seconds.
You can limit the maximum number of requests allowed to be served over a single Keep-Alive client connection via the maxkeepalivereq directive in the global and/or website sections of the configuration file. Default value: 20 requests.
aiScaler reports the number of Keep-Alive requests served over each Keep-Alive client connection in its access log file. If you’re the kind of person who likes to fine-tune things, try setting increasing maxkeepalivereq values until the reported number of served Keep-Alive requests stops growing. Doing so will speed up page loads for your visitors.
Sometimes you know that a certain request URL results in a response that is unlikely to be followed by additional Keep-Alive requests from the client. In this case, letting the connection go into the Keep-Alive state is a straight waste of resources, as it is never used again. For such URLs, set the conn_close option in the matching pattern section.
You can gauge the effectiveness of client Keep-Alive connections by observing the average number of client requests per client connection; it is reported in both the Global and Website sections of the self-refreshing Web monitoring pages. The higher the number, the better the experience for your visitors. However, a larger number might mean more open connections on your aiScaler server, so you might need to adjust aiScaler’s client Keep-Alive settings to find a middle ground.
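As a minimal sketch, assuming these directives take a single numeric value like the other directives shown in this guide, a tuned global or website section might include (values are illustrative, not recommendations):

maxkeepalivetime 15
maxkeepalivereq 40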


Server-Side Keep-Alive connections. 
aiScaler is capable of maintaining Keep-Alive connections to origin servers. As with client-side Keep-Alive connections, origin server (OS) Keep-Alive connections speed up request processing by amortizing the TCP connection-establishment overhead over many requests. It is less beneficial than client-side Keep-Alive, however, as aiScaler and origin servers are frequently located not only in the same hosting facility/datacenter, but attached to the same switch – with latency measured in microseconds, not the tens or hundreds of milliseconds often seen on client-side connections. However, if the origin servers are hosted in a geographically remote datacenter, a significant latency away, consider using OS Keep-Alive; it will make responses much faster.
Please note that you should test the origin server Keep-Alive feature before turning it on for production use. You must ensure that the origin web servers support Keep-Alive connections (most do), and also configure them to persist for a reasonably long time and number of requests, to realize maximum benefits.
To specify the maximum number of open Keep-Alive connections to maintain per origin server, set:

max_os_ka_connections NNN

To specify the maximum number of requests allowed per Keep-Alive origin server connection, set:

max_os_ka_req MMM

Please note that you might see very long response times with conditional Keep-Alive requests to some origin servers. To fix this, simply set max_os_ka_req to 0 at the global or website level:

max_os_ka_req 0

Most origin web servers are configured to use a Keep-Alive connection for only a few seconds before discarding it and requiring a new connection to be opened. The reasons for this include logic “resets” (to catch and stop any memory leaks in the application code) and reducing the number of processes and open connections the origin server’s operating system needs to maintain. You might want to fine-tune aiScaler by setting the max_os_ka_time parameter in the global section to match the origin web server’s setting.
Please note that these OS Keep-Alive settings can be specified at the global or website level, with the latter overriding global values.
You can judge the effectiveness of origin server Keep-Alive connections by observing the average number of OS requests per OS connection; it is reported in both the Global and Website sections of the self-refreshing Web monitoring pages. The higher the number, the faster it is to obtain a response from the origin server, as the overhead of the TCP connection handshake is reduced (amortized). Again, the savings are most noticeable when aiScaler and the origin servers are located in different datacenters.
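Putting the origin server Keep-Alive directives together, a sketch at the global or website level might look like this (the values are illustrative, not recommendations):

max_os_ka_connections 64
max_os_ka_req 100
max_os_ka_time 5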

