ProxySQL Faster Restart

  • Date:
  • Tags: proxy mysql high availability

tl;dr

ProxySQL 1.4.1 can be restarted in less than 15 milliseconds, with minimal impact on availability.

ProxySQL fast restart

ProxySQL is often used as the default high availability proxy for MySQL in production, so it is crucial that it is highly available itself.
It is currently possible to deploy ProxySQL so that it is not a single point of failure in production, and there are resources available on this topic.

No matter how highly available ProxySQL is, a crash is always possible due to unknown bugs.
For this reason, ProxySQL has always had the ability to auto-restart in case of failure.
But how quick can the restart be?
Historically, it could take around 1 second before the restart itself was triggered, plus some initialization time. Initialization time varies depending on the number of users in mysql_users, the number of servers in mysql_servers, and whether ProxySQL is running on a fast or slow disk.
In a production setup this proved to be a long time, so this weekend's project was to improve the restart time.

To test how quickly proxysql can be restarted in case of failure, I wrote a small application that creates one or more threads, connects and disconnects at a regular interval, and optionally runs simple SELECT statements.
The reason I didn't use sysbench for this is that I wanted the option to only connect and disconnect (without sending queries), to consider the event completed when the connection completes, and to define the reconnect interval. (But also to get some experience using histograms for future projects.)
This helped to identify bottlenecks and fix them.
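The connect/disconnect loop and the latency histogram described above can be sketched roughly as follows. This is not the connect_speed source: it is a minimal single-threaded Python illustration that only times the TCP connect (no MySQL handshake, no query), and the names `probe`, `histogram`, and `format_bucket`, as well as the fixed 0.1 ms bucket width, are assumptions made for the example.

```python
import socket
import time
from collections import Counter


def probe(host, port, count, interval_us, timeout_s=1.0):
    """Open and close `count` TCP connections, one every `interval_us` microseconds.

    Returns (ok, err, latencies_ms). Only the TCP connect is timed; this
    sketch does not speak the MySQL protocol or run any query.
    """
    ok, err, latencies_ms = 0, 0, []
    interval_s = interval_us / 1_000_000
    next_start = time.monotonic()
    for _ in range(count):
        # Wait until the scheduled start of this attempt.
        delay = next_start - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout_s):
                pass  # connect, then disconnect immediately
            ok += 1
            latencies_ms.append((time.monotonic() - start) * 1000)
        except OSError:
            # Refused, reset, or timed out: the proxy was not reachable.
            err += 1
        next_start += interval_s
    return ok, err, latencies_ms


def histogram(latencies_ms, width_ms=0.1):
    """Count latencies per fixed-width bucket, keyed by bucket index."""
    return Counter(int(ms / width_ms) for ms in latencies_ms)


def format_bucket(index, count, width_ms=0.1):
    """Render one bucket as a half-open interval with its count."""
    low, high = index * width_ms, (index + 1) * width_ms
    return f"[{low:g}ms, {high:g}ms): {count}"
```

A call such as `probe("127.0.0.1", 6033, 3000, 2000)` would mirror the `-h`, `-P`, `-c`, and `-i` values used in the run below, with the caveat that a successful TCP connect says less than a completed query does.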

Let’s have fun!
While running the following command, proxysql was being killed (kill -9) from another terminal:

./connect_speed -i 2000 -c 3000 -t 1 -u sbtest -p sbtest -h 127.0.0.1 -D sbtest -P 6033 -q 1 -H 0 -s
Connections[OK/ERR]: (2993,7) . Queries: (2993,0) . Clock time: 6294ms
[0ms, 0.1ms): 5
[0.1ms, 0.2ms): 43
[0.2ms, 0.3ms): 450
[0.3ms, 0.4ms): 387
[0.4ms, 0.5ms): 399
[0.5ms, 0.6ms): 696
[0.6ms, 0.7ms): 646
[0.7ms, 0.8ms): 216
[0.8ms, 0.9ms): 72
[0.9ms, 1ms): 40
[1ms, 1.1ms): 24
[1.1ms, 1.2ms): 9
[1.2ms, 1.3ms): 4
[1.3ms, 1.4ms): 1
[1.4ms, 1.5ms): 3
[1.5ms, 1.6ms): 2
[1.7ms, 1.8ms): 1
[2ms, 4ms): 2

The above means:

  • -t 1 : one thread
  • -u , -p , -h , -D , -P : username, password, hostname, schema name, port
  • -i 2000 : a connection should be created and destroyed every 2000 microseconds = 2 milliseconds
  • -c 3000 : try to establish up to 3000 connections
  • -q 1 : execute 1 query per connection. This is important to ensure that proxysql is able not only to accept connections, but also to serve requests

Results: 7 connections failed, while 2993 connections were successful, including executing a query against the backend. With one attempt every 2 milliseconds, 7 failed attempts correspond to roughly 14 milliseconds. In other words, proxysql was not available for roughly 15 milliseconds.
Also, most of the requests completed between 0.2ms and 0.8ms, but this is not very relevant for this test.
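The arithmetic behind these two statements can be checked with a few lines of Python. The bucket counts are copied from the output above; the function names are made up for this example, and the downtime figure assumes that all failed attempts fall inside a single outage window, one per reconnect interval.

```python
def estimated_downtime_ms(failed, interval_us):
    """Rough estimate: each failed attempt accounts for one reconnect interval."""
    return failed * interval_us / 1000


# Bucket counts from the run above, keyed by bucket lower bound in microseconds.
HISTOGRAM = {
    0: 5, 100: 43, 200: 450, 300: 387, 400: 399, 500: 696,
    600: 646, 700: 216, 800: 72, 900: 40, 1000: 24, 1100: 9,
    1200: 4, 1300: 1, 1400: 3, 1500: 2, 1700: 1, 2000: 2,
}


def share_between(hist, low_us, high_us):
    """Fraction of events whose bucket starts in [low_us, high_us)."""
    total = sum(hist.values())
    inside = sum(n for low, n in hist.items() if low_us <= low < high_us)
    return inside / total


print(estimated_downtime_ms(7, 2000))                 # 7 failures at -i 2000
print(round(share_between(HISTOGRAM, 200, 800), 3))   # share in 0.2ms to 0.8ms
```

With these inputs the estimate is 14 ms of unavailability, and about 93% of the 3000 recorded events fall between 0.2ms and 0.8ms.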

Conclusion

ProxySQL release 1.4.1 will be able to restart in roughly 15 milliseconds after a crash caused by an unknown bug. This drastically improves availability and minimizes the impact of a crash.

Note:

This simple application can also be used to compare connection times and identify regressions across different versions of ProxySQL, MySQL/MariaDB/Percona Server, and other proxies. I strongly recommend that others use it.
I will probably publish some comparisons in the future.