Fastsocket is a highly scalable socket implementation, together with its underlying networking code, for the Linux kernel. With linear scalability, Fastsocket delivers very good performance on multicore machines. It is also easy to use and maintain, and it has been deployed in the production environment of SINA.
With the rapid growth of NIC bandwidth and of the number of CPU cores in a single machine, a scalable TCP network stack is performance-critical. However, the stock Linux kernel does not scale well beyond 4 CPU cores. Worse, throughput can collapse when there are more than 12 CPU cores.
Fastsocket is a scalable kernel TCP socket implementation that achieves linear performance growth when scaling up to 24 CPU cores. The underlying kernel optimizations are transparent to socket applications, which means existing applications can take advantage of Fastsocket without changing their code.
Fastsocket is currently implemented in the Linux kernel (kernel-2.6.32-431.29.2.el6.x86_64) of CentOS-6.5, which is the latest version of Red Hat EL6, because CentOS-6.5 is our major production system. According to our evaluations, Fastsocket increases the throughput of Nginx and HAProxy (measured in connections per second) by 290% and 620% respectively on a 24-core machine, compared to the base CentOS-6.5 kernel.
Moreover, Fastsocket can take further advantage of the hardware:
Fastsocket (V1.0) has already been deployed in the SINA production environment. It is used with HAProxy to provide an HTTP load balancing service and has been running stably since March 2014. More details are in the Evaluation.
Fastsocket is released under GPLv2, and we promise never to ask for any payment for the use of our code.
The source code is available at https://github.com/fastos/fastsocket.git. Clone the repository by:
[root@localhost ~]# git clone https://github.com/fastos/fastsocket.git
Here is a brief introduction to the directories in the repository.
After the Fastsocket repository has been cloned, the following commands build and install the kernel. You can customize the config file, provided you are sure you are not leaving out an important component. Fastsocket builds smoothly on 64-bit CentOS-6.X systems. Problems may arise on 32-bit systems and on CentOS-7 systems.
[root@localhost ~]# cd fastsocket/kernel
[root@localhost kernel]# make defconfig
[root@localhost kernel]# make
[root@localhost kernel]# make modules_install
[root@localhost kernel]# make install
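Because the build is only reported to go smoothly on 64-bit CentOS-6.X, a quick platform check before building may save time. The following is a minimal sketch and not part of the Fastsocket repository: the function name is hypothetical, and it assumes that /etc/redhat-release on CentOS 6 begins with "CentOS release 6.".

```shell
#!/bin/sh
# Hypothetical pre-build check (not shipped with Fastsocket).
# is_supported_platform ARCH RELEASE_STRING
# Returns 0 only for x86_64 with a release string naming CentOS 6.x.
is_supported_platform() {
    arch=$1
    release=$2
    [ "$arch" = "x86_64" ] || return 1
    case "$release" in
        "CentOS release 6."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running system; a missing release file counts as unverified.
if is_supported_platform "$(uname -m)" "$(cat /etc/redhat-release 2>/dev/null)"; then
    echo "platform looks supported"
else
    echo "platform not verified; the build may fail"
fi
```

The check is only advisory: it says nothing about whether the kernel config suits your hardware.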
Enter the library directory and make the library:
[root@localhost fastsocket]# cd library
[root@localhost library]# make
After that, libfsocket.so is created in the same directory.
When the installation is done, remember to modify the GRUB configuration file so that the system boots the Fastsocket kernel, then reboot.
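On CentOS 6 the boot loader is GRUB legacy, whose configuration normally lives in /boot/grub/grub.conf and selects the boot entry through a zero-based default= index. The sketch below shows one way to point that index at a chosen entry. The helper names are hypothetical, the title text of the Fastsocket entry depends on your build, and the use of sed -i assumes GNU sed, so treat it as an illustration and back up the file first.

```shell
#!/bin/sh
# Hypothetical helpers for a GRUB legacy grub.conf (not part of Fastsocket).

# grub_entry_index FILE PATTERN
# Prints the zero-based index of the first "title" line containing PATTERN.
# Returns non-zero if no title matches.
grub_entry_index() {
    awk -v pat="$2" '
        /^[[:space:]]*title/ {
            if (index($0, pat)) { print n + 0; found = 1; exit }
            n++
        }
        END { if (!found) exit 1 }
    ' "$1"
}

# set_grub_default FILE PATTERN
# Rewrites the default= line so that the matching entry boots by default.
set_grub_default() {
    idx=$(grub_entry_index "$1" "$2") || return 1
    sed -i "s/^default=.*/default=$idx/" "$1"
}

# Example (run as root; PATTERN is whatever identifies your new kernel entry):
#   cp /boot/grub/grub.conf /boot/grub/grub.conf.bak
#   set_grub_default /boot/grub/grub.conf "PATTERN"
```

Searching by title rather than hard-coding an index keeps the step valid when other kernels are installed or removed later.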
After booting into the kernel with Fastsocket, load the Fastsocket module with default parameters:
[root@localhost ~]# modprobe fastsocket
For more details on the module parameters, please refer to Module.
There are two ways to check whether the module has been loaded successfully:
[root@localhost ~]# lsmod | grep fastsocket
fastsocket 23145 0
[root@localhost ~]# dmesg | tail
Fastsocket: Load Module
Fastsocket: Enable Listen Spawn[Mode-2]
Fastsocket: Enable Recieve Flow Deliver
Fastsocket: Enable Fast Epoll
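For scripted deployments, the same check can be made without reading the output by eye. The sketch below is a hypothetical helper, not part of Fastsocket, that looks for the module name in /proc/modules, the file lsmod itself reads.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Fastsocket).
# module_loaded NAME [FILE]
# Returns 0 if NAME appears as a module name in FILE (default /proc/modules).
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Example:
#   if module_loaded fastsocket; then echo "fastsocket is loaded"; fi
```

Matching on the first field exactly avoids false positives from modules whose names merely contain "fastsocket".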
Run nic.sh provided in the scripts directory of the repository to take care of remaining configuration.
[root@localhost ~]# cd fastsocket
[root@localhost fastsocket]# scripts/nic.sh -i eth0
Here eth0 is the interface to be used; change it to match your system configuration. The script automatically checks system and NIC parameters and then configures various features.
If you are interested in how nic.sh works, please refer to Scripts.
Generally, scenarios meeting the following conditions will benefit the most from Fastsocket (V1.0):
Meanwhile, we are developing Fastsocket to improve the network stack performance in more general scenarios. You can refer to New Features.
Fastsocket is enabled by preloading a shared library named libfsocket.so when launching an application. For example, nginx can be started with Fastsocket by:
[root@localhost fastsocket]# cd library
[root@localhost library]# LD_PRELOAD=./libfsocket.so nginx
Without the preloaded library, applications run exactly as they would on the original kernel, which provides a very quick rollback if one is needed.
[root@localhost ~]# nginx
For more information about the library, please refer to Library.
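Because enabling and rolling back differ only in whether LD_PRELOAD is set, the choice can be wrapped in a small launcher. The following is a minimal sketch with a hypothetical function name. It assumes you pass the path to libfsocket.so yourself, and it falls back to a plain launch if the library cannot be read.

```shell
#!/bin/sh
# Hypothetical launcher (not shipped with Fastsocket).
# run_with_fastsocket LIB COMMAND [ARGS...]
# Runs COMMAND with LIB preloaded if LIB is readable, otherwise runs it plainly.
run_with_fastsocket() {
    lib=$1
    shift
    if [ -r "$lib" ]; then
        LD_PRELOAD="$lib" "$@"
    else
        echo "warning: $lib not readable, running without Fastsocket" >&2
        "$@"
    fi
}

# Example:
#   run_with_fastsocket /path/to/libfsocket.so nginx
```

Setting LD_PRELOAD only for the launched command, rather than exporting it, keeps the preload from leaking into unrelated processes started from the same shell.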
Here we list a few applications that are working fine with Fastsocket:
We also use Fastsocket on the load generators in our benchmark tests. This is very helpful because Fastsocket greatly increases the maximum workload that can be generated from a single machine, which saves machines and operational effort. These load generators are:
We provide a demo server in the demo directory of the repository. The demo server does nothing but read messages from and write messages to network sockets, and it is used purely to study and benchmark the performance of the Linux kernel network stack. While running, it consumes little user CPU time, which makes it a well-suited application for observing network stack performance.
Moreover, it is also used to demonstrate the scalability and performance improvement of Fastsocket over the base Linux kernel.
For more information about the demo server, please refer to Demo.
Some important configurations:
Note: You should disable accept_mutex. With the default Fastsocket module parameters, the listen socket is partitioned, so there is no need to force workers to accept connections one at a time. If some CPU never receives packets, in particular TCP SYN packets, through RPS or other means, nginx could fail to accept new connections indefinitely with accept_mutex enabled. If you do want to load-balance accepts with nginx's accept_mutex, make sure new requests can be delivered to every CPU.
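In nginx, this setting is the accept_mutex directive in the events block, and "accept_mutex off;" disables it. The sketch below is a hypothetical helper that checks a single configuration file for that line. It does not follow include directives and expects the directive to start its own line, so it is a quick sanity check rather than a full parser.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Fastsocket or nginx).
# accept_mutex_disabled FILE
# Returns 0 if FILE contains an uncommented "accept_mutex off;" line.
accept_mutex_disabled() {
    grep -Eq '^[[:space:]]*accept_mutex[[:space:]]+off[[:space:]]*;' "$1"
}

# Example (the path is an assumption; use your own nginx.conf location):
#   accept_mutex_disabled /etc/nginx/nginx.conf || echo "accept_mutex is not disabled"
```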
As the figure below shows, Fastsocket on 24 CPU cores achieves 475K connections per second (cps), a speedup of 21X. The throughput of the base CentOS-6.5 kernel increases non-linearly up to 12 CPU cores and then drops dramatically to 159K cps with 24 CPU cores. The latest 3.13 kernel nearly doubles the throughput of the base CentOS-6.5 kernel, to 283K cps, when using 24 CPU cores. However, it has not completely removed the scalability bottlenecks, which prevents it from scaling beyond 12 CPU cores.
Some important configurations:
As shown in the same figure, Fastsocket scales very well, much as in the previous Nginx case. Fastsocket outperforms Linux 3.13 by 139K cps and the base CentOS-6.5 kernel by 370K cps when using 24 CPU cores, although the single-core throughput is very close across all three kernels.
As mentioned before, Fastsocket has already been deployed in the SINA production environment. One typical scenario is using Fastsocket with HAProxy to provide an HTTP load balancing service to WEIBO and other SINA products.
The figure below shows the CPU utilization of an 8-core server over 24 hours. Figure (a) shows the CPU utilization before deploying Fastsocket and figure (b) shows the CPU utilization after deploying Fastsocket.
The figure shows what changed after Fastsocket was deployed:
Moreover, since the server is an old 8-core machine, we expect Fastsocket to deliver a larger performance improvement on a machine with more CPU cores (this has already been observed on a 12-core machine after updating Fastsocket).
We are now improving network stack efficiency for long-lived TCP connections. Four more features are introduced:
We evaluated our new work on Redis, which is a typical and popular key-value cache application.
Some important configurations:
The 8-redis-instance test shows:
Mailing-list: [email protected]
Google Group: fastsocket-dev (https://groups.google.com/forum/#!forum/fastsocket-dev)
Sending a mail to the address above will subscribe you to the mailing list. The subject and message do not matter.