
Linux Broadband Tweaks

Raising network limits for high speed, high latency networks under Linux
2003-04-01 (updated: 2021-01-07)

The TCP/IP parameters for tweaking a Linux-based machine for fast internet connections are located in /proc/sys/net/... (assuming a 2.1+ kernel). This location is volatile: changes are reset at reboot. There are a couple of methods for reapplying the changes at boot time, illustrated below.


Locating the TCP/IP parameters

All TCP/IP tuning parameters are located under /proc/sys/net/...  For example, here is a list of the most important tuning parameters, along with a short description of their meaning:

/proc/sys/net/core/rmem_max - Maximum socket receive buffer (caps the TCP Receive Window)
/proc/sys/net/core/wmem_max - Maximum socket send buffer (caps the TCP Send Window)
/proc/sys/net/ipv4/tcp_rmem - memory reserved for TCP receive buffers
/proc/sys/net/ipv4/tcp_wmem - memory reserved for TCP send buffers
/proc/sys/net/ipv4/tcp_timestamps - Timestamps (RFC 1323) add 12 bytes to the TCP header...
/proc/sys/net/ipv4/tcp_sack - TCP Selective Acknowledgements. They can reduce retransmissions; however, they make servers more prone to DoS attacks and increase CPU utilization.
/proc/sys/net/ipv4/tcp_window_scaling - support for large TCP Windows (RFC 1323). Needs to be set to 1 if the Max TCP Window is over 65535.
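All of these can be inspected without root. A minimal sketch reading a few of the paths listed above (the values printed will of course vary by system):

```shell
#!/bin/sh
# Print a few of the tunables listed above by reading /proc directly.
# Reading needs no root; 2>/dev/null keeps the loop quiet if a
# parameter is missing on this kernel.
for f in core/rmem_max core/wmem_max ipv4/tcp_window_scaling; do
  printf '%s = %s\n' "$f" "$(cat /proc/sys/net/$f 2>/dev/null || echo 'n/a')"
done
```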

Keep in mind everything under /proc is volatile, so any changes you make are lost after reboot. There are some additional internal memory buffers for the TCP Window, allocated for each connection:

/proc/sys/net/ipv4/tcp_rmem - memory reserved for TCP rcv buffers (reserved memory per connection default)
/proc/sys/net/ipv4/tcp_wmem  - memory reserved for TCP snd buffers (reserved memory per connection default)


The tcp_rmem and tcp_wmem parameters each contain an array of three values, representing the minimum, default, and maximum memory per connection. Those three values are used to bound autotuning and balance memory usage while under global memory stress.
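For example, the receive-buffer triple can be read and labelled with a couple of lines of shell (a sketch; the fallback sample values are illustrative only, used when /proc is unavailable):

```shell
#!/bin/sh
# Split the tcp_rmem triple into labelled min/default/max fields.
line=$(cat /proc/sys/net/ipv4/tcp_rmem 2>/dev/null || echo "4096 131072 6291456")
set -- $line
echo "min=$1 default=$2 max=$3"
```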


Applying TCP/IP Parameters at System Boot

TCP/IP parameters in Linux are located in /proc/sys/net/ipv4 and /proc/sys/net/core . This is part of the virtual filesystem, which resides in system memory (RAM); any changes to it are volatile and are reset when the machine is rebooted.

There are two methods we can use to apply the settings at each reboot. First, we can edit /etc/sysctl.conf (or /etc/sysctl.d/sysctl.conf, depending on your distribution). Parameters in this file are set using sysctl syntax, as follows:

net.core.rmem_default = 256960
net.core.rmem_max = 256960
net.core.wmem_default = 256960
net.core.wmem_max = 256960
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_window_scaling = 1

You can see a list of all tweakable parameters by executing the following in your terminal: sysctl -a | grep tcp  (or simply sysctl -a for a full list).

Alternatively, you can apply the settings at boot time by editing /etc/rc.local, /etc/rc.d/rc.local, or /etc/boot.local, depending on your distribution. Note the difference in syntax: you simply echo the appropriate value into the virtual file system. The TCP/IP parameters should be self-explanatory: we're setting the TCP Window to 256960, disabling timestamps (to avoid the 12-byte header overhead), disabling selective acknowledgements, and enabling TCP window scaling:

echo 256960 > /proc/sys/net/core/rmem_default
echo 256960 > /proc/sys/net/core/rmem_max
echo 256960 > /proc/sys/net/core/wmem_default
echo 256960 > /proc/sys/net/core/wmem_max
echo 0 > /proc/sys/net/ipv4/tcp_timestamps
echo 0 > /proc/sys/net/ipv4/tcp_sack
echo 1 > /proc/sys/net/ipv4/tcp_window_scaling

You can change the above example values as desired, depending on your internet connection and maximum bandwidth/latency. There are other parameters you can change from the default if you're confident in what you're doing - just find the correct syntax of the values in /proc/sys/net/... and add a line in the above code analogous to the others. To revert to the default parameters, you can just comment or delete the above code from /etc/rc.local and restart.
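A sensible maximum window can be estimated from the bandwidth-delay product (BDP) of your link: the amount of data that can be "in flight" at once. A minimal sketch of the arithmetic, assuming an example 100 Mbps connection with 60 ms round-trip latency:

```shell
#!/bin/sh
# Estimate a TCP window from the bandwidth-delay product (BDP).
BANDWIDTH_BPS=100000000   # assumed link speed: 100 Mbps
RTT_MS=60                 # assumed round-trip time: 60 ms
# BDP (bytes) = bandwidth (bits/s) / 8 * RTT (s)
BDP_BYTES=$(( BANDWIDTH_BPS / 8 * RTT_MS / 1000 ))
echo "Suggested maximum TCP window: $BDP_BYTES bytes"
```

A window smaller than the BDP caps throughput at roughly window/RTT, which is why high-bandwidth, high-latency links need larger buffers.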

Note: To manually set the MTU value under Linux, use the command: ip link set dev eth0 mtu 1500  (or the legacy: ifconfig eth0 mtu 1500), where 1500 is the desired MTU size.


Changing Current Values

While testing, the current TCP/IP parameters can be edited without the need for reboot in the following locations:

/proc/sys/net/core/
rmem_default = Default Receive Window
rmem_max = Maximum Receive Window
wmem_default = Default Send Window
wmem_max = Maximum Send Window

/proc/sys/net/ipv4/
You'll find timestamps, window scaling, selective acknowledgements, etc.

Keep in mind the values in /proc will be reset upon reboot. You still need to add the settings to sysctl.conf (or the alternate echo syntax to rc.local) to have the changes applied at each boot, as described in the section above.

To make any new sysctl.conf changes take effect without rebooting the machine, you can execute:

sysctl -p

To see a list of all relevant tweakable sysctl parameters, along with their current values, try the following in your terminal:

sysctl -a | grep tcp

To set a single sysctl value:

sysctl -w variable=value
example:  sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30


TCP Parameters to consider

TCP_FIN_TIMEOUT
This setting determines how long a connection stays in the FIN-WAIT-2 state before the kernel releases it (despite the name, it does not govern the TIME_WAIT state). By reducing the value of this entry, TCP/IP can release half-closed connections faster, making more resources available for new connections. Consider adjusting it when many connections sit in the FIN-WAIT-2 state:

sysctl.conf syntax:
net.ipv4.tcp_fin_timeout = 15

(default: 60 seconds, recommended 15-30 seconds)

alternative rc.local syntax:
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout


TCP_KEEPALIVE_INTERVAL
This determines the wait time between isAlive interval probes. To set:

sysctl.conf syntax:
net.ipv4.tcp_keepalive_intvl = 30

(default: 75 seconds, recommended: 15-30 seconds)

alternative rc.local syntax:
echo 30 > /proc/sys/net/ipv4/tcp_keepalive_intvl


TCP_KEEPALIVE_PROBES
This determines the number of probes before timing out. To set:

sysctl.conf syntax:
net.ipv4.tcp_keepalive_probes = 5

(default: 9, recommended 5)

alternative rc.local syntax:
echo 5 > /proc/sys/net/ipv4/tcp_keepalive_probes
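Together with tcp_keepalive_time (7200 seconds by default, the idle time before the first probe is sent), these two knobs determine how long a dead peer can go undetected. A quick sketch of the arithmetic, using the recommended values above:

```shell
#!/bin/sh
# Dead-peer detection time =
#   tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl
KEEPALIVE_TIME=7200    # kernel default: idle seconds before first probe
KEEPALIVE_INTVL=30     # recommended value from above
KEEPALIVE_PROBES=5     # recommended value from above
total=$(( KEEPALIVE_TIME + KEEPALIVE_PROBES * KEEPALIVE_INTVL ))
echo "A dead peer is declared after ${total}s of silence"
```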


TCP_TW_RECYCLE
It enables fast recycling of TIME_WAIT sockets. The default value is 0 (disabled). The sysctl documentation incorrectly states the default as enabled. Known to cause issues with NAT'ed clients and with load balancing and failover (e.g. hoststated) if enabled, so it should be used with caution. Note: tcp_tw_recycle was removed from the kernel in version 4.12, so this parameter no longer exists on current systems; prefer tcp_tw_reuse below.

sysctl.conf syntax:
net.ipv4.tcp_tw_recycle=1

(boolean, default: 0)

alternative rc.local syntax:
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle


TCP_TW_REUSE
This allows reusing sockets in the TIME_WAIT state for new connections when it is safe from a protocol viewpoint. The default value is 0 (disabled). It is generally a safer alternative to tcp_tw_recycle.

sysctl.conf syntax:
net.ipv4.tcp_tw_reuse=1

(boolean, default: 0)

alternative rc.local syntax:
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse

Note: The tcp_tw_reuse setting is particularly useful in environments where numerous short connections are open and left in TIME_WAIT state, such as web servers. Reusing the sockets can be very effective in reducing server load.


Linux Netfilter Tweaks

Try this for a list of netfilter parameters:  sysctl -a | grep netfilter

We can add the following commands to the /etc/sysctl.conf file to tune individual parameters, as follows.
To reduce the number of connections in TIME_WAIT state, we can decrease the number of seconds connections are kept in this state before being dropped:

# reduce TIME_WAIT from the 120s default to 30-60s
net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
# reduce FIN_WAIT from the 120s default to 30-60s
net.netfilter.nf_conntrack_tcp_timeout_fin_wait=30

You can commit the sysctl.conf changes without rebooting (and test for possible syntax errors) by executing: sysctl -p
To check sysctl parameters, use: sysctl -a

Misc Notes: You may want to reduce net.netfilter.nf_conntrack_tcp_timeout_established (5 days by default) to 900 or some other manageable number as well.
To check the actual number of current connections in the TIME_WAIT state, for example, try: netstat -n | grep TIME_WAIT | wc -l


Kernel Recompile Option

There is another method one can use to directly set the default TCP/IP parameters, involving kernel recompile... If you're brave enough. Look for the parameters in the following files:
/LINUX-SOURCE-DIR/include/linux/skbuff.h  (Look for SK_WMEM_MAX & SK_RMEM_MAX)
/LINUX-SOURCE-DIR/include/net/tcp.h (Look for MAX_WINDOW & MIN_WINDOW)


Determine Connection States

It is often useful to decrease some of the TCP timeouts to release resources faster and reduce memory use; the default TCP timeouts may leave too many connections in the TIME_WAIT state. To see a list of all current connections to the machine and their states, try:

netstat -tan | grep ':80 ' | awk '{print $6}' | sort | uniq -c

You will be presented with a list similar to the following:

  4 CLOSING
 12 ESTABLISHED
  4 FIN_WAIT1
 14 FIN_WAIT2
 12 LAST_ACK
  1 LISTEN
 10 SYN_RECV
273 TIME_WAIT

This information can be very useful to determine whether you need to tweak some of the timeouts above.
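The pipeline above can also be wrapped in a small function that reads from stdin, so it can be exercised against captured output. A sketch tallying the state column (field 6) of netstat -tan output:

```shell
#!/bin/sh
# count_states: tally the 6th column (connection state) of
# `netstat -tan`-style input read from stdin. The regex skips
# header lines, whose 6th field is not an all-caps state name.
count_states() {
  awk '$6 ~ /^[A-Z_0-9]+$/ { counts[$6]++ }
       END { for (s in counts) printf "%5d %s\n", counts[s], s }'
}
# Typical use on a live system:
#   netstat -tan | count_states
```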


SYN Flood Protection

These settings, added to sysctl.conf, will make a server more resistant to SYN flood attacks. They configure the kernel to use the SYN cookies mechanism with a backlog queue of 1024 connections, and set the SYN and SYN/ACK retries to an effective ceiling of about 45 seconds. The defaults for these settings vary depending on kernel version and distribution; you may want to check them with: sysctl -a | grep syn

net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_syn_retries = 6
net.ipv4.tcp_synack_retries = 3
net.ipv4.tcp_syncookies = 1

Notes: The SYN retry interval under Linux doubles each time (exponential backoff), so with a 3-second initial timeout, successive retries fire 3s, 6s, 12s, 24s, and 48s apart. Under BSD-derived kernels (including Mac OS X), the retry interval triples instead.
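The doubling backoff is easy to verify with a few lines of shell arithmetic (assuming a 3-second initial timeout as in the figures above; newer kernels use a 1-second initial RTO per RFC 6298, so scale accordingly):

```shell
#!/bin/sh
# Print the exponential SYN retransmission schedule: each retry
# waits twice as long as the previous one.
timeout=3
total=0
for retry in 1 2 3 4 5; do
  total=$(( total + timeout ))
  echo "retry $retry fires ${timeout}s later (${total}s elapsed)"
  timeout=$(( timeout * 2 ))
done
```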



TCP Congestion Control Algorithm

By default, the Linux kernel uses the CUBIC congestion control/avoidance algorithm, which is very good and probably the most popular. There is a promising newer alternative: BBR (now on version 2). You can read more about it in our Congestion Control Algorithms Comparison article. Note that BBR is comparatively new and its implementation is still being tweaked: it was first introduced in kernel 4.9, and it received improvements as recently as November 2020, in kernel 5.9.11.

To see your current kernel version, in terminal type:  uname -r
To view a list of available congestion control algorithms: sysctl net.ipv4.tcp_available_congestion_control
To see your current congestion control algorithm:  sysctl net.ipv4.tcp_congestion_control

Note that only CUBIC and RENO are available by default. To change to BBR with kernels 4.9 and newer, you need to edit /etc/sysctl.conf and add the following two lines to the bottom of it:

net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr

After that, reload sysctl with the command:  sysctl -p

You should now see the new congestion control algorithm in use (with the command: sysctl net.ipv4.tcp_congestion_control).
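A small sketch to confirm the change took effect; it only reads /proc, so no root is required (the modprobe hint assumes BBR was built as the tcp_bbr module):

```shell
#!/bin/sh
# Report the available and active congestion control algorithms.
avail=$(cat /proc/sys/net/ipv4/tcp_available_congestion_control 2>/dev/null)
current=$(cat /proc/sys/net/ipv4/tcp_congestion_control 2>/dev/null)
echo "available: $avail"
echo "current:   $current"
case " $avail " in
  *" bbr "*) echo "BBR is available on this kernel" ;;
  *)         echo "BBR not listed; try loading it with: modprobe tcp_bbr" ;;
esac
```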



References

TCP Variables
See also the complete ip-sysctl parameters reference in the Linux kernel documentation (Documentation/networking/ip-sysctl).

  User Reviews/Comments:
by anonymous - 2008-03-09 18:44
Two things:

1) This threw me at first: Changing the referenced files in a root kate seemed to make no difference in the files; they stayed the same as far as I could tell. But the command cat /proc/sys/net/core/wmem_max (or whichever file) showed the changes. This might have only been an oddity in my setup: Mandriva 2008.

2) This must not work for kernels 2.17 and newer. See http://www.psc.edu/networking/projects/tcptune/#Linux . The article says these kernels auto-tune download speeds. I made the recommended changes but could detect no difference in speeds. Then I rebooted without any of the changes in place, used the cat command, and found the files had been set to 131071. However, the TCP/IP analyzer says it's set at 5888. This I don't understand. Also, I discovered the files' settings change infrequently by themselves.

In faith, Dave
by Breno Leitao - 2008-04-12 13:16
Well, /proc/sys/net/core/rmem_max isn't the maximum *TCP* receive window size; I'd say it is the maximum receive buffer for a generic protocol.
Moreover, this value limits tcp_rmem[max] if they're different.

--
Thanks
Breno
by darkglobe87 - 2008-08-12 19:03
Newbie alert!
I'm quite comfortable tweaking the win9x and 2000/xp reg to fine-tune my tcp/ip performance, but have recently switched to linux (xubuntu 8.04). Now, this guide may well mean something to more experienced users of linux, but... well it doesn't mean much at all to me.

I think this article should be simplified somewhat.

Is there an easier way to tune internet settings under linux??
by jsday187 - 2009-05-15 11:14
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Lazy Alert!!! Someone is capable of installing XUbuntu but not capable of copying and pasting a few commands into the console.

Suggestion ?

Go back to Windows Vista Home Premium!
by Vamsi - 2009-05-20 07:00
I have a setup with a client doing a TCP download from a server (both are Linux Fedora Core 10). I changed the rmem_max and wmem_max from 64KBytes to 256KBytes. My purpose is to check the throughput at different TCP Window sizes, but unfortunately I get the same at all sizes.

Let me know the place where I can configure the TCP Window size like in Iperf tool.
by Philip - 2009-05-20 08:38
Here are a couple of pointers:

1. To check the throughput difference there has to be substantial latency on the line, you will not be able to notice much on a LAN.

2. Linux has auto-tuning of the TCP Window. Even though you will get some benefit from tweaking the maximum TCP Window, you may only see the current (instead of maximum) TCP Receive Window by examining packets or using the TCP Analyzer.
by anonymous - 2009-08-13 06:20
The best article I've ever read about tuning tcp/ip connections.
by Sami Kerola - 2009-09-09 07:27
Below is my version of the socket state counter. It's basically the same as in the original article, but a bit more generic, and perhaps a microsecond quicker because fewer commands pipe into each other.

# netstat -an | awk '/^tcp/ {A[$(NF)]++} END {for (I in A) {printf "%5d %s\n", A[I], I}}'
 1962 TIME_WAIT
   18 FIN_WAIT1
    1 FIN_WAIT2
    7 ESTABLISHED
   12 SYN_RECV
    8 LISTEN
by anonymous - 2011-07-28 02:58
This is an awesome guide. :) the best over the internet.
by biru - 2011-09-11 11:45
i changed rwin value like the guide suggested it

biru@biru-EasyNote-MH35:~$ sudo gedit /etc/rc.local
[sudo] password for biru:
biru@biru-EasyNote-MH35:~$ sudo sysctl -p
net.core.rmem_default = 256960
net.core.rmem_max = 256960
net.core.wmem_default = 256960
net.core.wmem_max = 256960
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1


but testing my pc, i got this results

MTU = 1500
MTU is fully optimized for broadband.
MSS = 1460
Maximum useful data in each packet = 1460, which equals MSS.
Default TCP Receive Window (RWIN) = 14600

why??
by Philip - 2011-09-16 18:33
Linux (and new versions of Windows) have TCP Auto-tuning, which tunes the TCP Window value on the fly and the Analyzer can only detect the currently advertised RWIN value (during the TCP handshake). The advertised RWIN will keep expanding during large transfers.
by andrs chanda - 2012-12-07 12:53
Hi there, I've tried your advice, but as it is focused on internet connections, I think that's why it does not solve my problem, which is the following:

I have a little office network, where I have a file, printer, download and router server. I mean I use it as storage (NFS). By a web interface users can access to a mldonkey where they look for things they need to download. And all the network goes to the internet through it.
I recently increased the memory from 1GB to 3GB, and added a 2TB hard disk for storage. Since then, some workstations, especially the old ones, have problems accessing the files stored on the server: if a user watches a video, it stops at times and starts again after a while. Opening folders with huge amounts of data makes the Nautilus window turn gray for a while before showing the content, etc.
These kinds of things didn't happen with 1GB of RAM. I have read that large disks may affect the performance of the machine, but I suspect my problem is related to the RAM addition and the network.
I hope you can help me.
Thanks a lot.
by Angel Genchev - 2015-05-26 15:49
Beware: SACK is better turned OFF for servers. It helps only over a high-bandwidth but lossy (or high-delay) link.
The bad: it allows a certain type of DoS attack.
See http://web.archive.org/web/20111022225944/http://www.ibm.com/developerworks/linux/library/l-tcp-sack/index.html
by B.E.E. - 2019-03-04 17:24
Ebonicly speaking and asking;
"Couldn't one of the many geniuses commenting on this topic create a one click fix for this?" I will be patiently waiting and praying that this comes to pass. Lol.