From: Tomislav Čohar
Date: Tue, 26 Aug 2014 22:25:12 +0000 (+0200)
Subject: Configure minimum reconnect timeouts.
X-Git-Tag: release-1.0.25~7
X-Git-Url: https://git.librecmc.org/?a=commitdiff_plain;h=ac11a79ba7d56e8c770b3dd4c503b9243c4ea4e3;p=oweals%2Ftinc.git

Configure minimum reconnect timeouts.

Enable the configuration of the minimum reconnect timeout via a new
configuration directive, "MinTimeout". This functionality is missing
from the default tinc stable distribution, where the minimum timeout is
hard-coded to 0 seconds. This patch makes it configurable.

You might ask why this is needed at all. Well, we have been using tinc
successfully for quite some time in a cross-DC setup. Tinc is used to
create a virtual network switch and to connect our distributed database
nodes into a virtual local network. Our database nodes exchange
information, synchronize and fail over across the tinc-backed network.

Every now and then, when a node has a physical networking issue and is
unreachable by some or all neighboring nodes, tinc relays traffic over
the reachable neighboring nodes and thus saves our cluster. But
sometimes, especially when BGP route changes take place, minor outages
of physical connectivity towards some nodes may cause tinc to become as
reliable as packet loss is :).

Tinc is fast: it can and does re-establish a lost connection in a jiffy,
but it cannot detect the reason the connection was lost. A re-established
connection might last only a few seconds (the ping timeout) before it is
lost again, simply because packet loss is heavy at that time. Then tinc
reconnects and the story repeats itself until the physical network
stabilizes. Packet loss on a physical link means disaster in a database
replication scenario. In such cases it is better for tinc to remain
disconnected from the unreachable or destabilized nodes for some time and
relay traffic over the reachable (unaffected) nodes than to use an
unreliable route.
This patch enables us to slow down the reconnection process and
eliminate the application-level issues we had.
---

diff --git a/src/conf.h b/src/conf.h
index 3a040c7..59c081c 100644
--- a/src/conf.h
+++ b/src/conf.h
@@ -38,6 +38,7 @@ extern avl_tree_t *config_tree;
 extern int pinginterval;
 extern int pingtimeout;
 extern int maxtimeout;
+extern int mintimeout;
 extern bool bypass_security;
 extern char *confbase;
 extern char *netname;
diff --git a/src/net_setup.c b/src/net_setup.c
index b117443..a00a321 100644
--- a/src/net_setup.c
+++ b/src/net_setup.c
@@ -553,6 +553,18 @@ static bool setup_myself(void) {
 	} else
 		maxtimeout = 900;
 
+	if(get_config_int(lookup_config(config_tree, "MinTimeout"), &mintimeout)) {
+		if(mintimeout < 0) {
+			logger(LOG_ERR, "Bogus minimum timeout!");
+			return false;
+		}
+		if(mintimeout > maxtimeout) {
+			logger(LOG_WARNING, "Minimum timeout (%d s) cannot be larger than maximum timeout (%d s). Correcting !", mintimeout, maxtimeout );
+			mintimeout=maxtimeout;
+		}
+	} else
+		mintimeout = 0;
+
 	if(get_config_int(lookup_config(config_tree, "UDPRcvBuf"), &udp_rcvbuf)) {
 		if(udp_rcvbuf <= 0) {
 			logger(LOG_ERR, "UDPRcvBuf cannot be negative!");
diff --git a/src/net_socket.c b/src/net_socket.c
index 9a67bb3..948ce01 100644
--- a/src/net_socket.c
+++ b/src/net_socket.c
@@ -40,6 +40,7 @@
 #endif
 
 int addressfamily = AF_UNSPEC;
+int mintimeout = 0;
 int maxtimeout = 900;
 int seconds_till_retry = 5;
 int udp_rcvbuf = 0;
@@ -273,6 +274,9 @@ int setup_vpn_in_socket(const sockaddr_t *sa) {
 void retry_outgoing(outgoing_t *outgoing) {
 	outgoing->timeout += 5;
 
+	if(outgoing->timeout < mintimeout)
+		outgoing->timeout = mintimeout;
+
 	if(outgoing->timeout > maxtimeout)
 		outgoing->timeout = maxtimeout;
 
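
To illustrate the effect of the retry_outgoing() change above, here is a minimal standalone sketch of the timeout arithmetic. The helper name next_retry_timeout() and its parameters are hypothetical and not part of tinc. The sketch only mirrors the increment-then-clamp logic of the hunk and leaves out the event scheduling that the real function performs. The assert encodes the invariant that setup_myself() establishes in the patch (0 <= mintimeout <= maxtimeout).

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of tinc: computes the next reconnect
 * delay the way the patched retry_outgoing() does, but as a pure
 * function so the progression is easy to follow.
 */
int next_retry_timeout(int timeout, int mintimeout, int maxtimeout) {
	/* setup_myself() rejects negative values and caps mintimeout at maxtimeout. */
	assert(mintimeout >= 0 && mintimeout <= maxtimeout);

	/* Linear back-off: every failed attempt adds 5 seconds. */
	timeout += 5;

	/* New in this patch: never wait less than MinTimeout. */
	if(timeout < mintimeout)
		timeout = mintimeout;

	/* Already present: never wait longer than MaxTimeout. */
	if(timeout > maxtimeout)
		timeout = maxtimeout;

	return timeout;
}
```

Assuming the timeout starts from 0, a configuration line such as "MinTimeout = 30" would give delays of 30, 35, 40, ... seconds, capped at MaxTimeout (900 by default). With the default MinTimeout of 0, the sequence stays at 5, 10, 15, ... seconds as before.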