mirror of
https://github.com/apache/cloudstack.git
synced 2025-10-26 08:42:29 +01:00
Merge pull request #1486 from remibergsma/reimplement-vrrp-setting-47
Reimplement router.redundant.vrrp.interval setting

Global setting `router.redundant.vrrp.interval` is not used any more; the interval is now hardcoded to 1. This results in a failover from master->backup when the backup doesn't hear from the master for ~3.6 sec. This is a bit too tight, as we've seen failovers during live migrations. We could reproduce it in about half of the cases.

Setting this to 2 (tested by hardcoding it in the systemvms) gives roughly twice as much time, and we didn't see issues any more.

Instead of updating the hardcoded setting from 1 to 2, I reimplemented the global setting by sending it to the router with the cmd_line, as the non-VPC router also does.

Background:
Why is the maximum failover time in the example 3.6 seconds? This comes from the advertisement interval and the skew time. The default advertisement interval is 1 second (configurable in keepalived.conf). The skew time helps to keep everyone from trying to transition at once. It is a number between 0 and 1, based on the formula (256 - priority) / 256.

As defined in the RFC, the backup must receive an advertisement from the master every (3 * advert_int) + skew_time seconds. If it doesn't hear anything from the master, it takes over. With a backup router priority of 100 (as in the example), the failover will happen at most 3.6 seconds after the master goes down.

Source: http://www.hollenback.net/KeepalivedForNetworkReliability

* pr/1486:
  Configure rVPC for router.redundant.vrrp.interval advert_int setting
  Have rVPCs use the router.redundant.vrrp.interval setting

Signed-off-by: Will Stevens <williamstevens@gmail.com>
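The failover arithmetic in the Background section can be checked with a few lines of Python. This is only a worked example of the formula quoted above; the function names `skew_time` and `master_down_interval` are made up for illustration and are not part of CloudStack or keepalived.

```python
def skew_time(priority):
    """Skew time in seconds: (256 - priority) / 256, as quoted in the commit message."""
    return (256 - priority) / 256


def master_down_interval(advert_int, priority):
    """Longest time the backup waits for an advertisement: (3 * advert_int) + skew_time."""
    return 3 * advert_int + skew_time(priority)


# Backup router priority of 100, as in the example from the commit message.
for advert_int in (1, 2):
    seconds = master_down_interval(advert_int, 100)
    print("advert_int=%d -> failover after at most %.2f s" % (advert_int, seconds))
```

With priority 100 the skew time is about 0.61 s, so the interval of 1 gives about 3.61 s and the interval of 2 gives about 6.61 s.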
This commit is contained in:
commit
ebc70a51e2
@@ -1598,6 +1598,9 @@ Configurable, StateListener<VirtualMachine.State, VirtualMachine.Event, VirtualM
            if (isRedundant) {
                buf.append(" redundant_router=1");

                final int advertInt = NumbersUtil.parseInt(_configDao.getValue(Config.RedundantRouterVrrpInterval.key()), 1);
                buf.append(" advert_int=").append(advertInt);

                final Long vpcId = router.getVpcId();
                final List<DomainRouterVO> routers;
                if (vpcId != null) {

@@ -154,3 +154,7 @@ class CsCmdLine(CsDataBag):
            return self.idata()['useextdns']
        return False

    def get_advert_int(self):
        if 'advert_int' in self.idata():
            return self.idata()['advert_int']
        return 1

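To illustrate how the `advert_int=` parameter appended on the management-server side could reach `get_advert_int()`, here is a minimal standalone sketch. It assumes the boot arguments arrive as a space-separated `key=value` string; `parse_cmdline` is a hypothetical helper and a plain dict stands in for `self.idata()`, so this is not how `CsDataBag` actually loads its data.

```python
def parse_cmdline(cmdline):
    """Hypothetical helper: split 'key=value' tokens into a dict, ignoring bare tokens."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
    return params


def get_advert_int(params):
    """Same fallback logic as the method in the hunk above, with a dict in place of idata()."""
    if "advert_int" in params:
        return params["advert_int"]
    return 1


booted = parse_cmdline(" redundant_router=1 advert_int=2")
print(get_advert_int(booted))                                # value sent by the management server
print(get_advert_int(parse_cmdline(" redundant_router=1")))  # falls back to the default of 1
```

In this sketch the parsed value is a string while the fallback is an integer; both work with the `%s` formatting used when the value is written into keepalived.conf.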
@@ -113,6 +113,7 @@ class CsFile:
        self.new_config[sind:eind] = content

    def greplace(self, search, replace):
        logging.debug("Searching for %s and replacing with %s" % (search, replace))
        self.new_config = [w.replace(search, replace) for w in self.new_config]

    def search(self, search, replace):

@@ -138,6 +138,9 @@ class CsRedundant(object):
            " router_id ", " router_id %s" % self.cl.get_name())
        keepalived_conf.search(
            " interface ", " interface %s" % guest.get_device())
        keepalived_conf.search(
            " advert_int ", " advert_int %s" % self.cl.get_advert_int())

        keepalived_conf.greplace("[RROUTER_BIN_PATH]", self.CS_ROUTER_DIR)
        keepalived_conf.section("authentication {", "}", [
            " auth_type AH \n", " auth_pass %s\n" % self.cl.get_router_password()])
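The effect of the new `keepalived_conf.search(" advert_int ", ...)` call can be sketched as a line-oriented replacement on the keepalived.conf template. The body of `CsFile.search` is not shown in this diff, so the behavior below (replace any non-comment line containing the search string, append the line if nothing matched) is an assumption, and `replace_matching_line` and the template contents are hypothetical.

```python
def replace_matching_line(lines, needle, new_line):
    """Assumed behavior: swap each non-comment line containing `needle` for `new_line`."""
    found = False
    result = []
    for line in lines:
        if needle in line and not line.lstrip().startswith("#"):
            result.append(new_line + "\n")
            found = True
        else:
            result.append(line)
    if not found:
        # Nothing matched, so add the setting at the end.
        result.append(new_line + "\n")
    return result


# Illustrative template fragment, not the file shipped in the systemvm.
template = [
    "vrrp_instance inside_network {\n",
    "    state BACKUP\n",
    "    advert_int 1\n",
    "}\n",
]
updated = replace_matching_line(template, " advert_int ", "    advert_int %s" % 2)
print("".join(updated))
```

Matching on the keyword with surrounding spaces, as the diff does, keeps the replacement from touching lines that merely contain `advert_int` as part of a longer word.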