36 Commits

Author SHA1 Message Date
Sheng Yang
630e75596e CLOUDSTACK-1653: Redundant router: Fix check_heartbeat.sh malfunctional due to delayed cron job
The interval between keepalived.ts and keepalived.ts2 should be >= 60 seconds in
normal condition, because every 10 seconds keepalived.ts would be updated, and
at least every 60 seconds, keepalived.ts would be copy to keepalived.ts2.

If the interval is less than 60 seconds, then keepalived process failed to
update keepalived.ts every 10 seconds.

Take some delay of updating into consideration, check_heartbeat.sh would use 30
seconds as a way to tell keepalived process is alive or not.
2013-03-12 18:31:31 -07:00
Sheng Yang
7602997b90 CLOUDSTACK-1521: Stop services after switch to BACKUP 2013-03-05 15:58:57 -08:00
Sheng Yang
89dee27503 CLOUDSTACK-1414: Reduce arping time to nearly 0
In the previous version it would take at least 1 seconds for arping, and it
would be big if the VR has more than 30 IPs - our biglock default timeout is 30
seconds.

Fix it by send out two arping immediately, and then sleep 1 second for router to
update arp cache.
2013-02-26 12:39:15 -08:00
Sheng Yang
0b60dda1e6 Correct the license information of services.sh 2012-12-21 15:32:36 -08:00
Sheng Yang
5eba489198 Redundant Router: Restart vpn related services when redundant router fail-over 2012-12-21 15:22:03 -08:00
Chip Childers
f212177146 CLOUDSTACK-159 - Added license header 2012-10-01 12:23:57 -04:00
Chip Childers
5f7a8a0436 CLOUDSTACK-162 - Added license header 2012-10-01 12:21:41 -04:00
Sheng Yang
0c6dcb4772 CS-15094: Fix multiply vlan of redundang router
This fix would work because:
1. When booting up the router, there is possible that no ip information have
been set for the interface(CS would do it after confirm router is up), so the
interface isn't associate with any ip, then ifconfig cannot work. We have to use
ifup, this is especially true for the first router become master.

2. After booting up phase, the ip would be associated with interfaces, then we
can use ifconfig to bring them up.
2012-09-26 16:28:33 -07:00
Sheng Yang
435e4f6868 CS-16400: Fix LB service using port 8080
Also added license header for passwd_server_ip

Ported from:

commit 1072ec7ae36911ed794c182a1146025a0e969ea9
Author: Sheng Yang <sheng.yang@citrix.com>
Date:   Wed Sep 12 11:15:33 2012 -0700

    CS-16318: Update the fix with some tweak

    1. The old fix run cloud-passwd-srvr twice because cloud-passwd-srvr is
still in the list of enabled_svcs

    2. The lock should be applied on serve_password.sh, which controlled the
accessing to the password. Applied on the MASTER/BACKUP switch is useless, two
instance of serve_password.sh would still able to access the password file at
the same time.

    3. Password service is a part of redundant router state transition process
now, so if the service failed to start, then the transition failed.

    4. Restart password service should be put before restart dnsmasq, which
would sent out DHCP offer to the user vms. If user VMs got the DHCP offer first
but failed to get password, there would be an issue.

    Reviewed-by: Anthony Xu

commit fa94da114099da357df7daa1aad3c327868393ca
Author: Jayapal Reddy <jayapalreddy.uradi@citrix.com>
Date:   Wed Sep 12 17:57:03 2012 +0530

    Bug:CS-16318 Starting password server on the both IPs in RRVM
    Reviewed-by: Abhi

Conflicts:

	patches/systemvm/debian/config/opt/cloud/bin/passwd_server
2012-09-26 16:28:33 -07:00
Chip Childers
e2730c91d9 Adding license headers and licensing details for patches folder. 2012-09-25 14:26:52 -04:00
Sheng Yang
bbc78bab5d CLOUDSTACK-159: Clean the configuration file
Now it's all written by myself.
2012-09-21 11:47:58 -07:00
Gavin Lee
39a676c496 Correct license header mainly for patches folder
Signed-off-by: Chip Childers <chip.childers@gmail.com>
I've assumed that Gavin's commit is appropriate, based
on an assumption that we will keep these files in the source
tree.  If https://issues.apache.org/jira/browse/LEGAL-146
results in a different opionion from the members, then we
will end up having to do something more drastic anyway.
2012-08-31 10:50:46 -04:00
Sheng Yang
96e7e3d1ca CS-15175: Fix public interfaces of redundant router
We need to use ifup/ifdown to bring up the interfaces, because ifconfig don't
know the ip of the interface after we modify cloud-early-config to avoid
first start up of public interface.

Reviewed-by: Edison
2012-05-31 17:58:02 -07:00
frank
2f634c0913 Switch to Apache license 2012-04-03 04:50:05 -07:00
frank
52610ffcb3 add copyright header to shell scripts 2012-01-11 18:41:53 -08:00
Sheng Yang
f98191be5c Fix domr's file lock
And add more information for domr's file lock
2012-01-10 14:25:43 -08:00
Sheng Yang
7e6bbf9b16 Discard rrouter lock
Then we can make all the actions in sequence
2011-12-30 15:00:59 -08:00
Sheng Yang
14d6c85176 bug 12727: Add arping to update the vSwitch cache
We need to broadcast all our public IP address's ARP, not only the gateway one.

status 12727: resolved fixed
2011-12-22 17:24:57 -08:00
Sheng Yang
3b2e2b079b bug 12704: Fix multiply public nics with redundant router
status 12704: resolved fixed
2011-12-21 16:01:58 -08:00
Sheng Yang
fe838c5528 bug 11233: Update switch's cache using ping
We would ping the gateway after transit to MASTER, this should speed up the
update of switch's cache.
2011-09-14 16:26:54 -07:00
Sheng Yang
ba2fc97865 bug 11351: Add monitor process for keepalived
Then when the process dead, we can know it and prevent two MASTER case happened.
2011-09-14 16:25:17 -07:00
Sheng Yang
d3b0f04877 bug 11351: Add checkrouter.sh.templ
Also modify ipassoc.sh to use checkrouter.sh
2011-09-14 16:25:03 -07:00
Sheng Yang
b007e24e59 bug 11351: Add parameters for binary file/log file 2011-09-14 16:24:50 -07:00
Sheng Yang
4bbfa2513e bug 11307: Add PRIORITY bump up script for redundant virtual routers 2011-09-14 16:18:55 -07:00
Sheng Yang
abc44ac283 bug 11266: Add lock file for every script in the systemVM
To prevent them from racy.

status 11266: resolved fixed
2011-09-09 18:27:33 -07:00
Sheng Yang
29cc88571f Redundant router script fix, also fix CheckRouterTask 2011-08-11 17:57:12 -07:00
Sheng Yang
258a1bc451 Ifdown may not bring interface down if ifup not run
Use ifconfig to bring it down
2011-08-11 15:01:02 -07:00
Sheng Yang
7807e29c30 Use ifup/ifdown for redundant router 2011-08-11 14:30:21 -07:00
Sheng Yang
5cf6feb2e5 Fix "RTNETLINK answers: No such process" when starting redundant router
The issue happened quite rare, but indeed can show.

And when the issue happen, the status of redundant router would be "Status:
FAULT".

It's due to ipassoc.sh wasn't executed before the system bring eth2 up and go to
master mode, then eth2 wasn't configured correctly. Then "ip route add default
xx" can't complete.

This commit should fixes the issue.
2011-08-10 12:06:53 -07:00
Sheng Yang
071a67dcb8 Change router to FAULT state if anything goes wrong on fail-over 2011-08-09 11:09:44 -07:00
Sheng Yang
9985df928b Try to workaround "ip route add" fail in redundant router
It's probably due to the network is not ready, so wait some time for it.
2011-08-05 16:40:57 -07:00
Sheng Yang
dc46ffb0c7 bug 9154: various fix for scripts 2011-06-22 15:30:39 -07:00
Sheng Yang
d71ed00148 bug 9154: Add more log in keepalived.log 2011-06-15 15:39:48 -07:00
Sheng Yang
819e67b189 Add file lock for keepalived scripts
They are not blocked callings.
2011-06-07 14:47:46 -07:00
Sheng Yang
2973ab5ef5 Enable multiply public ips for redundant router
Also solve duplicate mac issue.
2011-06-07 14:47:46 -07:00
Sheng Yang
62ac899091 bug 9154: Initial check in for enabling redundant virtual router
This patch enable redundant virtual routers.

1. To enable this feature, db need to be updated using follow SQL by now(we
would get a UI way later):

UPDATE network_offerings SET redundant_router=1 WHERE guest_type="Virtual" AND
system_only=0;

2. System would try to start up two routers at different hosts. But if there is
only one host in the zone, system would start up two routers on it.

3. The failover part is using keepalived, and connection tracking part is using
conntrackd. There would be one master router and one backup router. The status
of router(master or backup) can be query from the database table domain_router
now. Management server would update the status every 30s by default.

4. The routers for the same zone would use same external NIC(same ip and mac).
The script used for fail-over would ensure only one external NIC present in the
network at any time.

5. Currently management server don't got the ability to stop one of router is
both of them reported as master. The feature is in the todo list.

After two routers start up, disconnect anyone of them, the guest network
shouldn't be affected, and established connection(http, ssh, etc.) should still
works. The fail-over on gateway part should be 3~4 seconds.

Currently the patch works with KVM. Would deal with vmware and XenServer soon.
2011-06-07 14:47:45 -07:00