CLOUDSTACK-9598: wrong defaut gateway for the nic in non-default network when guest VM has nic's in more than one guest network set the tag for each host in /etc/dhcphosts.txt, and use the tag to add exception in /etc/dhcpopts.txt to prevent sending default route, dns server in case if the nic is in non-default network
this was the behaviour with edithosts.sh prior to 4.6
* pr/1766:
CLOUDSTACK-9598: wrong defaut gateway for the nic in non-default network
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9498: VR CsFile search utility methods fail when search stThere is no real use of python 're' module in CsFile.py utility methods searchString, deleteLine. Regular string search is sufficient. These methods are used only for VPN user add/delete. Since VPN user password can have python 're' module meta characters, it interfere with search functionality.
Replacing re.search() with regular string search instead.
Change is confined to VPN add/delete users. Have run the test/integration/component/test_vpn_users.py
VPN remote access user limit tests ... === TestName: test_01_VPN_user_limit | Status : SUCCESS ===
ok
Test create VPN when L2TP port in use ... === TestName: test_02_use_vpn_port | Status : SUCCESS ===
ok
Test create NAT rule when VPN when L2TP enabled ... === TestName: test_03_enable_vpn_use_port | Status : SUCCESS ===
ok
Test add new users to existing VPN ... === TestName: test_04_add_new_users | Status : SUCCESS ===
ok
Test add duplicate user to existing VPN ... === TestName: test_05_add_duplicate_user | Status : SUCCESS ===
ok
Test as global admin, add a new VPN user to an existing VPN entry ... === TestName: test_06_add_VPN_user_global_admin | Status : SUCCESS ===
ok
Test as domain admin, add a new VPN user to an existing VPN entry ... === TestName: test_07_add_VPN_user_domain_admin | Status : SUCCESS ===
ok
* pr/1680:
CLOUDSTACK-9498: VR CsFile search utility methods fail when search string has 're' meta chars, and causing VPN user add/deelte to fail
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
when guest VM has nic's in more than one guest network set the tag for
each host in /etc/dhcphosts.txt, and use the tag to add exception in
/etc/dhcpopts.txt to prevent sending default route, dns server in case if the nic is in non-default network
this was the behaviour with edithosts.sh prior to 4.6
added new test case test_router_dhcp_opts to test DHCP option file use of cloudstack
The VR executes a ip route flush command as part of configurations. This command performs a
DNS lookup on the VR hostname. Since the VR does not have a DNS entry, the ip command would
wait 5 seconds before timing out and executing the flush operation. This fix adds the VR
hostname to /etc/hosts mapped to 127.0.0.1 to answer the DNS lookup – reducing the
execution time.
CLOUDSTACK-8326: Always fill UDP checksums in DHCP replies in VRIn some cases the UDP checksums in packets from DHCP servers are
incorrect. This is a problem for some DHCP clients that ignore
packets with bad checksums. This patch inserts an iptables rule
to ensure DHCP servers always send packets with correct checksums.
Due to this bug DHCP offers are sometimes not accepted by Instances.
The end-result without this fix is no connectivity for the Instance
due to the lack of a IPv4 address.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* pr/1743:
CLOUDSTACK-8326: Always fill UDP checksums in DHCP replies in VR
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In some cases the UDP checksums in packets from DHCP servers are
incorrect. This is a problem for some DHCP clients that ignore
packets with bad checksums. This patch inserts an iptables rule
to ensure DHCP servers always send packets with correct checksums.
Due to this bug DHCP offers are sometimes not accepted by Instances.
The end-result without this fix is no connectivity for the Instance
due to the lack of a IPv4 address.
This is also commited in OpenStack:
- https://github.com/projectcalico/felix/issues/40
- https://review.openstack.org/148718
- https://bugzilla.redhat.com/show_bug.cgi?id=910619
Signed-off-by: Wido den Hollander <wido@widodh.nl>
CLOUDSTACK-9183: bash: /opt/cloud/bin/getRouterAlerts.sh: No such file or directory
* pr/1744:
CLOUDSTACK-9183: bash: /opt/cloud/bin/getRouterAlerts.sh: No such file or directory
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
're' meta chars, and causing VPN user add/deelte to fail
-there is no real use of python 're' in CsFile.py utility methods searchString, deleteLine
Replacing with regular string search instead.
-modifying the smoke test for VPN user add/delete to have all permissable chars
resulting in internal LB vm not come up
parsing cmd_line to create 'ips' data bag, never handled internal lb vm, but still
worked due to another bug. support for internal lb vm is added with this fix
CLOUDSTACK-9480, CLOUDSTACK-9495 fix egress rule incorrect behaviorWhen 'default egress policy' is set to 'allow' in the network offering, any egress rule that is added will 'deny' the traffic overriding the default behaviour.
Conversely, when 'default egress policy' is set to 'deny' in the network offering, any egress rule that is added will 'allow' the traffic overriding the default behaviour.
While this works for 'tcp', 'udp' as expected, for 'icmp' protocol its always set to ALLOW. This patch keeps all protocols behaviour consistent.
Results of running test/integration/component/test_egress_fw_rules.py. With out the patch test_02_egress_fr2 test was failing. This patch fixes the test_02_egress_fr2 scenario.
-----------------------------------------------------------------------------------------------------
Test By-default the communication from guest n/w to public n/w is NOT allowed. ... === TestName: test_01_1_egress_fr1 | Status : SUCCESS ===
ok
Test By-default the communication from guest n/w to public n/w is allowed. ... === TestName: test_01_egress_fr1 | Status : SUCCESS ===
ok
Test Allow Communication using Egress rule with CIDR + Port Range + Protocol. ... === TestName: test_02_1_egress_fr2 | Status : SUCCESS ===
ok
Test Allow Communication using Egress rule with CIDR + Port Range + Protocol. ... === TestName: test_02_egress_fr2 | Status : SUCCESS ===
ok
Test Communication blocked with network that is other than specified ... === TestName: test_03_1_egress_fr3 | Status : SUCCESS ===
ok
Test Communication blocked with network that is other than specified ... === TestName: test_03_egress_fr3 | Status : SUCCESS ===
ok
Test Create Egress rule and check the Firewall_Rules DB table ... === TestName: test_04_1_egress_fr4 | Status : SUCCESS ===
ok
Test Create Egress rule and check the Firewall_Rules DB table ... === TestName: test_04_egress_fr4 | Status : SUCCESS ===
ok
Test Create Egress rule and check the IP tables ... SKIP: Skip
Test Create Egress rule and check the IP tables ... SKIP: Skip
Test Create Egress rule without CIDR ... === TestName: test_06_1_egress_fr6 | Status : SUCCESS ===
ok
Test Create Egress rule without CIDR ... === TestName: test_06_egress_fr6 | Status : SUCCESS ===
ok
Test Create Egress rule without End Port ... === TestName: test_07_1_egress_fr7 | Status : EXCEPTION ===
ERROR
Test Create Egress rule without End Port ... === TestName: test_07_egress_fr7 | Status : SUCCESS ===
ok
Test Port Forwarding and Egress Conflict ... SKIP: Skip
Test Port Forwarding and Egress Conflict ... SKIP: Skip
Test Delete Egress rule ... === TestName: test_09_1_egress_fr9 | Status : SUCCESS ===
ok
Test Delete Egress rule ... === TestName: test_09_egress_fr9 | Status : SUCCESS ===
ok
Test Invalid CIDR and Invalid Port ranges ... === TestName: test_10_1_egress_fr10 | Status : SUCCESS ===
ok
Test Invalid CIDR and Invalid Port ranges ... === TestName: test_10_egress_fr10 | Status : SUCCESS ===
ok
Test Regression on Firewall + PF + LB + SNAT ... === TestName: test_11_1_egress_fr11 | Status : SUCCESS ===
ok
Test Regression on Firewall + PF + LB + SNAT ... === TestName: test_11_egress_fr11 | Status : SUCCESS ===
ok
Test Reboot Router ... === TestName: test_12_1_egress_fr12 | Status : SUCCESS ===
ok
Test Reboot Router ... === TestName: test_12_egress_fr12 | Status : EXCEPTION ===
ERROR
Test Redundant Router : Master failover ... === TestName: test_13_1_egress_fr13 | Status : SUCCESS ===
ok
Test Redundant Router : Master failover ... === TestName: test_13_egress_fr13 | Status : SUCCESS ===
ok
-----------------------------------------------------------------------------------------------------
* pr/1666:
fix egress rule incorrect behavior
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
CLOUDSTACK-9480: Egress Firewall: Incorrect use of Allow/Deny for ICMP
fix ensures, ICMP, TCP, UDP are handled similalry w.r.t egress rule action
CLOUDSTACK-9495: Egress rules functionalty broken when protocol=all specified
when protocol=all specified, CIDR was ignored. Fix ensures if CIDR is specified
its always used in configuring iptable rules
2 new test cased to test /32 CIDR
DNS on VR should not be publically accessible as it may be prone to DNS
amplification/reflection attacks. This fixes the issue by only allowing VR
DNS (port 53) to be accessible from guest network cidr, as per the fix in:
https://issues.apache.org/jira/browse/CLOUDSTACK-6432
- Only allows guest network cidrs to query VR DNS on port 53.
- Includes marvin smoke test that checks the VR DNS accessibility checks from
guest and non-guest network.
- Fixes Marvin sshClient to avoid using ssh agent when password is provided,
previous some environments may have seen 'No existing session' exception without
this fix.
- Adds a new dnspython dependency that is used to perform dns resolutions in the
tests.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Ensure that FW_EGRESS_RULE chain exists after upgrading the router
- Flush allow all egress rule on 0.0.0.0/0, if such a rule exists in the config
it will be added later (CLOUDSTACK-9437)
[CLOUDSTACK-9430] Added fix for adding/editing Network ACL rule orderingBUG: https://issues.apache.org/jira/browse/CLOUDSTACK-9430
The issue occurred because all of the ACL rules get inserted before the old ones. Then, the cleanup deletes the duplicate rows, and leaves any new rule in front of the old ones.
Here is an example with a simplified iptables view for ACL
Ex: adding a rule 4
before add:
1,2,3
during add:
1',2',3',4',1,2,3
after add:
4',1,2,3
After fix:
before add:
1,2,3
during add:
1,2,3,1',2',3',4'
after add:
1',2',3',4'
* pr/1609:
Added fix for adding/editing Network ACL rule ordering
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9342: Site to Site VPN PFS not being set correctlyBug in code set PFS to the same value (yes/no) as DPD.
file.addeq(" pfs=%s" % CsHelper.bool_to_yn(obj['dpd']))
* pr/1480:
CLOUDSTACK-9342: Site to Site VPN PFS not being set correctly
Signed-off-by: Will Stevens <williamstevens@gmail.com>
* Take setup from vhost.template rather than default(-ssl)
* should move into Python CS code as well
* Move CORS setup to separate conf
* Modify vhost template to Optionally include the cors file
* Add NameVirtualHost to vhost template for feature parity with ports.conf
* Take setup from vhost.template rather than default(-ssl)
[CLOUDSTACK-9296] Start ipsec for client VPNThis fix starts the IPSEC daemon when enabling client side vpn
* pr/1423:
[CLOUDSTACK-9296] Start ipsec for client VPN
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Lower the time we wait for interfaces to appearWaiting for interfaces is tricky. They might never appear.. for example when we have entries in `/etc/cloudstack/ips.json` that haven't been plugged yet. Waiting this long makes everything horribly slow (every vm, interface, static route, etc, etc, will hit this wait, for every device). We've seen CloudStack send an `ip_assoc.json` command for `eth1` public nic only and then the router goes crazy waiting for all other interfaces that were there before reboot and aren't there. If only the router would return to the mgt server a success of `eth1`, it would get the command for `eth2` etc etc. Obviously, a destroy works much faster because no state services, so no knowledge of previous devices so no waits :-)
After a stop/start the router has state in `/etc/cloudstack/ips.json` and every commands waits. Eventually hitting the hardcoded 120 sec timeout.
* pr/1471:
lower the time we wait for interfaces to appear
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-6975: Prevent dnsmasq from starting on backup redundant RvRRebase of PR #1509 against the 4.7 branch as requested by @swill
One LGTM from @ustcweizhou carried from previous PR. Previous PR will be closed.
Description from PR #1509:
CLOUDSTACK-6975 refers to service monitoring bringing up dnsmasq but this is no-longer accurate, as service monitoring is not active on the post-4.6 routers. These routers still suffer an essentially identical issue, however, because "dnsmasq needs to be restarted each time configure.py is called in order to avoid lease problems." As such, dnsmasq is still running on backup RvRs, causing the issues described in CLOUDSTACK-6975.
This PR is based on a patch submitted by @ustcweizhou. The code now checks the redundant state of the router before restarting dnsmasq.
RvR networks without this patch have dnsmasq running on both master and backup routers. RvR networks with this patch have dnsmasq running on only the master router.
* pr/1514:
CLOUDSTACK-6975: Prevent dnsmasq from starting on backup redundant RvR.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
SystemVM cleanupsfrom the logrotate docs
> size - With this, the log file is rotated when the specified size is reached. Size may be specified in bytes (default), kilobytes (sizek), or megabytes (sizem).
> Note: If size and time interval options are specified at same time, only size option take effect. it causes log files to be rotated without regard for the last rotation time. If both log size and timestamp of a log file need to be considered by logrotate, the minsize option should be used. logrotate will rotate log file when they grow bigger than minsize, but not before the additionally specified time interval.
* pr/1414:
systemvm, logrotate: remove daily explicitly as it is ignored
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Seems to have a license issue so reverting for now.
This reverts commit 9a20ab8bcbbd39aa012a0ec5a65e66bcc737ee0e, reversing
changes made to 7a0b37a29a8be14011427dcf61bf3ea86e47dbf4.
Reimplement router.redundant.vrrp.interval settingGlobal setting `router.redundant.vrrp.interval` is not used any more and it is now set to a hardcoded 1.
This results in a failover from master->backup when the backup doesn't hear from the master in ~3.6sec. This is a bit too tight, as we've seen failovers during live migrations. We could reproduce it in about half of the cases. Setting this to setting to 2 (tested it by hardcoding it in the systemvms) gives twice as much time and we didn't see issues any more. Instead of updating the hardcoded setting from 1 to 2, I reimplemented the global setting by sending it to the router with the cmd_line, as the non-VPC router also does.
Background:
Why is the maximum failover time in the example 3.6 seconds? This comes from the advertisement interval and the skew time. The default advertisement interval is 1 second (configurable in keepalived.conf). The skew time helps to keep everyone from trying to transition at once. It is a number between 0 and 1, based on the formula (256 - priority) / 256
As defined in the RFC, the backup must receive an advertisement from the master every (3 * advert_int) + skew_time seconds. If it doesn't hear anything from the master, it takes over. With a backup router priority of 100 (as in the example), the failover will happen at most 3.6 seconds after the master goes down.
Source: http://www.hollenback.net/KeepalivedForNetworkReliability
* pr/1486:
Configure rVPC for router.redundant.vrrp.interval advert_int setting
Have rVPCs use the router.redundant.vrrp.interval setting
Signed-off-by: Will Stevens <williamstevens@gmail.com>