188 Commits

Author SHA1 Message Date
Rohit Yadav
aa8a721c39 CLOUDSTACK-9838: Allow ingress traffic between guest VMs via snat IPs
This enables the firewall/mangle tables rules to ACCEPT instead of RETURN, which
is the same behaviour as observed in ACS 4.5. By accepting the traffic, guest
VMs will be able to communicate tcp traffic between each other over snat public
IPs.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-07-22 10:53:21 +02:00
Rajani Karuturi
24434beb42 Revert "Merge pull request #2084 from shapeblue/passwd-speedup"
This reverts commit 48f413a9825d0554cf5080b4723688d8c47afe5c, reversing
changes made to 5f35c15b6b3ff49cb49c5563abbef7cc0e21d4a7.
2017-06-26 09:58:33 +05:30
Ronald van Zantvoort
c10c3245d1 vRouter: prevent fh leakage and use buffered writes in DataBags 2017-06-07 18:11:33 +02:00
Rajani Karuturi
48f413a982 Merge pull request #2084 from shapeblue/passwd-speedup
Passwd speedup
2017-06-06 15:32:02 +05:30
Daan Hoogland
710d3bff3f rat 2017-05-08 07:42:04 +02:00
Remi Bergsma
9a21f56f8a Speedup vm start by making vm_passwd saving much faster
- do not keep passwords in databag (/etc/cloudstack/vmpasswd.json)
- process only the password we get in (vm_password.json) from mgt server
- lookup the correct passwd server instead of adding passwd to all of them

Example:
- 4 tiers and 199 VMs running
- Start vm 200 would cause new passwd from vm_password.json (1) to be merged with /etc/cloudstack/vmpasswd.json (199)
- A curl command was exected foreach password (200) foreach tier (4) resulting in 800 calls
- In fact, since passwds are never cleaned it could very well be even more as the ip address was the key in the json file so until the ip address was reused the original password would remain and be sent to passwd server every time another vm starts.
- This took ~40 seconds

Now we just figure out the right tier and only process the new password resulting in a single curl call.
- takes 0,03 seconds!
2017-05-06 21:48:25 +02:00
Daan Hoogland
0db9c980a6 ignore bogus default gateway
when a shared network is secondary the default gateway gets overwritten by a bogus one
  dnsmasq does the right thing and replaces it with its own default which is not good for us
  so check for '0.0.0.0'
2017-04-20 09:36:17 +02:00
Wei Zhou
8c69cb1c1f CLOUDSTACK-9770: fix missing ip routes in VR 2017-02-03 17:51:46 +01:00
Wei Zhou
066b374c35 CLOUDSTACK-9692: Fix password server issue in redundant VRs
The password server in RVRs has wrong parameters as the gateway of guest nics is None.
In this case, we should get the gateway from /var/cache/cloud/cmdline.
2016-12-30 09:35:00 +01:00
Murali Reddy
6749785cab CLOUDSTACK-9339 Virtual Routers don't handle Multiple Public Interfaces correctly
-when processing static nat rule, add a mangle table rule, to mark the traffic
   from the guest vm when it has associated static nat rule so that traffic gets
   routed using the route tabe of the device which has public ip associated

  -fix the case where nic_device_id is empty when ip is getting disassociated
   resulting in empty deviceid in ips.json

  -add utility methods in CsRule, and CsRoute to add 'ip rule' and 'ip route' rules respectivley

  -ensure traffic from all public interfaces are connection marked with device number, and restored
   for the reverse traffic. use the connection marked number to do device specific routing table lookup
   fill the device specific routing table with default route

  -component tests for testing multiple public interfaces of VR
2016-12-07 14:33:24 +05:30
Rohit Yadav
cc72e4da64 systemvm: Fix regression from 825935
Fixes merge conflict issue incorrectly fixed during a fwd-merge in 825935
from PR #1766

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-11-27 15:43:10 +05:30
Rohit Yadav
825935da69
Merge branch '4.8' into 4.9 2016-11-24 12:44:19 +05:30
Rohit Yadav
90ae04b791
Merge pull request #1766 from murali-reddy/vr-default-network-gateway
CLOUDSTACK-9598: wrong defaut gateway for the nic in non-default network when guest VM has nic's in more than one guest network set the tag for each host in /etc/dhcphosts.txt, and use the tag to add exception in /etc/dhcpopts.txt to prevent sending default route, dns server in case if the nic is in non-default network

this was the behaviour with edithosts.sh prior to 4.6

* pr/1766:
  CLOUDSTACK-9598: wrong defaut gateway for the nic in non-default network

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-11-24 12:39:36 +05:30
Rohit Yadav
1e88ad45a7
Merge pull request #1680 from murali-reddy/vr_csfile_search
CLOUDSTACK-9498: VR CsFile search utility methods fail when search stThere is no real use of python 're' module  in CsFile.py utility methods searchString, deleteLine. Regular string search is sufficient. These methods are used only for VPN user add/delete. Since VPN user password can have python 're' module meta characters, it interfere with search functionality.

Replacing re.search() with regular string search instead.

Change is confined to VPN add/delete users. Have run the test/integration/component/test_vpn_users.py

VPN remote access user limit tests ... === TestName: test_01_VPN_user_limit | Status : SUCCESS ===
ok
Test create VPN when L2TP port in use ... === TestName: test_02_use_vpn_port | Status : SUCCESS ===
ok
Test create NAT rule when VPN when L2TP enabled ... === TestName: test_03_enable_vpn_use_port | Status : SUCCESS ===
ok
Test add new users to existing VPN ... === TestName: test_04_add_new_users | Status : SUCCESS ===
ok
Test add duplicate user to existing VPN ... === TestName: test_05_add_duplicate_user | Status : SUCCESS ===
ok
Test as global admin, add a new VPN user to an existing VPN entry ... === TestName: test_06_add_VPN_user_global_admin | Status : SUCCESS ===
ok
Test as domain admin, add a new VPN user to an existing VPN entry ... === TestName: test_07_add_VPN_user_domain_admin | Status : SUCCESS ===
ok

* pr/1680:
  CLOUDSTACK-9498: VR CsFile search utility methods fail when search string has 're' meta chars, and causing VPN user add/deelte to fail

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-11-23 14:31:00 +05:30
Rohit Yadav
1137a79ccc
Merge branch '4.8' into 4.9 2016-11-23 13:38:11 +05:30
Murali Reddy
7ab35e6616 CLOUDSTACK-9598: wrong defaut gateway for the nic in non-default network
when guest VM has nic's in more than one guest network set the tag for
each host in /etc/dhcphosts.txt, and use the tag to add exception in
/etc/dhcpopts.txt to prevent sending default route, dns server in case if the nic is in non-default network
this was the behaviour with edithosts.sh prior to 4.6

added new test case test_router_dhcp_opts to test DHCP option file use of cloudstack
2016-11-22 16:30:42 +05:30
Murali Reddy
4c4696e5e4 CLOUDSTACK-9583: VR: In CsDhcp.py preseed both hostaname and localhost to resolve to 127.0.0.1
The VR executes a ip route flush command as part of configurations. This command performs a
DNS lookup on the VR hostname. Since the VR does not have a DNS entry, the ip command would
wait 5 seconds before timing out and executing the flush operation. This fix adds the VR
hostname to /etc/hosts mapped to 127.0.0.1 to answer the DNS lookup – reducing the
execution time.
2016-11-10 13:25:22 +05:30
Wido den Hollander
fa56d0b3e6
CLOUDSTACK-8326: Always fill UDP checksums in DHCP replies in VR
In some cases the UDP checksums in packets from DHCP servers are
incorrect. This is a problem for some DHCP clients that ignore
packets with bad checksums. This patch inserts an iptables rule
to ensure DHCP servers always send packets with correct checksums.

Due to this bug DHCP offers are sometimes not accepted by Instances.

The end-result without this fix is no connectivity for the Instance
due to the lack of a IPv4 address.

This is also commited in OpenStack:
- https://github.com/projectcalico/felix/issues/40
- https://review.openstack.org/148718
- https://bugzilla.redhat.com/show_bug.cgi?id=910619

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-11-08 18:41:37 +01:00
Murali Reddy
9cc06a8fc8 CLOUDSTACK-9498: VR CsFile search utility methods fail when search string has
're' meta chars, and causing VPN user add/deelte to fail

    -there is no real use of python 're' in CsFile.py utility methods searchString, deleteLine
    Replacing with regular string search instead.

    -modifying the smoke test for VPN user add/delete to have all permissable chars
2016-10-28 17:45:15 +05:30
Rohit Yadav
14504dc7e3 CLOUDSTACK-6432: Prevent DNS reflection attacks
DNS on VR should not be publically accessible as it may be prone to DNS
amplification/reflection attacks. This fixes the issue by only allowing VR
DNS (port 53) to be accessible from guest network cidr, as per the fix in:
https://issues.apache.org/jira/browse/CLOUDSTACK-6432

- Only allows guest network cidrs to query VR DNS on port 53.
- Includes marvin smoke test that checks the VR DNS accessibility checks from
  guest and non-guest network.
- Fixes Marvin sshClient to avoid using ssh agent when password is provided,
  previous some environments may have seen 'No existing session' exception without
  this fix.
- Adds a new dnspython dependency that is used to perform dns resolutions in the
  tests.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-08-30 22:39:33 +05:30
Patrick Dube
9ab676206a Added missing rules on router config, fixed ordering of multiple rules, removed duplicate rules, added fix for network stats, added a check for b64 decoding (to pad incorrect b64). Also added a catch exception to be logged on the configure main. 2016-07-22 15:32:20 -04:00
Patrick Dube
6dd6ef0c9a Added fix for adding/editing Network ACL rule ordering 2016-07-11 15:12:41 -04:00
Ronald van Zantvoort
91a8faac31 SysVM various fixes to previous refactorings
* make CORS include a regular glob-matched one
* fix NameVirtualHost in CsApp.py as well
* even moar cleanups
2016-06-07 13:03:10 +02:00
Ronald van Zantvoort
e32cd1303a VR: consistent SSL setup, vhost is not an example, but a template 2016-06-07 13:03:10 +02:00
Ronald van Zantvoort
f379df4bc2 VR CsConfig: reintroduce old get_dns() behaviour for redundant non-VPC's 2016-06-07 13:03:10 +02:00
Ronald van Zantvoort
d14a484374 VR CsAddress fixes:
* cleanup imports,
* fix to_str(),
* improve & fix service post_config logic
* don't arpPing when there's no gateway
2016-06-07 13:03:09 +02:00
Ronald van Zantvoort
6055ed6ed6 VR CsApp: Expose config to classes, move vhost confs to proper location, allow for multiple IP's per intf, sanitize servername, don't open port 53 if no DNS is foreseen 2016-06-07 13:03:09 +02:00
Ronald van Zantvoort
748bf43530 VR CsConfig: Add is_router(), is_dns(), has_dns(), has_metadata(), use_extdns(), fix get_dns() with use_extdns() 2016-06-07 13:03:09 +02:00
Ronald van Zantvoort
875379042e VR CsDhcp: allow multiple ranges & finite lease time (fixes CLOUDSTACK-8303) 2016-06-07 13:03:09 +02:00
Ronald van Zantvoort
2790d7a69b VR CsGuestNetwork obey useextdns 2016-06-07 13:03:09 +02:00
Will Stevens
d9429f6add Merge pull request #1471 from remibergsma/47_lower_interface_wait
Lower the time we wait for interfaces to appearWaiting for interfaces is tricky. They might never appear.. for example when we have entries in `/etc/cloudstack/ips.json` that haven't been plugged yet. Waiting this long makes everything horribly slow (every vm, interface, static route, etc, etc, will hit this wait, for every device). We've seen CloudStack send an `ip_assoc.json` command for `eth1` public nic only and then the router goes crazy waiting for all other interfaces that were there before reboot and aren't there. If only the router would return to the mgt server a success of `eth1`, it would get the command for `eth2` etc etc. Obviously, a destroy works much faster because no state services, so no knowledge of previous devices so no waits :-)

After a stop/start the router has state in `/etc/cloudstack/ips.json` and every commands waits. Eventually hitting the hardcoded 120 sec timeout.

* pr/1471:
  lower the time we wait for interfaces to appear

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-26 15:49:50 -04:00
Will Stevens
5ccebf0f2b Merge pull request #1514 from dsclose/CLOUDSTACK-6975
CLOUDSTACK-6975: Prevent dnsmasq from starting on backup redundant RvRRebase of PR #1509 against the 4.7 branch as requested by @swill

One LGTM from @ustcweizhou carried from previous PR. Previous PR will be closed.

Description from PR #1509:

CLOUDSTACK-6975 refers to service monitoring bringing up dnsmasq but this is no-longer accurate, as service monitoring is not active on the post-4.6 routers. These routers still suffer an essentially identical issue, however, because "dnsmasq needs to be restarted each time configure.py is called in order to avoid lease problems." As such, dnsmasq is still running on backup RvRs, causing the issues described in CLOUDSTACK-6975.

This PR is based on a patch submitted by @ustcweizhou. The code now checks the redundant state of the router before restarting dnsmasq.

RvR networks without this patch have dnsmasq running on both master and backup routers. RvR networks with this patch have dnsmasq running on only the master router.

* pr/1514:
  CLOUDSTACK-6975: Prevent dnsmasq from starting on backup redundant RvR.

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-25 22:52:55 -04:00
Remi Bergsma
74f60df828 Revert "Merge pull request #1482 from remibergsma/iptables-fix"
Seems to have a license issue so reverting for now.

This reverts commit 9a20ab8bcbbd39aa012a0ec5a65e66bcc737ee0e, reversing
changes made to 7a0b37a29a8be14011427dcf61bf3ea86e47dbf4.
2016-05-19 11:04:46 +02:00
Will Stevens
ebc70a51e2 Merge pull request #1486 from remibergsma/reimplement-vrrp-setting-47
Reimplement router.redundant.vrrp.interval settingGlobal setting `router.redundant.vrrp.interval` is not used any more and it is now set to a hardcoded 1.

This results in a failover from master->backup when the backup doesn't hear from the master in ~3.6sec. This is a bit too tight, as we've seen failovers during live migrations. We could reproduce it in about half of the cases. Setting this to setting to 2 (tested it by hardcoding it in the systemvms) gives twice as much time and we didn't see issues any more. Instead of updating the hardcoded setting from 1 to 2, I reimplemented the global setting by sending it to the router with the cmd_line, as the non-VPC router also does.

Background:
Why is the maximum failover time in the example 3.6 seconds? This comes from the advertisement interval and the skew time. The default advertisement interval is 1 second (configurable in keepalived.conf). The skew time helps to keep everyone from trying to transition at once. It is a number between 0 and 1, based on the formula (256 - priority) / 256

As defined in the RFC, the backup must receive an advertisement from the master every (3 * advert_int) + skew_time seconds. If it doesn't hear anything from the master, it takes over. With a backup router priority of 100 (as in the example), the failover will happen at most 3.6 seconds after the master goes down.

Source: http://www.hollenback.net/KeepalivedForNetworkReliability

* pr/1486:
  Configure rVPC for router.redundant.vrrp.interval advert_int setting
  Have rVPCs use the router.redundant.vrrp.interval setting

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-18 15:52:38 -04:00
Will Stevens
9a20ab8bcb Merge pull request #1482 from remibergsma/iptables-fix
Restore iptables at once using iptables-restore instead of calling iptables numerous timesThis makes handling the firewall rules about 50-60 times faster because it is generated in memory and then loaded once. It's work by @borisroman see PR #1400. Reopened it here because I think this is a great improvement.

* pr/1482:
  Resolve conflict as forceencap is already in master
  Split the cidr lists so we won't hit the iptables-resture limits
  Check the existence of 'forceencap' parameter before use
  Do not load previous firewall rules as we replace everyhing anyway
  Wait for dnsmasq to finish restart
  Remove duplicate spaces, and thus duplicate rules.
  Restore iptables at once using iptables-restore instead of calling iptables numerous times
  Add iptables copnversion script.

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-18 15:50:20 -04:00
Remi Bergsma
9c0eee4387 Configure rVPC for router.redundant.vrrp.interval advert_int setting 2016-05-13 14:37:04 +02:00
Will Stevens
bbb2dd034e Merge pull request #1536 from ntavares/useextdns_rvmvip47
Honour GS use_ext_dns and redundant VR VIPThis patch addresses two issues:

On redundant VR setups, the primary resolver being handed out to instances is the guest_ip (primary IP for the VR). This might lead to problems upon failover, at least while the DHCP lease doesn't update (because the primary resolver will be checked first until times out, however it'll be gone upon failover).

If Global Setting use_ext_dns is true, we don't want the VR to be the primary resolver at all.

* pr/1536:
  This patch addresses two issues:

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-12 18:23:01 -04:00
Will Stevens
3fab75772f Merge pull request #1474 from remibergsma/47_private_gw_initial_config
Handle private gateways more reliablyWhen initialising a VPC router we need to know which IP/device corresponds to a private gateway. This is to solve a problem when stop/starting a VPC router (which gets the private gateway config as a guest network and as a result breaks the functionality). You read it right, the private gateway is sent as type=guest after reboot and type=public initially.

Before this change, you could add a private gw to a running router but you couldn't restart it (it would mix up the tiers). Now the private gateway is detected properly and it works just fine.

Booting without private gateway:
```
root@r-167-VM:~# cat /etc/cloudstack/cmdline.json
{
    "config": {
        "baremetalnotificationapikey": "V2l1u3wKJVan01h8kq63-5Y5Ia3VLEW1v_Z6i-31QIRJXlt5vkqaqf6DVcdK0jP3u79SW6X9pqJSLSwQP2c2Rw",
        "baremetalnotificationsecuritykey": "OXI16srCrxFBi-xOtEwcYqwLlMfSFTlTg66YHtXBBqR7HNN1us3HP5zWOKxfVmz4a3C1kUNLPrUH13gNmZlu4w",
        "disable_rp_filter": "true",
        "dns1": "8.8.8.8",
        "domain": "cs2cloud",
        "eth0ip": "169.254.0.42",
        "eth0mask": "255.255.0.0",
        "host": "192.168.22.61",
        "name": "r-167-VM",
        "port": "8080",
        "privategateway": "None",
        "redundant_router": "false",
        "template": "domP",
        "type": "vpcrouter",
        "vpccidr": "10.0.0.0/24"
    },
    "id": "cmdline"
```

Booting with private gateway:
```
root@r-167-VM:~# cat /etc/cloudstack/cmdline.json
{
    "config": {
        "baremetalnotificationapikey": "V2l1u3wKJVan01h8kq63-5Y5Ia3VLEW1v_Z6i-31QIRJXlt5vkqaqf6DVcdK0jP3u79SW6X9pqJSLSwQP2c2Rw",
        "baremetalnotificationsecuritykey": "OXI16srCrxFBi-xOtEwcYqwLlMfSFTlTg66YHtXBBqR7HNN1us3HP5zWOKxfVmz4a3C1kUNLPrUH13gNmZlu4w",
        "disable_rp_filter": "true",
        "dns1": "8.8.8.8",
        "domain": "cs2cloud",
        "eth0ip": "169.254.2.227",
        "eth0mask": "255.255.0.0",
        "host": "192.168.22.61",
        "name": "r-167-VM",
        "port": "8080",
        "privategateway": "10.201.10.1",
        "redundant_router": "false",
        "template": "domP",
        "type": "vpcrouter",
        "vpccidr": "10.0.0.0/24"
    },
    "id": "cmdline"
```

And:
```
cat cmdline
vpccidr=10.0.0.0/24 domain=cs2cloud dns1=8.8.8.8 privategateway=10.201.10.1 template=domP name=r-167-VM eth0ip=169.254.2.227 eth0mask=255.255.0.0 type=vpcrouter disable_rp_filter=true baremetalnotificationsecuritykey=OXI16srCrxFBi-xOtEwcYqwLlMfSFTlTg66YHtXBBqR7HNN1us3HP5zWOKxfVmz4a3C1kUNLPrUH13gNmZlu4w baremetalnotificationapikey=V2l1u3wKJVan01h8kq63-5Y5Ia3VLEW1v_Z6i-31QIRJXlt5vkqaqf6DVcdK0jP3u79SW6X9pqJSLSwQP2c2Rw host=192.168.22.61 port=8080
```

Logs:
```
2016-02-24 20:08:45,723 DEBUG [c.c.n.r.VpcVirtualNetworkApplianceManagerImpl] (Work-Job-Executor-4:ctx-458d4c52 job-1402/job-1403 ctx-d5355fca) (logid:5772906c) Set privategateway field in cmd_line.json to 10.201.10.1
```

* pr/1474:
  Handle private gateways more reliably
  Add private gateway IP to router initialization config

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-12 11:02:55 -04:00
Will Stevens
919660d093 Merge pull request #1472 from remibergsma/47_fix_static_router_master_change
Apply static routes on change to master stateRefactored static routes for private gateways so they also get loaded when the router switches to master state. Otherwise they're lost and connections drop after fail over.

* pr/1472:
  apply static routes on change to master state

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-12 11:01:55 -04:00
dean.close
38b3bdd488 CLOUDSTACK-6975: Prevent dnsmasq from starting on backup redundant RvR. 2016-05-09 11:34:47 +01:00
Nuno Tavares
c269097a27 This patch addresses two issues:
On redundant VR setups, the primary resolver being handed out to instances is the guest_ip (primary IP for the VR). This might lead to problems upon failover, at least while the DHCP lease doesn't update (because the primary resolver will be checked first until times out, however it'll be gone upon failover).

If Global Setting use_ext_dns is true, we don't want the VR to be the primary resolver at all.
2016-05-08 22:47:55 +02:00
Remi Bergsma
f4f9b3ab4e Handle private gateways more reliably 2016-04-10 20:06:44 +02:00
Wilder Rodrigues
78bbd498e7 CLOUDSTACK-9287 - Fix RVR public interface 2016-04-09 21:14:41 +02:00
Wilder Rodrigues
c41edc1fe6 CLOUDSTACK-9287 - Refactor the interface state configuration
- This also refactors the CsAddress in order to offer better readability in a couple of methods.
2016-04-09 21:14:25 +02:00
Remi Bergsma
6a767732f9 CLOUDSTACK-9287 - Bring up the private gw interface on state change to master 2016-04-09 21:14:10 +02:00
Remi Bergsma
057b54aa3e CLOUDSTACK-9287 - Make sure private gw interface is not used for default gw 2016-04-09 21:13:47 +02:00
Wilder Rodrigues
d93b008deb CLOUDSTACK-9287 - Put private gateway interface down on backup router 2016-04-09 21:13:35 +02:00
Remi Bergsma
b9feb39e17 apply static routes on change to master state 2016-04-07 20:57:58 +02:00
Remi Bergsma
3636ad1114 lower the time we wait for interfaces to appear
They might never appear.. for example when we have entries in
/etc/cloudstack/ips.json that haven't been plugged yet. Waiting
this long makes everything horribly slow (every vm, interface,
static route, etc, etc, will hit this wait, for every device).
2016-04-07 20:52:33 +02:00
Boris Schrijver
eb9706b655 Wait for dnsmasq to finish restart 2016-02-05 12:02:58 +01:00