VMware router will be rebooted based on #2794, per current config
the VRs on reboot will go through fsck checks slowing down the deployment
process by few seconds. This will ensure that fsck checks are done
on every 3rd boot of the VR. The `4` is used because 1st boot is done
during the build of systemvmtemplate appliance.
Add upgrade path for a new 4.11.2 systemvmtemplate.
Other changes:
- Add support for XS 7.5 Fixes#2834.
- Reboot VR only if mgmt gw is not pingable on vmware.
- Enable passive ftp by enabling nf_conntrack_helper. This is change in behaviour since linux 4.7
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Arping command in virtual-router was called anyway on python code.
on file: merge.py
line 239, in this code : "dp['gateway'] ='None' ''
later on CsAddress.py line 303
if 'gateway' in self.address:
self.arpPing()
This string 'None' makes if steatement always be true
the solution on #2806 makes dp['gateway'] =''
Cannot be None type because there is a string operation later on code.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This re-introduces the rebooting of VR after setup of nics/macs in
case of VMware. It also adds a minor enhancement to show the console
esp. for root admins when VRs and systemvms are in starting state.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Previously, the ethernet device index was used as rt_table index and
packet marking id/integer. With eth0 that is sometimes used as link-local
interface, the rt_table index `0` would fail as `0` is already defined
as a catchall (unspecified). The fwmarking on packets on eth0 with 0x0
would also fail. This fixes the routing issues, by adding 100 to the
ethernet device index so the value is a non-zero, for example then the
relationship between rt_table index and ethernet would be like:
100 -> Table_eth0 -> eth0 -> fwmark 100 or 0x64
101 -> Table_eth1 -> eth1 -> fwmark 101 or 0x65
102 -> Table_eth2 -> eth2 -> fwmark 102 or 0x66
This would maintain the legacy design of routing based on packet mark
and appropriate routing table rules per table/ids. This also fixes a
minor NPE issue around listing of snapshots.
This also backports fixes to smoketests from master.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Fixes the version in pom etc. to be consistent with versioning pattern as X.Y.Z.0-SNAPSHOT after a minor release.
Signed-off-by: Khosrow Moossavi <khos2ow@gmail.com>
Using Source NAT option on Private Gateway does not work
This fixes#2680
## Description
<!--- Describe your changes in detail -->
When you use the Source NAT feature of Private Gateways on a VPC. This should Source NAT all traffic from CloudStack VMs going towards IPs reachable through Private Gateways.
This change in this PR, stops adding the Source CIDR to SNAT rules. This should be discussed/reviewed, but i can see no reason why the Source CIDR is needed. There can only be one SNAT IP per interface, except for Static (one-to-one) NATs, which still work with this change in place. The outbound interface is what matters in the rule.
<!-- For new features, provide link to FS, dev ML discussion etc. -->
<!-- In case of bug fix, the expected and actual behaviours, steps to reproduce. -->
##### SUMMARY
<!-- Explain the problem/feature briefly -->
There is a bug in the Private Gateway functionality, when Source NAT is enabled for the Private Gateway. When the SNAT is added to iptables, it has the source CIDR of the private gateway subnet. Since no VMs live in that private gateway subnet, the SNAT doesn’t work.
##### STEPS TO REPRODUCE
<!--
For bugs, show exactly how to reproduce the problem, using a minimal test-case. Use Screenshots if accurate.
For new features, show how the feature would be used.
-->
<!-- Paste example playbooks or commands between quotes below -->
Below is an example:
- VMs have IP addresses in the 10.0.0.0/24 subnet.
- The Private Gateway address is 10.101.141.2/30
In the outputs below, the SOURCE field for the new SNAT (eth3) only matches if the source is 10.101.141.0/30. Since the VM has an IP address in 10.0.0.0/24, the VMs don’t get SNAT’d as they should when talking across the private gateway. The SOURCE should be set to ANYWHERE.
##### BEFORE ADDING PRIVATE GATEWAY
~~~
Chain POSTROUTING (policy ACCEPT 1 packets, 52 bytes)
pkts bytes target prot opt in out source destination
2 736 SNAT all -- any eth2 10.0.0.0/24 anywhere to:10.0.0.1
16 1039 SNAT all -- any eth1 anywhere anywhere to:46.99.52.18
~~~
<!-- You can also paste gist.github.com links for larger files -->
##### EXPECTED RESULTS
<!-- What did you expect to happen when running the steps above? -->
~~~
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 SNAT all -- any eth3 anywhere anywhere to:10.101.141.2
2 736 SNAT all -- any eth2 anywhere anywhere to:10.0.0.1
23 1515 SNAT all -- any eth1 anywhere anywhere to:46.99.52.18
~~~
##### ACTUAL RESULTS
<!-- What actually happened? -->
<!-- Paste verbatim command output between quotes below -->
~~~
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 SNAT all -- any eth3 10.101.141.0/30 anywhere to:10.101.141.2
2 736 SNAT all -- any eth2 10.0.0.0/24 anywhere to:10.0.0.1
23 1515 SNAT all -- any eth1 anywhere anywhere to:46.99.52.18
~~~
## Types of changes
<!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [X] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
## GitHub Issue/PRs
<!-- If this PR is to fix an issue or another PR on GH, uncomment the section and provide the id of issue/PR -->
<!-- When "Fixes: #<id>" is specified, the issue/PR will automatically be closed when this PR gets merged -->
<!-- For addressing multiple issues/PRs, use multiple "Fixes: #<id>" -->
Fixes: #2680
## Screenshots (if appropriate):
## How Has This Been Tested?
<!-- Please describe in detail how you tested your changes. -->
<!-- Include details of your testing environment, and the tests you ran to -->
<!-- see how your change affects other areas of the code, etc. -->
## Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're here to help! -->
- [x] I have read the [CONTRIBUTING](https://github.com/apache/cloudstack/blob/master/CONTRIBUTING.md) document.
- [x] My code follows the code style of this project.
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
Testing
- [ ] I have added tests to cover my changes.
- [ ] All relevant new and existing integration tests have passed.
- [ ] A full integration testsuite with all test that can run on my environment has passed.
This ensures that password server runs on the dhcpserver identifier
IP which is the not the VRRP virtual (10.1.1.1) IP by default but
the actual ip of the interface. When dhcp client discovery is made,
the `dhcp-server-identifier` contains the non VIP address that is
used by password reset script to query guest VM password.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This introduces a rolling restart of VRs when networks are restarted
with cleanup option for isolated and VPC networks. A make redundant option is
shown for isolated networks now in UI.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes the issue that post-upgrade egress rules are not applied
on VR, restarting the network with cleanup used to be the workaround.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes incorrect xenstore-read binary path, this failed systemvm
to be patched/started correctly on xenserver. The other fix is to keep
the xen-domU flag that may be returned by virt-what. This effect
won't change the cmdline being consumed as the mgmt server side (java)
code sets the boot args in both xenstore and as pv args. The systemvm's
/boot is ext2 that can be booted by PyGrub on both old and recent
XenServer versions.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
On patching, the global cacerts keystore is imported in 'cloud' service
specific local keystore. This fixes#2541.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This adds support for XenServer 7.3 and 7.4, and XCP-ng 7.4 version as hypervisor hosts. Fixes#2523.
This also fixes the issue of 4.11 VRs stuck in starting for up-to 10mins, before they come up online.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes routing table rule setup regression to correctly router
marked packets based on interface related ip route tables. This thereby
fixes the access of VMs in the same VPC using NAT/SNAT public IPs.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
While debugging the VR for #2579, I noticed that one of the scripts were breaking. The variable RROUTER was not set and this broke a conditional.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* dependencies update
* Add extra blank line required by ...!?
* fix W605 invalid escape sequence and more blank lines
* print all installed python packages versions
* systemvm: turn off apache2 server tokens and signature
This turns off apache2 server version signature/token in headers.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* systemvm: remove invalid code as conf.d is not available now
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Fixes rsyslog: fix config error in rsylslog.conf
Feb 26 08:09:54 r-413-VM liblogging-stdlog[19754]: action '*' treated as ':omusrmsg:*' - please use ':omusrmsg:*' syntax instead, '*' will not be supported in the future [v8.24.0 try http://www.rsyslog.com/e/2184 ]
Feb 26 08:09:54 r-413-VM liblogging-stdlog[19754]: error during parsing file /etc/rsyslog.conf, on or before line 95: warnings occured in file '/etc/rsyslog.conf' around line 95 [v8.24.0 try http://www.rsyslog.com/e/2207 ]
- Run apache2 only after cloud-postinit
- Increase /run size for VR with 256M RAM
root@r-395-VM:~# systemctl daemon-reload
Failed to reload daemon: Refusing to reload, not enough space available on /run/systemd. Currently, 15.8M are free, but a safety buffer of 16.0M is enforced.
tmpfs 23M 6.5M 16M 29% /run
This fixes a difference issue in rVR heartbeat check script raised
recently on dev@.
Reduce logging to avoid logging to fill ramdisk
Make checkrouter return fault state when keepalived is not running
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This deprecates and remove TLS 1.0 and 1.1 from preferred list of
protocols and keeps only TLSv1.2.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10317: Fix SNAT rules for additional public nics
This allows networks with additional public nics to have correct
SNAT iptables rules applied on configuration.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* update based on Wei's suggested change
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This might (and does block) in certain situations on the VR as
also explained in the Python documentation:
https://docs.python.org/2/library/subprocess.html#subprocess.Popen.wait
Warning This will deadlock when using stdout=PIPE and/or stderr=PIPE
and the child process generates enough output to a pipe such that
it blocks waiting for the OS pipe buffer to accept more data.
Use communicate() to avoid that.
Using the check_output function handles most of this for us and
also provides better error handling.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Delete dnsmasq's leases file when dnsmasq is restarted to avoid it
use old ip-mac-address-vm mapping leases.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In case of isolated, both self.config.is_vpc() and self.config.is_router() are false,
but self.config.is_dhcp() is true.
Moved the password server logic to the `if has_metadata` block,
as this is valid for all 3 systemvm types.
When the IPv4 address of a Instance changes we need to make sure the
old entry is removed from the DHCP lease file on the Virtual Router
otherwise the Instance will still get the old lease.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This includes test related fixes and code review fixes based on
reviews from @rafaelweingartner, @marcaurele, @wido and @DaanHoogland.
This also includes VMware disk-resize limitation bug fix based on comments
from @sateesh-chodapuneedi and @priyankparihar.
This also includes the final changes to systemvmtemplate and fixes to
code based on issues found via test failures.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes test failures around VMware with the new systemvmtemplate.
In addition:
- Does not skip rVR related test cases for VMware
- Removes rc.local
- Processes unprocessed cmd_line.json
- Fixed NPEs around VMware tests/code
- On VMware, use udevadm to reconfigure nic/mac address than rebooting
- Fix proper acpi shutdown script for faster systemvm shutdowns
- Give at least 256MB of swap for VRs to avoid OOM on VMware
- Fixes smoke tests for environment related failures
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Several systemvmtemplate optimizations
- Uses new macchinina template for running smoke tests
- Switch to latest Debian 9.3.0 release for systemvmtemplate
- Introduce a new `get_test_template` that uses tiny test template
such as macchinina as defined test_data.py
- rVR related fixes and improvements
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Refactors and simplifies systemvm codebase file structures keeping
the same resultant systemvm.iso packaging
- Password server systemd script and new postinit script that runs
before sshd starts
- Fixes to keepalived and conntrackd config to make rVRs work again
- New /etc/issue featuring ascii based cloudmonkey logo/message and
systemvmtemplate version
- SystemVM python codebase linted and tested. Added pylint/pep to
Travis.
- iptables re-application fixes for non-VR systemvms.
- SystemVM template build fixes.
- Default secondary storage vm service offering boosted to have 2vCPUs
and RAM equal to console proxy.
- Fixes to several marvin based smoke tests, especially rVR related
tests. rVR tests to consider 3*advert_int+skew timeout before status
is checked.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This ports PR #1470 by @remibergsma.
Make the generated json files unique to prevent concurrency issues:
The json files now have UUIDs to prevent them from getting overwritten
before they've been executed. Prevents config to be pushed to the wrong
router.
2016-02-25 18:32:23,797 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:null) (logid:) Seq 2-4684025087442026584: Processing: { Ans: , MgmtId: 90520732674657, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.routing.GroupA
nswer":{"results":["null - success: null","null - success: [INFO] update_config.py :: Processing incoming file => vm_dhcp_entry.json.4ea45061-2efb-4467-8eaa-db3d77fb0a7b\n[INFO] Processing JSON file vm_dhcp_entry.json.4ea4506
1-2efb-4467-8eaa-db3d77fb0a7b\n"],"result":true,"wait":0}}] }
On the router:
2016-02-25 18:32:23,416 merge.py __moveFile:298 Processed file written to /var/cache/cloud/processed/vm_dhcp_entry.json.4ea45061-2efb-4467-8eaa-db3d77fb0a7b.gz
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Refactor cloud-early-config and make appliance specific scripts
- Make patching work without requiring restart of appliance and remove
postinit script
- Migrate to systemd, speedup booting/loading
- Takes about 5-15s to boot on KVM, and 10-30seconds for VMware and XenServer
- Appliance boots and works on KVM, VMware, XenServer and HyperV
- Update Debian9 ISO url with sha512 checksum
- Speedup console proxy service launch
- Enable additional kernel modules
- Remove unknown ssh key
- Update vhd-util URL as previous URL was down
- Enable sshd by default
- Use hostnamectl to add hostname
- Disable services by default
- Use existing log4j xml, patching not necessary by cloud-early-config
- Several minor fixes and file refactorings, removed dead code/files
- Removes inserv
- Fix dnsmasq config syntax
- Fix haproxy config syntax
- Fix smoke tests and improve performance
- Fix apache pid file path in cloud.monitoring per the new template
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Load the nf_conntrack_ipv6 module for IPv6 connection tracking on SSVM
- Move systemd services to /etc and enable services after they have been
installed
- Disable most services by default and enable in cloud-early-config
- Start services after enabling them using systemd
- In addition remove /etc/init.d/cloud as this is no longer needed and
done by systemd
- Accept DOS/MBR as file format for ISO images as well
Under Debian 7 the 'file' command would return:
debian-9.1.0-amd64-netinst.iso: ISO 9660 CD-ROM filesystem data UDF filesystem data
Under Debian 9 however it will return
debian-9.1.0-amd64-netinst.iso: DOS/MBR boot sector
This would make the HTTPTemplateDownloader in the Secondary Storage VM refuse the ISO as
a valid template because it's not a correct format.
Changes this behavior so that it accepts both.
This allows us to use Debian 9 as a System VM template.
Not sure though if enabling them is enough for systemd to still start them
on first boot
Signed-off-by: Wido den Hollander <wido@widodh.nl>
SystemVM changes to work on Debian 9
- Migrate away from chkconfig to systemctl
- Remove xenstore-utils override deb pkg
- Fix runlevel in sysv scripts for systemd
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
We found bug in ACL rules for PrivateGateway for VPC
At a glance - rules not applied - switching Allow All or Deny All (default ACL) - showed as completed - but rules missed.
Result - traffic via PrivateGateway blocked by next DROP rule in next chains
How to reproduce:
Enable PrivateGateway for Cloudstack
Create VPC
Provision new PrivateGateway inside VPC with some VLAN
Change ACL (optional step to show that problem not in initial configuration but in config itself)
Expected:
ACL rules applied (inserted) into correspondig ACL_INBOUND/OUTBOUND chanins for PrivateGateway interface (ethX) based on ACL which user choose
Current:
No rules inserted. ACL_INBOUND/OUTBOUND_ethX - empty. Traffic blocked by next DROP rule in FORWARD chain
Affect - all our corporate customers blocked with access to their own nets via PG and vice-versa.
Root cause:
Issue happened because of CsNetFilter.py logic for inserting rules for ACL_INBOUND/OUTBOUND chains.
We choose rule numebr to isnert right before last DROP rule - but forget about fact - that if chain empty - we also return 0 as insert position. Which not true for iptables - numeration started from 0.
So we need very small patch to handle this special case - if number of rules inside chain equal to zero - return 1, else - return count of rules inside chain.
It's found only one - just because be default for PrivateGateway - we didn't insert any "service rules" (if SourceNat for PrivateGteway not ticked) - and we have by default empty ACL_INBOUND/OUTBOUND chains. Because same insert happened for all VPC networks (but when we call this insert - we already have at least 1 rule inside chains - and we successfully can process)