Feature Specification: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95653548
Live storage migration on KVM under these conditions:
From source and destination hosts within the same cluster
From NFS primary storage to NFS cluster-wide primary storage
Source NFS and destination NFS storage mounted on hosts
In order to enable this functionality, database should be updated in order to enable live storage capacibilty for KVM, if previous conditions are met. This is due to existing conflicts between qemu and libvirt versions. This has been tested on CentOS 6 hosts.
Additional notes:
To use this feature set the storage_motion_supported=1 in the hypervisor_capability table for KVM. This is done by default as the feature may not work in some environments, read below.
This feature of online storage+VM migration for KVM will only work with CentOS6 and possible Ubuntu as KVM hosts but not with CentOS7 due to:
https://bugs.centos.org/view.php?id=14026https://bugzilla.redhat.com/show_bug.cgi?id=1219541
On CentOS7 the error we see is: " error: unable to execute QEMU command 'migrate': this feature or command is not currently supported" (reference https://ask.openstack.org/en/question/94186/live-migration-unable-to-execute-qemu-command-migrate/). Reading through various lists looks like the migrate feature with qemu may be available with paid versions of RHEL-EV but not centos7 however this works with CentOS6.
Fix for CentOS 7:
Create repo file on /etc/yum.repos.d/:
[qemu-kvm-rhev]
name=oVirt rebuilds of qemu-kvm-rhev
baseurl=http://resources.ovirt.org/pub/ovirt-3.5/rpm/el7Server/
mirrorlist=http://resources.ovirt.org/pub/yum-repo/mirrorlist-ovirt-3.5-el7Server
enabled=1
skip_if_unavailable=1
gpgcheck=0
yum install qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64 qemu-kvm-ev-2.3.0-29.1.el7.x86_64 qemu-img-ev-2.3.0-29.1.el7.x86_64
Reboot host
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Allows creating storage offerings associated with particular domain(s) and zone(s). In create disk/storage offfering form UI, a mult-select control has been addded to select desired zone(s) and domain select element has been made multi-select.
createDiskOffering API has been modified to allow passing list of domain and zone IDs with keys domainids and zoneids respectively. These lists are stored in DB in cloud.disk_offering_details table with 'domainids' and 'zoneids' key as string of comma separated list of IDs. Response for create, update and list disk offering APIs will return domainids, domainnames, zoneids and zonenames in details object of offering.
listDiskOfferings API has been modified to allow passing zoneid to return only offerings which are associated with the zone.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Keep connection alive when on maintenance
* Refactor cancel maintenance and unit tests
* Add marvin tests
* Refactor
* Changing the way we get ssh credentials
* Add check on SSH restart and improve marvin tests
* Migrate template to target host if needed.
Fix KVM VM local storage live migration by migrating its template to the
target host if needed.
* Address reviewer and add method that updates the DB template reference
* Remove deprecated Config.PrimaryStorageDownloadWait
* Code formating of @Inject to follow checkstyle
* - Offline VM and Volume migration on Vmware hypervisor hosts
- Also add VM disk consolidation call on successful VM migrations
* Fix indentation of marvin test file and reformat against PEP8
* * Fix few comment typos
* Refactor debug messages to use String.format() when debug log level is enabled.
* Send list of commands returned by hypervisor Guru instead of explicitly selecting the first one
* Fix unhandled NPE during VM migration
* Revert back to distinct event descriptions for VM to host or storage pool migration
* Reformat test_primary_storage file against PEP-8 and Remove unused imports
* Revert back the deprecation messages in the custom StringUtils class to favour the use of the ApacheUtils
A corner case was found on 4.11.2 for #2493 leading to an infinite loop in state PrepareForMaintenance
To prevent such cases, in which failed migrations are detected but still running on the host, this feature adds a new cluster setting host.maintenance.retries which is the number of retries before marking the host as ErrorInMaintenance if migration errors persist.
How Has This Been Tested?
- 2 KVM hosts, pick one which has running VMs as H
- Block migrations ports on H to simulate failures on migrations:
iptables -I OUTPUT -j REJECT -m state --state NEW -m tcp -p tcp --dport 49152:49215 -m comment --comment 'test block migrations' iptables -I OUTPUT -j REJECT -m state --state NEW -m tcp -p tcp --dport 16509 -m comment --comment 'test block migrations
- Put host H in Maintenance
- Observe that host is indefinitely in PrepareForMaintenance state (after this fix it goes into ErrorInMaintenance after retrying host.maintenance.retries times)
* CLOUDSTACK-9473: storage pool capacity check when volume is resized or migrated
Storage pool checker is not being called on resize and migrate volume.
This may lead to allocated percentage of storage above 100%.
Setup:
1 VMware cluster with 2 Hosts.
Executed Steps:
Applied the following global settings:
storage.overprovisioning.factor = 1
pool.storage.allocated.capacity.disablethreshold = 1
pool.storage.capacity.disablethreshold = 1
Restarted management server
Executed Resize and migrate pool and Observed that Storage pool checker is not performed on resizeVolume and migrateVolume.
Result:
Root cause analysis shows storage pool checker is not called when doing migration and resizing.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Cleaup and code-formatting POM files
* Remove obsolete mycila license-maven-plugin
* Remove obsolete console-proxy/plugin project
* Move console-proxy-rdbconsole under console-proxy parent
* Use correct parent path for rdpconsole
* Order alphabetally items in setnextversion.sh
* Unifiy License header in POMs
* Alphabetic order of modules definition
* Extract all defined versions into parent pom
* Remove obsolete files: version-info.in, configure-info.in
* Remove redundant defaultGoal
* Remove useless checkstyle plugin from checkstyle project
* Order alphabetally items in pom.xml
* Add aditional SPACEs to fix debian build
* Don't execute checkstyle on parent projects
* Use UTF-8 encoding in building checkstyle project
* Extract plugin versions into properties
* Execute PMD plugin on all the projects with -Penablefindbugs
* Upgrade maven plugins to latest version
* Make sure to always look for apache parent pom from repository
* Fix incorrect version grep in debian packaging
* Fix rebase conflicts
* Fix rebase conflicts
* Remove PMD for now to be fixed on another PR
Fixes the version in pom etc. to be consistent with versioning pattern as X.Y.Z.0-SNAPSHOT after a minor release.
Signed-off-by: Khosrow Moossavi <khos2ow@gmail.com>
Changes in PR #2508 have caused network restart to fail in a Nuage setup,
as the new VR takes the same IP as the old one, and the old VR is still running.
Nuage doesn't support multiple VM's having the same IP.
We delay provisioning the interfaces in VSD until the old VR interface is released.
* [CLOUDSTACK-10323] Allow changing disk offering during volume migration
This is a continuation of work developed on PR #2425 (CLOUDSTACK-10240), which provided root admins an override mechanism to move volumes between storage systems types (local/shared) even when the disk offering would not allow such operation. To complete the work, we will now provide a way for administrators to enter a new disk offering that can reflect the new placement of the volume. We will add an extra parameter to allow the root admin inform a new disk offering for the volume. Therefore, when the volume is being migrated, it will be possible to replace the disk offering to reflect the new placement of the volume.
The API method will have the following parameters:
* storageid (required)
* volumeid (required)
* livemigrate(optional)
* newdiskofferingid (optional) – this is the new parameter
The expected behavior is the following:
* If “newdiskofferingid” is not provided the current behavior is maintained. Override mechanism will also keep working as we have seen so far.
* If the “newdiskofferingid” is provided by the admin, we will execute the following checks
** new disk offering mode (local/shared) must match the target storage mode. If it does not match, an exception will be thrown and the operator will receive a message indicating the problem.
** we will check if the new disk offering tags match the target storage tags. If it does not match, an exception will be thrown and the operator will receive a message indicating the problem.
** check if the target storage has the capacity for the new volume. If it does not have enough space, then an exception is thrown and the operator will receive a message indicating the problem.
** check if the size of the volume is the same as the size of the new disk offering. If it is not the same, we will ALLOW the change of the service offering, and a warning message will be logged.
We execute the change of the Disk offering as soon as the migration of the volume finishes. Therefore, if an error happens during the migration and the volume remains in the original storage system, the disk offering will keep reflecting this situation.
* Code formatting
* Adding a test to cover migration with new disk offering (#4)
* Adding a test to cover migration with new disk offering
* Update test_volumes.py
* Update test_volumes.py
* fix test_11_migrate_volume_and_change_offering
* Fix typo in Java doc
* CLOUDSTACK-8855 Improve Error Message for Host Alert State
* [CLOUDSTACK-9846] create column to save the content of alert messages
Remove declaration of throws CloudRuntimeException
I also removed some unused variables and comments left behind
This closes#837
* Isolate a problematic test "smoke/test_certauthority_root"
* [CLOUDSTACK-10314] Add Text-Field to each ACL Rule
It is interesting to have a text field (e.g. CHAR-256) added to each ACL rule, which allows to enter a "reason" for each FW Rule created. This is valuable for customer documentation, as well as best practice for an evidence towards auditing the system
* Formatting to make check style happy and code clean ups
Remove maven standard module (which only a few were using) and get ride of maven customization for the projects structure.
- moved all directories to src/main/java, src/main/resources, src/main/scripts, src/test/java, src/test/resources
- grep scan to search for src/com and src/org left over
- grep for <project>/scripts to fix pom.xml configuration
- remove custom <build> configuration in pom.xml
Signed-off-by: Marc-Aurèle Brothier <m@brothier.org>
Allowed zone-wide primary storage based on a custom plug-in to be added via the GUI in a KVM-only environment (previously this only worked for XenServer and VMware)
Added support for root disks on managed storage with KVM
Added support for volume snapshots with managed storage on KVM
Enable creating a template directly from a volume (i.e. without having to go through a volume snapshot) on KVM with managed storage
Only allow the resizing of a volume for managed storage on KVM if the volume in question is either not attached to a VM or is attached to a VM in the Stopped state.
Included support for Reinstall VM on KVM with managed storage
Enabled offline migration on KVM from non-managed storage to managed storage and vice versa
Included support for online storage migration on KVM with managed storage (NFS and Ceph to managed storage)
Added support to download (extract) a managed-storage volume to a QCOW2 file
When uploading a file from outside of CloudStack to CloudStack, set the min and max IOPS, if applicable.
Included support for the KVM auto-convergence feature
The compression flag was actually added in version 1.0.3 (1000003) as opposed to version 1.3.0 (1003000) (changed this to reflect the correct version)
On KVM when using iSCSI-based managed storage, if the user shuts a VM down from the guest OS (as opposed to doing so from CloudStack), we need to pass to the KVM agent a list of applicable iSCSI volumes that need to be disconnected.
Added a new Global Setting: kvm.storage.live.migration.wait
For XenServer, added a check to enforce that only volumes from zone-wide managed storage can be storage motioned from a host in one cluster to a host in another cluster (cannot do so at the time being with volumes from cluster-scoped managed storage)
Don’t allow Storage XenMotion on a VM that has any managed-storage volume with one or more snapshots.
Enabled for managed storage with VMware: Template caching, create snapshot, delete snapshot, create volume from snapshot, and create template from snapshot
Added an SIOC API plug-in to support VMware SIOC
When starting a VM that uses managed storage in a cluster other than the one it last was running in, we need to remove the reference to the iSCSI volume from the original cluster.
Added the ability to revert a volume to a snapshot
Enabled cluster-scoped managed storage
Added support for VMware dynamic discovery
This feature allows using templates and ISOs avoiding secondary storage as intermediate cache on KVM. The virtual machine deployment process is enhanced to supported bypassed registered templates and ISOs, delegating the work of downloading them to primary storage to the KVM agent instead of the SSVM agent.
Template and ISO registration:
- When hypervisor is KVM, a checkbox is displayed with 'Direct Download' label.
- API methods registerTemplate and registerISO are both extended with this new parameter directdownload.
- On template or ISO registration, no download job is sent to SSVM agent, CloudStack would only persist an entry on template_store_ref indicating that template or ISO has been marked as 'Direct Download' (bypassing Secondary Storage). These entries are persisted as:
template_id = Template or ISO id on vm_template table
store_id NULL
download_state = BYPASSED
state = Ready
(Note: these entries allow users to deploy virtual machine from registered templates or ISOs)
- An URL validation command is sent to a random KVM host to check if template/ISO location can be reached. Metalink are also supported by this feature. In case of a metalink, it is fetched and URL check is performed on each of its URLs.
- Checksum should be provided as indicated on #2246: {ALGORITHM}CHKSUMHASH
- After template or ISO is registered, it would be displayed in the UI
Virtual machine deployment:
When a 'Direct Download' template is selected for deployment, CloudStack would delegate template downloading to destination storage pool via destination host by a new pluggable download manager.
Download manager would handle template downloading depending on URL protocol. In case of HTTP, request headers can be set by the user via vm_template_details. Those details should be persisted as:
Key: HTTP_HEADER
Value: HEADERNAME:HEADERVALUE
In case of HTTPS, a new API method is added uploadTemplateDirectDownloadCertificate to allow user importing a client certificate into all KVM hosts' keystore before deployment.
After template or ISO is downloaded to primary storage, usual entry would be persisted on template_spool_ref indicating the mapping between template/ISO and storage pool.
This feature allow admins to dedicate a range of public IP addresses to the SSVM and CPVM, such that they can be subject to specific external firewall rules. The option to dedicate a public IP range to the System VMs (SSVM & CPVM) is added to the createVlanIpRange API method and the UI.
Solution:
Global setting 'system.vm.public.ip.reservation.mode.strictness' is added to determine if the use of the system VM reservation is strict (when true) or preferred (false), false by default.
When a range has been dedicated to System VMs, CloudStack should apply IPs from that range to
the public interfaces of the CPVM and the SSVM depending on global setting's value:
If the global setting is set to false: then CloudStack will use any unused and unreserved public IP
addresses for system VMs only when the pool of reserved IPs has been exhausted
If the global setting is set to true: then CloudStack will fail to deploy the system VM when the pool
of reserved IPs has been exhausted, citing the lack of available IPs.
UI Changes
Under Infrastructure -> Zone -> Physical Network -> Public -> IP Ranges, button 'Account' label is refactored to 'Set reservation'.
When that button is clicked, dialog displayed is also refactored, including a new checkbox 'System VMs' which indicates if range should be dedicated for CPVM and SSVM, and a note indicating its usage.
When clicking on button for any created range, UI dialog displayed indicates whether IP range is dedicated for system vms or not.
Allow security policies to apply on port groups:
- Accepts security policies while creating network offering
- Deployed network will have security policies from the network offering
applied on the port group (in vmware environment)
- Global settings as fallback when security policies are not defined for a network
offering
- Default promiscuous mode security policy set to REJECT as it's the default
for standard/default vswitch
Portgroup vlan-trunking options for dvswitch: This allows admins to define
a network with comma separated vlan id and vlan
range such as vlan://200-400,21,30-50 and use the provided vlan range to
configure vlan-trunking for a portgroup in dvswitch based environment.
VLAN overlap checks are performed for:
- isolated network against existing shared and isolated networks
- dedicated vlan ranges for the physical/public network for the zone
- shared network against existing isolated network
Allow shared networks to bypass vlan overlap checks: This allows admins
to create shared networks with a `bypassvlanoverlapcheck` API flag
which when set to 'true' will create a shared network without
performing vlan overlap checks against isolated network and against
the vlans allocated to the datacenter's physical network (vlan ranges).
Notes:
- No vlan-range overlap checks are performed when creating shared networks
- Multiple vlan id/ranges should include the vlan:// scheme prefix
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10046 digest helper for calculating checksums
* CLOUDSTACK-10046 cleanup unused checksum code
* CLOUDSTACK-10046 padding method proof of concept
* CLOUDSTACK-10046 only compare checksums if old value is valid
* Adding positive and negative tests for md5, sha-1 and sha-256, for xen, vmware and kvm hypervisors.
KVM Results:
Negative Test Passed - Exception Occurred Under template download ['Traceback (most recent call last):\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 189, in test_02_1_create_template_with_checksum_sha1_negative\n self.download(self.apiclient, template.id)\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 260, in download\n template.status)\n', 'Exception: Failed to download template: status - Failed post download script: checksum "{sha-1}bf580a13f791d86acf3449a7b457a91a14389264" didn\'t match the given value, "{sha-1}someInvalidValue"\n']
=== TestName: test_02_1_create_template_with_checksum_sha1_negative | Status : SUCCESS ===
=== TestName: test_02_create_template_with_checksum_sha1 | Status : SUCCESS ===.
Negative Test Passed - Exception Occurred Under template download ['Traceback (most recent call last):\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 203, in test_03_1_create_template_with_checksum_sha256_negative\n self.download(self.apiclient, template.id)\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 260, in download\n template.status)\n', 'Exception: Failed to download template: status - Failed post download script: checksum "{SHA-256}efc03633f2b8f5db08acbcc5dc1be9028572dfd8f1c6c8ea663f0ef94b458c5" didn\'t match the given value, "{SHA-256}someInvalidValue"\n']
=== TestName: test_03_1_create_template_with_checksum_sha256_negative | Status : SUCCESS ===
=== TestName: test_03_create_template_with_checksum_sha256 | Status : SUCCESS ===
Negative Test Passed - Exception Occurred Under template download ['Traceback (most recent call last):\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 217, in test_04_1_create_template_with_checksum_md5_negative\n self.download(self.apiclient, template.id)\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 260, in download\n template.status)\n', 'Exception: Failed to download template: status - Failed post download script: checksum "{md5}ada77653dcf1e59495a9e1ac670ad95f" didn\'t match the given value, "{md5}someInvalidValue"\n']
=== TestName: test_04_1_create_template_with_checksum_md5_negative | Status : SUCCESS ===
=== TestName: test_04_create_template_with_checksum_md5 | Status : SUCCESS ===
* CLOUDSTACK-10046 digest helper for calculating checksums
* CLOUDSTACK-10046 cleanup unused checksum code
* CLOUDSTACK-10046 padding method proof of concept
* CLOUDSTACK-10046 only compare checksums if old value is valid
* Adding positive and negative tests for md5, sha-1 and sha-256, for xen, vmware and kvm hypervisors.
KVM Results:
Negative Test Passed - Exception Occurred Under template download ['Traceback (most recent call last):\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 189, in test_02_1_create_template_with_checksum_sha1_negative\n self.download(self.apiclient, template.id)\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 260, in download\n template.status)\n', 'Exception: Failed to download template: status - Failed post download script: checksum "{sha-1}bf580a13f791d86acf3449a7b457a91a14389264" didn\'t match the given value, "{sha-1}someInvalidValue"\n']
=== TestName: test_02_1_create_template_with_checksum_sha1_negative | Status : SUCCESS ===
=== TestName: test_02_create_template_with_checksum_sha1 | Status : SUCCESS ===.
Negative Test Passed - Exception Occurred Under template download ['Traceback (most recent call last):\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 203, in test_03_1_create_template_with_checksum_sha256_negative\n self.download(self.apiclient, template.id)\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 260, in download\n template.status)\n', 'Exception: Failed to download template: status - Failed post download script: checksum "{SHA-256}efc03633f2b8f5db08acbcc5dc1be9028572dfd8f1c6c8ea663f0ef94b458c5" didn\'t match the given value, "{SHA-256}someInvalidValue"\n']
=== TestName: test_03_1_create_template_with_checksum_sha256_negative | Status : SUCCESS ===
=== TestName: test_03_create_template_with_checksum_sha256 | Status : SUCCESS ===
Negative Test Passed - Exception Occurred Under template download ['Traceback (most recent call last):\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 217, in test_04_1_create_template_with_checksum_md5_negative\n self.download(self.apiclient, template.id)\n', ' File "/Users/bstoyanov/Documents/sb2/cloudstack/test/integration/smoke/test_templates.py", line 260, in download\n template.status)\n', 'Exception: Failed to download template: status - Failed post download script: checksum "{md5}ada77653dcf1e59495a9e1ac670ad95f" didn\'t match the given value, "{md5}someInvalidValue"\n']
=== TestName: test_04_1_create_template_with_checksum_md5_negative | Status : SUCCESS ===
=== TestName: test_04_create_template_with_checksum_md5 | Status : SUCCESS ===
* Adding additional test with no checksum added when registering template
Result:
test_05_create_template_with_no_checksum (integration.smoke.test_templates.TestCreateTemplateWithChecksum) ... === TestName: test_05_create_template_with_no_checksum | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 1 test in 42.320s
OK
* Fixing negative tests exception handling
* Adding tests for ISO checksum validation and fixing a zero prefix failure test in templates
* CLOUDSTACK-10046 padding
* CLOUDSTACK-10046 usability additions
* yet another IDE artifact hindering checkstyle
Co-Authored-By: Prashanth Manthena <prashanth.manthena@nuagenetworks.net>
Co-Authored-By: Sigert Goeminne <sigert.goeminne@nuagenetworks.net>
Bug: https://issues.apache.org/jira/browse/CLOUDSTACK-9832
Detail:
When the VPC offering does not contain VpcVirtualRouter as a SourceNat provider,
then we will not add the interface in the public network to the VpcVR.
CLOUDSTACK-9832: Move isSrcNat check to VpcManager
This causes VM deployment failure on the host that was disabled while adding the storage repository.
In the attachCluster function of the PrimaryDataStoreLifeCycle, we were only selecting hosts that are up and are in enabled state. Here if we select all up hosts, it will populate the DB properly and will fix this issue. Also added a unit test for attachCluster function.
Host-HA offers investigation, fencing and recovery mechanisms for host that for
any reason are malfunctioning. It uses Activity and Health checks to determine
current host state based on which it may degrade a host or try to recover it. On
failing to recover it, it may try to fence the host.
The core feature is implemented in a hypervisor agnostic way, with two separate
implementations of the driver/provider for Simulator and KVM hypervisors. The
framework also allows for implementation of other hypervisor specific provider
implementation in future.
The Host-HA provider implementation for KVM hypervisor uses the out-of-band
management sub-system to issue IPMI calls to reset (recover) or poweroff (fence)
a host.
The Host-HA provider implementation for Simulator provides a means of testing
and validating the core framework implementation.
Signed-off-by: Abhinandan Prateek <abhinandan.prateek@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Default value of the account level global config vmsnapshot.expire.interval is -1 that conforms to legacy behaviour. A positive value will expire the VM snapshots for the respective account in that many hours.