* Create utility to centralize byte conversions
* Add/change toString definitions
* Create Libvirt handler to ScaleVmCommand
* Enable dynamic scaling VM with KVM
* Move config from interface to class and rename it
As every variable declared in an interface is implicitly final,
this move is needed to mock tests in the next commits
* Configure VM max memory and cpu cores
The values are according to service offering or global configs
* Extract dpdk configuration to a method and test it
* Extract OS desc config to a method and test it
* Extract guest resource def to a method and test it
Improve libvirt def
* Refactor LibvirtVMDef.GuestResourceDef
* Refactor ScaleVmCommand
* Improve VMInstanceVO toString()
* Refactor upgradeRunningVirtualMachine method
* Turn int variables into long on utility
* Verify if VM is scalable on KVMGuru
* Rename some KVMGuruTest's methods
* Change vm's xml to work with max memory
* Verify if service offering is dynamic before scale
* Create methods to retrieve data from domain
* Create def to hotplug memory
* Adjust the way command was scaling the VM
* Fix database persistence before executing command
* Send more info to host to improve log
* Fix var name
* Fix missing "}"
* Undo unnecessary changes
* Address review
* Fix scale validation
* Add VM prepared for dynamic scaling validation
* Refactor LibvirtScaleVmCommandWrapper and improve unit tests
* Remove duplicated method
* Add RuntimeException check
* Remove copyright from header
* Update ByteScaleUtilsTest.java
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
Currently the user data field is limited to a default of 4K/32K for GET/POST
requests. Most modern browsers, and also nginx, support up to 1MB of
POST data.
Added a new global setting `vm.userdata.max.length` with a default value of
32KB, which can be increased up to 1MB.
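A minimal sketch of how such a limit might be enforced. It assumes the setting holds a length in characters/bytes and that values above 1MB are capped. The class and method names are hypothetical and are not CloudStack's actual code.

```java
// Hypothetical sketch: not CloudStack's actual validation code.
final class UserDataLimit {
    static final int DEFAULT_MAX_LENGTH = 32 * 1024;  // 32KB default, as described above
    static final int HARD_LIMIT = 1024 * 1024;        // 1MB ceiling, as described above

    /** Returns the configured limit capped at 1MB, or the default if unset or invalid. */
    static int effectiveLimit(Integer configured) {
        if (configured == null || configured <= 0) {
            return DEFAULT_MAX_LENGTH;
        }
        return Math.min(configured, HARD_LIMIT);
    }

    /** Throws if the user data is longer than the effective limit. */
    static void validate(String userData, Integer configured) {
        int limit = effectiveLimit(configured);
        if (userData != null && userData.length() > limit) {
            throw new IllegalArgumentException(String.format(
                "User data length %d exceeds the limit of %d", userData.length(), limit));
        }
    }
}
```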
I've upgraded a staging environment from an older 4.16-SNAPSHOT to the current one and found a regression in VM migration.
When calling migrateVirtualMachineWithVolume, the following InvalidParameterValueException is thrown: Unsupported hypervisor: KVM for VM migration, we support XenServer/VMware/KVM only.
* Enhance log messages with hostName
* Use host.toString() on most of host logs.
* Remove redundant "Host" in logs and enhance logs
* duplicated "for"
* Adopt String.format, and enhance code
* Address reviews enhancing log messages
Update server/src/main/java/com/cloud/resource/ResourceManagerImpl.java
-- server/src/main/java/com/cloud/vm/UserVmManagerImpl.java
-- server/src/main/java/com/cloud/resource/RollingMaintenanceManagerImpl.java
Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>
* Fix String.format issue and change log message from debug to warn
* Fix checkstyle issue
* Fix string.format log
* Address review: enhance logs
* Enhance log of hosts in maintenance avoid list
* Remove "VM" on logs as vm.toString() already appends VM-<details>
* Add more details of the VM when postStateTransitionEvent
* Address reviewer and enhance VMInstanceVO.toString()
Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>
On the `resetSSHKeyForVirtualMachine` API, ACS also regenerates the VM password when the VM uses a template with `Password Enabled` set to true. There is already an API to reset the VM password; therefore, the reset SSH keys API should not reset the VM password as well.
Besides running an unnecessary process, the VM's password regeneration slows down the main process and may cause confusion in operations, because the password changes in the VM without being explicitly requested.
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
This PR fixes the issue that nic has wrong gateway after updating vm nic.
Steps to reproduce the issue
(1) create shared network (in advanced zone or advanced zone with sg)
(2) create new shared network (with same startip/endip/netmask, but different gateway).
(3) create a vm in new network
(4) stop vm and update vm nic ip address
Expected result:
The vm has correct gateway and netmask (of second network)
Actual result:
The vm has wrong gateway and netmask (of first network)
This PR introduces new granularity levels to configure VM dynamic scalability. Previously, a VM was configured to be dynamically scalable based on the template and global setting. Now we are bringing this option to the service offering and VM level as well.
VM can dynamically scale only when all flags are ON at VM level, template, service offering and global setting. If any of the flags is set to false then VM cannot be scalable. This result will be persisted in DB for each VM and will be honoured for that VM till it is updated.
We are introducing the 'dynamicscalingenabled' parameter, with permitted values of true or false, for the deployVM API and createServiceOffering API.
Following are the API parameter changes:
createServiceOffering API:
dynamicscalingenabled: an optional parameter of type Boolean with default value “true”.
deployVirtualMachine API:
dynamicscalingenabled: an optional parameter of type Boolean with default value “true”.
Following are the UI changes:
Service offering creation has ON/OFF switch for dynamic scaling enabled with default value true
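The rule described in this PR (scalable only when every level allows it) can be sketched as a simple conjunction. The class, method, and parameter names below are illustrative assumptions, not the identifiers used in CloudStack.

```java
// Hypothetical sketch of the rule described above; names are illustrative only.
final class DynamicScalingRule {
    /** A VM is dynamically scalable only if every level allows it. */
    static boolean isDynamicallyScalable(boolean globalSetting, boolean templateFlag,
                                         boolean serviceOfferingFlag, boolean vmFlag) {
        return globalSetting && templateFlag && serviceOfferingFlag && vmFlag;
    }
}
```

Per the description, the result would be computed once and persisted for the VM, rather than re-evaluated on every scale request.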
Inclusivity changes for CloudStack
- Change default git branch name from 'master' to 'main' (post renaming/changing default git branch to 'main' in git repo)
- Rename some offensive words/terms as appropriate for inclusiveness.
This PR updates the default git branch to 'main', as part of #4887.
Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Fixes: #4990
When a VM associated with a backup offering is destroyed/expunged, the backup offering isn't unassigned, and despite the VM having no backups present, backup usage is generated. This PR prevents usage record generation when there are no backups present for a VM with a backup offering associated with it. This is done by ensuring that the usage event for backups is generated only when the backup size > 0.
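As a rough illustration of the guard described above, assuming the backup size is available as a byte count (the names are hypothetical, not CloudStack's):

```java
// Hypothetical sketch: decides whether a backup usage event should be emitted.
final class BackupUsageGuard {
    /** Emit usage only when an offering is assigned and there is actual backup data. */
    static boolean shouldGenerateUsageEvent(boolean hasBackupOffering, long backupSizeBytes) {
        return hasBackupOffering && backupSizeBytes > 0;
    }
}
```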
* server: fixes NPE on empty vmware.root.disk.controller config
When global config - vmware.root.disk.controller is set to empty and template is registered with deployasis, server will throw NPE while deploying a VM. This change fixes the problem by using default value of the config in this case.
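A minimal sketch of the fallback idea, assuming the configured value and its default are both available as strings. The helper below is hypothetical, and the default is passed in by the caller because the actual default of the setting is not stated here.

```java
// Hypothetical sketch: fall back to the default when a config value is empty.
final class ConfigFallback {
    /** Returns the trimmed configured value, or defaultValue if it is null or blank. */
    static String valueOrDefault(String configured, String defaultValue) {
        if (configured == null || configured.trim().isEmpty()) {
            return defaultValue;
        }
        return configured.trim();
    }
}
```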
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* use StringUtils utility
Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>
* fix indentation
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>
This NPE may happen when a VM is marked removed in the DB but not its
nics on a shared network. This can usually happen due to a failed
expunged VM or when an admin manually marks a VM as removed in DB but
does not cleanup the nics/network resources.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Fixes: #4808, #4941
This PR adds a force flag to the attachIso / detachIso commands, especially for VMware. There, detaching an ISO, or attaching an ISO when another one is already present, fails to perform the necessary operation, because ACS either answers the question returned by ESXi for the CD-ROM disconnect operation with "No" (for the detach operation) or does not answer the question at all (for the attach operation).
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
This PR fixes CLOUDSTACK-10434. I think some APIs lack an access check; they are listed in the table below. I also provide the patch to add the access check for the APIs in this table. Anyone can change this table: if you think an API does not need an access check, change its label to "no".
| API | Lack? |
| --- | --- |
| VolumeApiServiceImpl#updateVolume | yes |
| VolumeApiServiceImpl#detachVolumeViaDestroyVM | yes |
| VolumeApiServiceImpl#takeSnapshot | yes |
| VolumeApiServiceImpl#migrateVolume | yes |
| AccountManagerImpl#createApiKeyAndSecretKey | yes |
| LoadBalancingRulesManagerImpl#applyLBStickinessPolicy | yes |
| LoadBalancingRulesManagerImpl#applyLBHealthCheckPolicy | yes |
| TemplateManagerImpl#createPrivateTemplate | yes |
| SnapshotManagerImpl#updateSnapshotPolicy | |
Co-authored-by: lujie <lujie@foxmail.com>
This PR fixes: #4462
Problem Statement:
In case of VMware, when a VM with multiple data disks is destroyed (without expunge) and then recovered, the previous data disks are not attached to the VM as they were before the destroy. Only the root disk is attached to the VM.
Root cause:
All data disks were removed as part of VM destroy. Only the volumes which are selected to delete (while destroying VM) are supposed to be detached and destroyed.
Solution:
During VM destroy, detach and destroy only volumes which are selected during VM destroy. Detach the other volumes during expunge of VM.
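The solution can be pictured as splitting the VM's data volumes into two groups. This is a sketch under the assumption that volumes are identified by numeric IDs; the class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: split data volumes into those destroyed with the VM
// and those left attached until the VM is expunged.
final class VolumeDestroyPlan {
    final List<Long> destroyNow = new ArrayList<>();
    final List<Long> detachOnExpunge = new ArrayList<>();

    static VolumeDestroyPlan plan(List<Long> dataVolumeIds, Set<Long> selectedForDeletion) {
        VolumeDestroyPlan result = new VolumeDestroyPlan();
        for (Long id : dataVolumeIds) {
            if (selectedForDeletion.contains(id)) {
                result.destroyNow.add(id);       // user chose to delete this volume
            } else {
                result.detachOnExpunge.add(id);  // stays attached so recovery restores it
            }
        }
        return result;
    }
}
```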
If VM details contain rootdisksize, volume entry in DB should reflect correct size when VM reset is performed.
Fixes #3957
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
This PR fixes the issue pertaining to volume resize on VMware for deploy as-is templates. VMware deploy as-is templates are those that are deployed as per the specification in the imported OVF. Hence an overridden root disk size is not honored for such templates. Moreover, when we deploy VMs in stopped state and resize the volume, the root disk doesn't get resized; the volume size is merely updated in the DB.
This PR also includes the following (for deploy as-is templates):
- Disables overriding root disk size during VM deployment on the UI
- Disables selection of compute offerings with root disk size specified, at the time of deployment
- Provides users with the option to deploy a VM in stopped state via the UI (so that users can resize the volumes before starting the VM)
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
* Updated libvirt's native reboot operation for VM on KVM using ACPI event, and Added 'forced' reboot option to stop and start the VM (using rebootVirtualMachine API)
* Added 'forced' reboot option for System VM and Router
- New parameter 'forced' in rebootSystemVm API, to stop and then start System VM
- New parameter 'forced' in rebootRouter API, to force stop and then start Router
* Added force reboot tests for User VM, System VM and Router
Added support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack (for KVM hypervisor) and enabled VM/Volume operations on that pool (using pool tag).
Please find more details in the FS here:
https://cwiki.apache.org/confluence/x/cDl4CQ
Documentation PR: apache/cloudstack-documentation#169
This enables support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack
Other improvements addressed in addition to PowerFlex/ScaleIO support:
- Added support for config drives in host cache for KVM
=> Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
=> Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
=> Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
=> Added new parameter "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path and create config drives on the "/config" directory on the host cache path
=> Maintain the config drive location and use it when required on any config drive operation (migrate, delete)
- Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates
- Updated full deployment destination for preparing the network(s) on VM start
- Propagate the direct download certificates uploaded to the newly added KVM hosts
- Discover the template size for direct download templates using any available host from the zones specified on template registration
=> When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones
- Release the VM resources when VM is sync-ed to Stopped state on PowerReportMissing (after graceful period)
- Retry VM deployment/start when the host cannot grant access to volume/template
- Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
=> Do not trigger any DeleteCommand for never-used or downloaded templates as these don't exist and cannot be deleted from the datastore
- Check the router filesystem is writable or not, before performing health checks
=> Introduce a new test "filesystem.writable.test" to check the filesystem is writable or not
=> The router health checks keep the config info at "/var/cache/cloud" and update the monitor results at "/root"; these are different partitions, so test at both locations.
=> Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check the filesystem is writable or not
- Fixed NPE issue, template is null for DATA disks. Copy template to target storage for ROOT disk (with template id), skip DATA disk(s)
* Addressed some issues for few operations on PowerFlex storage pool.
- Updated migration volume operation to sync the status and wait for migration to complete.
- Updated VM Snapshot naming, for uniqueness in ScaleIO volume name when more than one volume exists in the VM.
- Added sync lock while spooling managed storage template before volume creation from the template (non-direct download).
- Updated resize volume error message string.
- Blocked the below operations on PowerFlex storage pool:
-> Extract Volume
-> Create Snapshot for VMSnapshot
* Added the PowerFlex/ScaleIO client connection pool to manage the ScaleIO gateway clients, which uses a single gateway client per Powerflex/ScaleIO storage pool and renews it when the session token expires.
- The token is valid for 8 hours from the time it was created, unless there has been no activity for 10 minutes.
Reference: https://cpsdocs.dellemc.com/bundle/PF_REST_API_RG/page/GUID-92430F19-9F44-42B6-B898-87D5307AE59B.html
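The renewal condition described above (8 hours since creation, or 10 minutes without activity) could be checked roughly as follows. This is only a sketch: the class is hypothetical and does not reflect the plugin's actual client or connection pool code.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: tracks when a gateway session token must be renewed.
final class GatewaySession {
    static final Duration MAX_AGE = Duration.ofHours(8);      // lifetime from creation
    static final Duration MAX_IDLE = Duration.ofMinutes(10);  // allowed inactivity

    private final Instant createdAt;
    private Instant lastUsedAt;

    GatewaySession(Instant createdAt) {
        this.createdAt = createdAt;
        this.lastUsedAt = createdAt;
    }

    /** Records activity on the session, which resets the idle timer. */
    void markUsed(Instant now) {
        lastUsedAt = now;
    }

    /** True when the token is too old or has been idle too long. */
    boolean isExpired(Instant now) {
        boolean tooOld = !now.isBefore(createdAt.plus(MAX_AGE));
        boolean idleTooLong = !now.isBefore(lastUsedAt.plus(MAX_IDLE));
        return tooOld || idleTooLong;
    }
}
```

A pool keyed by storage pool could call `isExpired` before each request and create a fresh client when it returns true.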
Other fixes included:
- Fail the VM deployment when the host specified in the deployVirtualMachine cmd is not in the right state (i.e. either Resource State is not Enabled or Status is not Up)
- Use the physical file size of the template to check the free space availability on the host, while downloading the direct download templates.
- Perform basic tests (for connectivity and file system) on router before updating the health check config data
=> Validate the basic tests (connectivity and file system check) on router
=> Cleanup the health check results when router is destroyed
* Updated PowerFlex/ScaleIO storage plugin version to 4.16.0.0
* UI Changes to support storage plugin for PowerFlex/ScaleIO storage pool.
- PowerFlex pool URL generated from the UI inputs(Gateway, Username, Password, Storage Pool) when adding "PowerFlex" Primary Storage
- Updated protocol to "custom" for PowerFlex provider
- Allow VM Snapshot for stopped VM on KVM hypervisor and PowerFlex/ScaleIO storage pool
and Minor improvements in PowerFlex/ScaleIO storage plugin code
* Added support for PowerFlex/ScaleIO volume migration across different PowerFlex storage instances.
- findStoragePoolsForMigration API returns PowerFlex pool(s) of different instance as suitable pool(s), for volume(s) on PowerFlex storage pool.
- Volume(s) with snapshots are not allowed to migrate to different PowerFlex instance.
- Volume(s) of running VM are not allowed to migrate to other PowerFlex storage pools.
- Volume migration from PowerFlex pool to Non-PowerFlex pool, and vice versa are not supported.
* Fixed change service offering smoke tests in test_service_offerings.py, test_vm_snapshots.py
* Added the PowerFlex/ScaleIO volume/snapshot name to the paths of respective CloudStack resources (Templates, Volumes, Snapshots and VM Snapshots)
* Added new response parameter “supportsStorageSnapshot” (true/false) to volume response, and Updated UI to hide the async backup option while taking snapshot for volume(s) with storage snapshot support.
* Fix to remove the duplicate zone wide pools listed while finding storage pools for migration
* Updated PowerFlex/ScaleIO volume migration checks and rollback migration on failure
* Fixed the PowerFlex/ScaleIO volume name inconsistency issue in the volume path after migration, due to rename failure
* Show network name in exception message
* Update server/src/main/java/com/cloud/vm/UserVmManagerImpl.java
Co-authored-by: dahn <daan.hoogland@gmail.com>
- Fixes inter-cluster migration of VMs
- Allows migration of stopped VM with disks attached to different and suitable pools
- Improves inter-cluster detached volume migration
- Allows inter-cluster migration (clusters of same Pod) for system VMs, VRs on VMware
- Allows storage migration for stopped system VMs, VRs on VMware within same Pod if StoragePool cluster scopetype
Linked Primate PR: https://github.com/apache/cloudstack-primate/pull/789 [Changes merged in this PR after new UI merge]
Documentation PR: https://github.com/apache/cloudstack-documentation/pull/170
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* 4.14:
server: select root disk based on user input during vm import (#4591)
kvm: Use Q35 chipset for UEFI x86_64 (#4576)
server: fix wrong error message when create isolated network without SourceNat (#4624)
server: add possibility to scale vm to current customer offerings (#4622)
server: keep networks order and ips while move a vm with multiple networks (#4602)
server: throw exception when update vm nic on L2 network (#4625)
doc: fix typo in install notes (#4633)
This PR fixes an issue when moving a VM from one account to another.
Steps to reproduce the issue
(1) create a vm with multiple shared networks (in advanced zone, or advanced zone with security groups)
(2) create another account (in same domain who can also access the shared networks)
(3) move vm to new account, with a list of networkid
Expected result:
The vm has nics on the networks in the same order as specified in the API request, and the nics have the same ips as before
Actual result:
The network order is not the same as specified, and the ips are changed.