55 Commits

Author SHA1 Message Date
Abhishek Kumar
0b5a5e8043
api,agent,server,engine-schema: scalability improvements (#9840)
* api,agent,server,engine-schema: scalability improvements

Following changes and improvements have been added:

- Improvements in handling of PingRoutingCommand

    1. Added global config - `vm.sync.power.state.transitioning` (default: true) to control syncing of power states for transitioning VMs. This can be set to false to skip computing the state of transitioning VMs.
    2. Improved VirtualMachinePowerStateSync to sync power states for a host's VMs in a batch
    3. Optimized scanning of stalled VMs

- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`

- Added a caching framework based on the Caffeine in-memory caching library (https://github.com/ben-manes/caffeine); see the sketch after this list

- Added caching for account/user role API access; the expire-after-write period can be configured using the config `dynamic.apichecker.cache.period`. If set to zero, there is no caching. Default: 0.

- Added caching for account/user role API access with expire-after-write set to 60 seconds.

- Added caching for some recurring DB retrievals

    1. CapacityManager - listing service offerings - beneficial for host capacity calculation
    2. LibvirtServerDiscoverer - existing hosts for the cluster - beneficial for host joins
    3. DownloadListener - hypervisors for the zone - beneficial for host joins
    4. VirtualMachineManagerImpl - VMs in progress - beneficial for processing stalled VMs during PingRoutingCommands

- Optimized MS list retrieval for agent connect

- Optimized finding a ready system VM template for a zone

- Database retrieval optimisations - fixed and refactored cases where only IDs or counts are needed, mainly for hosts and other infra entities; also similar cases for VMs and other host-related entities used by background tasks

- Changes in the agent-AgentManager connection using the NIO client-server classes

    1. Optimized the use of the executor service
    2. Refactored the Agent class to better handle connections
    3. Perform SSL handshakes within worker threads
    4. Added global configs to control the behaviour depending on the infra, since the SSL handshake can be a bottleneck during agent connections. The configs `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` control the number of new connections the management server handles at a time, and `agent.ssl.handshake.timeout` sets the number of seconds after which an SSL handshake times out at the MS end.
    5. On the agent side, backoff and SSL handshake timeout can be controlled with the agent properties `backoff.seconds` and `ssl.handshake.timeout`.

- Improvements in StatsCollection - minimize DB retrievals.

- Improvements in DeploymentPlanner - allow retrieval of only the desired host fields, with fewer retrievals.

- Improvements in connecting hosts to a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads used to connect hosts to a storage pool. The worker-thread approach is currently used only for NFS and ScaleIO pools.

- Minor improvements in resource limit calculations with respect to DB retrievals
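
A minimal sketch of the Caffeine-based caching referenced above, assuming a hypothetical `checkAccessInDb` helper and a 60-second expire-after-write period; the PR's actual cache wiring may differ:

```java
import java.time.Duration;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class ApiAccessCacheSketch {
    // Expire-after-write period, analogous to `dynamic.apichecker.cache.period` (in seconds).
    private final Cache<String, Boolean> apiAccessCache = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofSeconds(60))
            .maximumSize(10_000)
            .build();

    public boolean isApiAllowed(String roleAndApiKey) {
        // Computed on a miss; entries are evicted 60 seconds after being written.
        return apiAccessCache.get(roleAndApiKey, this::checkAccessInDb);
    }

    private boolean checkAccessInDb(String key) {
        return true; // placeholder for the real DB-backed role/API check
    }
}
```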

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* test1, domaindetails, capacitymanager fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* test2 - agent tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* capacitymanagertest fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* change

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix missing changes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert marvin/setup.py

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix indent

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* use space in sql

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address duplicate

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* update host logs

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert e36c6a5d07

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix npe in capacity calculation

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* move schema changes to 4.20.1 upgrade

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix build

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add some more tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* checkstyle fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* remove unnecessary mocks

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* replace statics

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* engine/orchestration,utils: limit number of concurrent new agent
connections

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor - remove unused

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* unregister closed connections, monitor & cleanup

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add check for outdated vm filter in power sync

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* agent: synchronize sendRequest wait

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-02-01 12:28:41 +05:30
Vishesh
a4224e58cc
Improve logging to include more identifiable information (#9873)
* Improve logging to include more identifiable information for kvm plugin

* Update logging for scaleio plugin

* Improve logging to include more identifiable information for default volume storage plugin

* Improve logging to include more identifiable information for agent managers

* Improve logging to include more identifiable information for Listeners

* Replace ids with objects or uuids


* Improve logging to include more identifiable information for engine

* Improve logging to include more identifiable information for server

* Fixups in engine

* Improve logging to include more identifiable information for plugins

* Improve logging to include more identifiable information for Cmd classes

* Fix toString method for StorageFilterTO.java
2025-01-06 16:42:37 +05:30
Fabricio Duarte
2c412f8947
Adjust misformatted logs (#9889)
* Adjust a Quota balance calculation log

* Fix another log
2024-11-28 14:43:26 -03:00
Nicolas Vazquez
8c8d115a1e
feature: Support Multi-arch Zones (#9619)
This introduces multi-arch zones, allowing users to select the VM arch upon deployment.

Multi-arch zone support in CloudStack can allow admins to mix x86_64 & arm64 hosts within the same zone with the following changes proposed:
- All hosts in a cluster need to be homogeneous with respect to host CPU type (amd64 vs arm64) and hypervisor
- Arch-aware templates & ISOs:
   -  Add support for a new arch field (default set of: amd64 and arm64); defaults to amd64 when unspecified and for existing templates & ISOs
   -  Allow admins to edit the arch type of a registered template & ISO
- Arch-aware clusters and host:
   - Add a new arch attribute for clusters and hosts (KVM host agents can report this automatically; the arch of the first host in a cluster becomes the cluster's architecture); defaults to amd64 when not specified
   - Allow admins to edit the arch of an existing cluster
- VM deployment form (UI):
   - In a multi-arch zone/env, the VM deployment form allows template/ISO filtering in the UI
   - Users can select the arch (amd64 or arm64); this is shown only in a multi-arch zone (env)
- VM orchestration and lifecycle operations:
   - Use the VM/template arch to decide where to provision the VM (strictly on arch-matching hosts/clusters) and for other lifecycle operations (such as migration to/from arch-matching hosts); see the sketch below
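
A hedged sketch of the strict arch-matching idea; the types and names below are illustrative, not CloudStack's actual classes:

```java
import java.util.List;
import java.util.stream.Collectors;

public class ArchMatchingSketch {
    enum Arch { AMD64, ARM64 }

    record Host(String name, Arch arch) {}

    // Filter out hosts whose arch does not match the template's arch;
    // an unspecified template arch defaults to amd64, per the description above.
    static List<Host> filterByArch(List<Host> candidates, Arch templateArch) {
        final Arch required = (templateArch != null) ? templateArch : Arch.AMD64;
        return candidates.stream()
                .filter(h -> h.arch() == required)
                .collect(Collectors.toList());
    }
}
```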

Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-09-06 12:14:54 +05:30
Vishesh
0ec7c72875
Merge branch '4.19' 2024-07-01 12:41:45 +05:30
Suresh Kumar Anaparti
2ca1b474bd
PowerFlex/ScaleIO SDC client connection improvements (#9268)
* Mitigation for non-scalable PowerFlex/ScaleIO clients
- Added ScaleIOSDCManager to manage SDC connections: check the client limit, and prepare/unprepare the SDC on the hosts.
- Added commands to prepare and unprepare storage clients, which prepare/start and stop the SDC service respectively on the hosts.
- Introduced the config 'storage.pool.connected.clients.limit' at the storage level for client limits, currently supported for PowerFlex only.
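
A rough sketch of the client-limit guard described above, with hypothetical names; the real ScaleIOSDCManager logic is more involved:

```java
public class SdcLimitSketch {
    // Value of `storage.pool.connected.clients.limit` for the pool.
    private final int connectedClientsLimit;

    public SdcLimitSketch(int connectedClientsLimit) {
        this.connectedClientsLimit = connectedClientsLimit;
    }

    // Assumption here: a limit of zero or less means "no limit enforced".
    public boolean canPrepareSdc(int currentlyConnectedClients) {
        return connectedClientsLimit <= 0 || currentlyConnectedClients < connectedClientsLimit;
    }
}
```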

* tests issue fixed

* refactor / improvements

* lock with powerflex systemid while checking connections limit

* updated powerflex systemid lock to hold till sdc preparation

* Added custom stats support for storage pool, through listStoragePools API

* code improvements, and unit tests

* unit tests fixes

* Update config 'storage.pool.connected.clients.limit' to dynamic, and some improvements

* Stop SDC on host after migration if no volumes mapped to host

* Wait for SDC to connect after scini service start, and some log improvements

* Do not throw exception (log it) when SDC is not connected while revoking access for the powerflex volume

* some log improvements
2024-06-29 10:01:50 +05:30
Vishesh
f6ceeab3b3
server: Enforce strict host tag check (#9017)
Documentation PR: https://github.com/apache/cloudstack-documentation/pull/398

Currently, an administrator can break host tag compatibility for a VM through certain operations:
* deploy/start VM on a specific host
* migrate VM
* restore VM
* scale VM

This PR allows the user to specify tags which must be checked during these operations.

Global Settings
1. `vm.strict.host.tags` - A comma-separated list of tags for strict host check (Default - empty)
2. `vm.strict.resource.limit.host.tag.check` - Determines whether the resource limits tags are considered strict or not (Default - true)

During the above operations, we now check and throw an error if host tag compatibility would be broken for tags specified in `vm.strict.host.tags`. If `vm.strict.resource.limit.host.tag.check` is set to `true`, tags set in `resource.limit.host.tags` are also checked during these operations.
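
A simplified sketch of that check, assuming hypothetical inputs (the offering's host tags and the destination host's tags); the PR's actual validation covers the deploy, migrate, restore, and scale paths:

```java
import java.util.Set;

public class StrictHostTagSketch {
    // strictTags: tags listed in `vm.strict.host.tags`.
    static void checkStrictHostTags(Set<String> strictTags, Set<String> offeringTags,
            Set<String> destinationHostTags) {
        for (String tag : strictTags) {
            if (offeringTags.contains(tag) && !destinationHostTags.contains(tag)) {
                throw new IllegalStateException(
                        "Operation would break host tag compatibility for strict tag: " + tag);
            }
        }
    }
}
```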
2024-06-25 14:42:17 +05:30
Daan Hoogland
373f017002 Merge branch '4.19' 2024-06-18 19:58:43 +02:00
Daan Hoogland
050ee44137 Merge branch '4.18' into 4.19 2024-06-18 16:05:45 +02:00
Bryan Lima
00fe25ab01
Fix allocation of VMs with multiple clusters (#8611)
* Fix allocation of VMs with multiple clusters

* Readd debug guard
2024-06-14 13:54:01 +03:00
Suresh Kumar Anaparti
6fda757936
While starting VM with 'considerlasthost' enabled, don't load host tags/details for the last host when it doesn't exist [main] (#9063) 2024-06-12 17:03:18 +02:00
Suresh Kumar Anaparti
4e7c6682fd
While starting VM with considerlasthost enabled, don't load host tags/details for the last host when it doesn't exist (#9037) 2024-06-12 07:49:03 +02:00
SadiJr
6f27b1f459
Improve logs when adding components to avoid set (#7214)
Co-authored-by: SadiJr <sadi@scclouds.com.br>
Co-authored-by: GaOrtiga <49285692+GaOrtiga@users.noreply.github.com>
Co-authored-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-02-28 08:49:10 +01:00
Abhishek Kumar
592038a304
api,server,ui: granular resource limit management (#8362)
Feature spec: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Granular+Resource+Limit+Management

Introduces the concept of tagged resource limits for granular resource limit management. Limits can be enforced on accounts and domains for the deployment of entities for a tagged resource. Currently, tagged resource limits can be used for the following resource types:

Host limits
- user_vm
- cpu
- memory

Storage limits
- volume
- primary_storage

The following global settings can be used to specify tags for which limits need to be enforced:

Host: `resource.limit.host.tags`
Storage: `resource.limit.storage.tags`

Options for specifying tagged resource limits and viewing tagged resource usage are available in the UI.

Enhances the use of templatetag for VM deployment and template creation

Adds option to list service/compute offerings that can be used with a given template. A new parameter named templateid has been added.

Adds option to list disk offerings with a suitability flag for a virtual machine. A new parameter named virtualmachineid has been added to the listDiskOfferings API; when passed, the response includes the suitableforvirtualmachine param.
2024-02-19 14:17:34 +05:30
João Jandre
49cecaed06
Normalize loggers and upgrade log4j 1.2 to log4j 2.19 (#7131)
* Normalize logs

All classes that could inherit their loggers from their parent classes had their own loggers deleted;
Most loggers didn't have to be static, so most of them were normalized so that they wouldn't be;
All loggers are protected now;
Static loggers are now named 'LOGGER';
Non-static loggers are now named 'logger';
New class DbUpgradeAbstractImpl created so that all Upgraders extend it and inherit its logger

* Upgrade log4j

* fix errors caused by the merge

* Refactor cglibThrowableRenderer functionality to log4j2 and upgrade the last configuration files

* fix sonarcloud bug

* Fix errors caused by merge, remove some unused loggers, and rename a variable that was mistakenly renamed on the normalization commit

* Readd snmpTrapAppender, remove TestAppender

* regenerate changes

* refactor last custom appender

* fix systemvm configuration xml

* regenerate changes

* Fix utils pom

* fix some tests

* regenerate changes

* Fix jar being printed on exception

* fix logging in system VMs, fix commands not having log4j2 classpath.

* regenerate changes

* Fix some unwanted renames

* fix end of file

* regenerate changes

* fix merge error

* regenerate changes

* fix tests

* regenerate changes

* readd reload4j to tungsten as juniper depends on it

* regenerate changes

* re-add reload4j dependency to network-contrail, as juniper depends on it

* regenerate changes

* fix typo

* regenerate changes

* Fix end of files

* regenerate changes

* add logj42 to cloud-utils-SHADED.jar

* regenerate changes

* fix some tests

* regenerate changes

* fix test

* regenerate changes
2024-02-08 09:55:41 -03:00
João Jandre
26b01f6f3b
Flexible tags for hosts and storage pools (#7489)
Co-authored-by: João Jandre <joao@scclouds.com.br>
2023-11-30 09:36:47 +01:00
Vishesh
a31f211628
Merge remote-tracking branch 'remote/4.18' 2023-11-29 16:12:51 +05:30
anniejili
3c7c75bacf
Clear pool id if volume allocation fails (#8202)
* clear pool id if volume allocation fails, leaving the volume in the Allocated state rather than with a pool id assigned

* clear_pool_id_if_volume_allocation_fails

---------

Co-authored-by: Annie Li <ji_li@apple.com>
2023-11-21 15:41:04 +05:30
Abhishek Kumar
e234c3ccdc
server: guard vm start inter-cluster migration with config (#7401)
During the start of a stopped VM, when there is not enough capacity in the current cluster, CloudStack can migrate it to a new cluster. This can be an expensive operation when cluster-scope storage is used, as the migration can be carried out using SSVM and secondary storage.
This PR allows controlling this behaviour with the existing global config - `migrate.vm.across.clusters`

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2023-05-08 12:08:57 +05:30
Daan Hoogland
16694d8bec Merge branch '4.17' into 4.18 2023-03-29 17:36:55 +02:00
Harikrishna
9fb20056d5
Fixed avoid set variables which were causing deployment failures (#7372) 2023-03-29 17:14:18 +02:00
Wei Zhou
889045fba5
new plugins: Add non-strict affinity groups (#6845) 2022-12-20 15:09:52 +01:00
Marcus Sorensen
697e12f8f7
kvm: volume encryption feature (#6522)
This PR introduces a feature designed to allow CloudStack to manage a generic volume encryption setting. The encryption is handled transparently to the guest OS, and is intended to handle VM guest data encryption at rest and possibly over the wire, though the actual encryption implementation is up to the primary storage driver.

In some cases cloud customers may still prefer to maintain their own guest-level volume encryption, if they don't trust the cloud provider. However, for private cloud cases this greatly simplifies the guest OS experience in terms of running volume encryption for guests without the user having to manage keys, deal with key servers and guest booting being dependent on network connectivity to them (i.e. Tang), etc, especially in cases where users are attaching/detaching data disks and moving them between VMs occasionally.

The feature can be thought of as having two parts - the API/control plane (which includes scheduling aspects), and the storage driver implementation.

This initial PR adds the encryption setting to disk offerings and service offerings (for root volume), and implements encryption support for KVM SharedMountPoint, NFS, Local, and ScaleIO storage pools.

NOTE: While not required, operations can be significantly sped up by ensuring that hosts have the `rng-tools` package and service installed and running on the management server and hypervisors. For EL hosts the service is `rngd` and for Debian it is `rng-tools`. In particular, the use of SecureRandom for generating volume passphrases can be slow if there isn't a good source of entropy. This could affect testing and build environments, and otherwise would only affect users who actually use the encryption feature. If you find tests or volume creates blocking on encryption, check this first.

### Management Server

##### API

* createDiskOffering now has an 'encrypt' Boolean
* createServiceOffering now has an 'encryptroot' Boolean. The 'root' suffix is added here in case there is ever any other need to encrypt something related to the guest configuration, like the RAM of a VM.  This has been refactored to deal with the new separation of service offering from disk offering internally.
* listDiskOfferings shows encryption support on each offering, and has an encrypt boolean to choose to list only offerings that do or do not support encryption
* listServiceOfferings shows encryption support on each offering, and has an encrypt boolean to choose to list only offerings that do or do not support encryption
* listHosts now shows encryption support of each hypervisor host via `encryptionsupported`
* Volumes themselves don't show encryption on/off, rather the offering should be referenced. This follows the same pattern as other disk offering based settings such as the IOPS of the volume.

##### Volume functions

A decent effort has been made to ensure that the most common volume functions have either been cleanly supported or blocked. However, for the first release it is advised to mark this feature as *experimental*, as the code base is complex and there are certainly edge cases to be found.

Many of these features could eventually be supported over time, such as creating templates from encrypted volumes, but the effort and size of the change is already overwhelming.

Supported functions:
* Data Volume create
* VM root volume create
* VM root volume reinstall
* Offline volume snapshot/restore
* Migration of VM with storage (e.g. local storage VM migration)
* Resize volume
* Detach/attach volume

Blocked functions:
* Online volume snapshot
* VM snapshot w/memory
* Scheduled snapshots (would fail when VM is running)
* Disk offering migration to offerings that don't have matching encryption
* Creating template from encrypted volume
* Creating volume from encrypted volume
* Volume extraction (would we decrypt it first, or expose the key? Probably the former).

##### Primary Storage Support

For storage developers, adding encryption support involves:

1. Updating the `StoragePoolType` for your primary storage to advertise encryption support. This is used during storage allocation to match volumes that request encryption to storage that supports it.

2. Implementing encryption feature when your `PrimaryDataStoreDriver` is called to perform volume lifecycle functions on volumes that are requesting encryption. You are free to do what your storage supports - this could be as simple as calling a storage API with the right flag when creating a volume. Or (as is the case with the KVM storage types), as complex as managing volume details directly at the hypervisor host. The data objects passed to the storage driver will contain volume passphrases, if encryption is requested.

##### Scheduling

For the KVM implementations specified above, we are dependent on the KVM hosts having support for volume encryption tools. As such, the host's `StartupRoutingCommand` has been modified to advertise whether the host supports encryption. This is done via a probe during agent startup that looks for a functioning `cryptsetup` and support in `qemu-img`. This is also visible via the listHosts API and the host details in the UI. This was patterned after other features that require hypervisor support, such as UEFI.

The `EndPointSelector` interface and `DefaultEndpointSelector` have had new methods added, which allow the caller to ask for endpoints that support encryption.  This can be used by storage drivers to find the proper hosts to send storage commands that involve encryption. Not all volume activities will require a host to support encryption (for example a snapshot backup is a simple file copy), and this is the reason why the interface has been modified to allow for the storage driver to decide, rather than just passing the data objects to the EndpointSelector and letting the implementation decide.

VM scheduling has also been modified. When a VM start is requested, if any volume that requires encryption is attached, it will filter out hosts that don't support encryption.

##### DB Changes

A volume whose disk offering enables encryption will get a passphrase generated for it before its first use. This is stored in the new 'passphrase' table, and is encrypted using the CloudStack installation's standard configured DB encryption. A field has been added to the volumes table, referencing this passphrase, and a foreign key added to ensure passphrases that are referenced can't be removed from the database.  The volumes table now also contains an encryption format field, which is set by the implementer of the encryption and used as it sees fit.

#### KVM Agent

For the KVM storage pool types supported, the encryption has been implemented at Qemu itself, using the built-in LUKS storage support. This means that the storage remains encrypted all the way to the VM process, and decrypted before the block device is visible to the guest.  This may not be necessary in order to implement encryption for /your/ storage pool type, maybe you have a kernel driver that decrypts before the block device on the system, or something like that. However, it seemed like the simplest, common place to terminate the encryption, and provides the lowest surface area for decrypted guest data.

For qcow2 based storage, `qemu-img` is used to set up a qcow2 file with LUKS encryption. For block based (currently just ScaleIO storage), the `cryptsetup` utility is used to format the block device as LUKS for data disks, but `qemu-img` and its LUKS support is used for template copy.

Any volume that requires encryption will contain a passphrase ID as a byte array when handed down to the KVM agent. Care has been taken to ensure this doesn't get logged, and it is cleared after use in an attempt to avoid exposing it before garbage collection occurs. On the agent side, this passphrase is used in two ways:

1. In cases where the volume experiences some libvirt interaction it is loaded into libvirt as an ephemeral, private secret and then referenced by secret UUID in any libvirt XML. This applies to things like VM startup, migration preparation, etc.

2. In cases where `qemu-img` needs to use this passphrase for volume operations, it is written to a `KeyFile` on the CloudStack agent's configured tmpfs and passed along. The `KeyFile` is a `Closeable`, and when it is closed, it is deleted. This allows us to wrap volume operations in try-with-resources and have the KeyFile removed regardless; see the sketch below.
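
A minimal sketch of the `KeyFile` idea, simplified relative to the agent's actual class: a `Closeable` that writes the passphrase to the configured tmpfs and deletes it on close, so callers can use try-with-resources:

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class KeyFileSketch implements Closeable {
    private final Path path;

    public KeyFileSketch(Path tmpfsDir, byte[] passphrase) throws IOException {
        // Written under the agent's configured tmpfs so the key never hits persistent disk.
        this.path = Files.createTempFile(tmpfsDir, "keyfile", null);
        Files.write(path, passphrase);
    }

    public Path getPath() {
        return path;
    }

    @Override
    public void close() throws IOException {
        // Deleting on close means try-with-resources removes the key file regardless of outcome.
        Files.deleteIfExists(path);
    }
}
```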

In order to support the advanced syntax required to handle encryption and passphrases with `qemu-img`, the `QemuImg` utility has been modified to support the new `--object` and `--image-opts` flags. These are modeled as `QemuObject` and `QemuImageOptions`.  These `qemu-img` flags have been designed to supersede some of the existing, older flags being used today (such as choosing file formats and paths), and an effort could be made to switch over to these wholesale. However, for now we have instead opted to keep existing functions and do some wrapping to ensure backward compatibility, so callers of `QemuImg` can choose to use either way.
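
For illustration, the `--object`/`--image-opts` style that `QemuObject` and `QemuImageOptions` model corresponds roughly to an invocation like the following (flag syntax per qemu-img's documented LUKS support; the exact arguments CloudStack builds may differ):

```java
// Sketch of the argument shape only; not the actual QemuImg wrapper output.
String[] createEncryptedQcow2 = {
    "qemu-img", "create",
    "--object", "secret,id=sec0,file=/run/cloud/keyfile", // passphrase supplied via a KeyFile
    "-f", "qcow2",
    "-o", "encrypt.format=luks,encrypt.key-secret=sec0",  // qcow2 LUKS creation options
    "/var/lib/libvirt/images/vol.qcow2", "10G"
};
```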

It should be noted that there are also a few different Enums that represent the encryption format for various purposes. While these are analogous in principle, they represent different things and should not be confused. For example, the supported encryption format strings for the `cryptsetup` utility include `LuksType.LUKS`, while `QemuImg` has a `QemuImg.PhysicalDiskFormat.LUKS`.

Some additional effort could potentially be made to support advanced encryption configurations, such as choosing between LUKS1 and LUKS2 or changing cipher details. These may require changes all the way up through the control plane. However, in practice Libvirt and Qemu only support LUKS1 today. Additionally, the cipher details aren't required in order to use an encrypted volume; as they're stored in the LUKS header on the volume, there is no need to store them elsewhere. As such, we need only set the one encryption format upon volume creation, which is persisted in the volumes table and then available later as needed. In the future, when LUKS2 is standard and fully supported, we could move to it as the default; old volumes will still reference LUKS1 and have the headers on-disk to ensure they remain usable. We could also possibly support an automatic upgrade of the headers down the road, or a volume migration mechanism.

Every version of cryptsetup and qemu-img tested on variants of EL7 and Ubuntu that support encryption use the XTS-AES 256 cipher, which is the leading industry standard and widely used cipher today (e.g. BitLocker and FileVault).

Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
2022-09-27 10:20:59 +05:30
John Bampton
f9347ecf2c
Fix spelling (#6597) 2022-08-03 15:43:47 +05:30
John Bampton
7d23a0a759
Fix spelling (#6272) 2022-07-05 09:08:53 +02:00
Nicolas Vazquez
82e0d5d679
Fix UEFI detection on KVM and prevent deployments on non UEFI enabled hosts (#6423)
* Do not allow UEFI deployments on non UEFI enabled hosts

* Fix UEFI detection on KVM

* Refactor

* Improvement
2022-05-31 14:31:42 -03:00
Suresh Kumar Anaparti
545f85936a
Merge branch '4.16' into main 2022-02-17 14:28:26 +05:30
Wei Zhou
c543f5f546
server: reapply checkVmProfileAndHost to check guest os preference (#6000) 2022-02-17 14:25:13 +05:30
Suresh Kumar Anaparti
208ae84dd7
Merge branch '4.16' into main 2022-02-08 19:01:34 +05:30
dahn
0f1cd6009d
add logging to deployment planners (#5859)
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>

Co-authored-by: Daan Hoogland <dahn@onecht.net>
Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
2022-02-04 17:02:32 +01:00
Harikrishna
f15cab16da
server: Decouple service (compute) offering and disk offering (#5008)
Currently, our compute offerings and disk offerings are tightly coupled in many aspects. For example, if a compute offering is created, a corresponding disk offering entry is also created with the same ID as the reference. Creating a compute offering also takes a few disk-related parameters which in any case go only to the corresponding disk offering. This design was likely made initially to address the compute offering for the root volume created from a template. Also, changing the offering of a volume is tightly coupled with storage tags and has to be done in different APIs, either migrateVolume or resizeVolume. Changing a disk offering should be seamless: it should consider new storage tags and new size, and place the volume in the appropriate state as defined in the disk offering.

more details are mentioned here https://cwiki.apache.org/confluence/display/CLOUDSTACK/Compute+offering+and+disk+offering+refactoring

* Schema changes and disk offering column change from "type" to "compute_only"

* Few more changes

* Decoupled service offering and disk offering

* Remove diskofferingid from vminstance VO

* Decouple service offering and disk offering states

* diskoffering getsize() is only for strict disk offerings

* Fix deployVM flow

* Added new API params to compute offering creation

* Add diskofferingstrictness to serviceoffering vo under quota

* Added overrideDiskOfferingId parameter in deploy VM API which will override disk offering for the root disk both in template and ISO case

Added diskSizeStrictness parameter in create Disk offering API which will decide whether to restrict resize or disk offering change of a volume

* Fix User vm response to show proper service offering and disk offerings

* Added disk size strictness in disk offering response

* Added disk offering strictness to the service offering response

* Remove comments

* Added UI changes for Disk offering strictness in add compute offering form and Disk size strictness in add disk offering form

* Added diskoffering details to the service offering response

* Added UI changes in deployvm wizard to accept override disk offering id

* Fix delete compute offering

* Fix VM deployment from custom service offering

* Move uselocalstorage column access from service offering to disk offering

* UI: Separated compute and disk releated parameters in add compute offering wizard, also added association to disk offering

* Fixed diskoffering automatic selection on add compute offering wizard

* UI: move compute only toggle button outside the box in add compute offering wizard

* Added volumeId parameter to listDiskOfferings API and the disksizestrictness flag of the current disk offering is honored while list disk offerings

* Added configuration parameter to decide whether to check volume tags on the destination storagepool during migration

* Added disk offering change checks during resize volume operation

* Added new API changeofferingforVolume API and corresponding changes

* Add UI form for changeOfferingForVolume API

* Fix UI conflicts

* Fix service offering usage as disk offering

* Fix unit test failures

* fix user_vm_view

* Addressed review comments

* Fixed service_offering_view

* Fix service offering edit flow

* Fix service offering constructor to address custom offering

* Fix domain_router_view to get proper service offering id

* Removed unused import

* Addressed review comments and fixed update service offering flow with storage tags

* Added marvin test cases for checking disk offering strictness

* review comments addressed

* Remove system_use column from disk offering join

* update volume_view to update system_use column from service offering and not disk offering

* Fix changeOfferingForVolume API for custom disk offering

* Fix global setting implementation

* Fix list volumes, after changing system_use column from disk offering to service offering in volume_view

* Changes for override root disk offering in deployvm wizard in case of custom offering

* Fix a unit test case

* Fixed recent unit test cases with new serviceofferingvo constructor

* Fix unit test in VolumeApiServiceImpl

* Added storage id for the list disk offering API and corresponding UI changes in migrateVolume and changeOfferingForVolume flow

* Rename global configuration parameter from storage.pool.tags.disk.offering.strictness to match.storage.pool.tags.with.disk.offering

* Fix smoke test failures

* Added tool tip for migrate volume UI form

* Address review comments and fix UI form of deploy VM in case of ISO.

* Fixed resize volume UI form for data disk

* UI changes to disable override root disk size when override root disk offering is enabled

* UI fix in deploy vm wizard

* Fix listdiskoffering after rebasing with main

* Fixed UI in migrate and changeofferingfor volume to handle empty disk offering list
Removed the volume's current disk offering from listDiskOffering response list

* Added custom Iops to resize volume form and removed the current disk offering during change offering for volume UI form

* Fix false response on updateDiskOffering API

* Added search field for changeofferingforvolume UI form

* Fix resize volume and migrate volume to update volume path if DRS is applied on volume in datastore cluster

* Removed DB changes from 4.16 upgrade file

* Resolving merge conflicts with main 4.17

* Added support for auto migration and auto resize of the root volume upon changing the service offering for VM.

* UI: Added automigrate checkbox in scale VM form

* Added 'since' attributes to new API params

* Added shrinkOK parameter to changeofferingforvolume API

* Added shrinkOk param to UI in changeOfferingforVolume form

* Added shrinkOk flag to scaleVM and changeServiceForVirtualMachines and UI form

* Removed old foreign key constraint on IDs of service offering and disk offering

* Allow resize and automigrate of root volume if required in all cases of service offering change

* Allow only resize to higher disk size from UI

* Fixing vue syntax error

* Make UI changes to provide root disk size box when the linked disk offering is of custom

* Converted from check box to toggle in scale VM, changeoffering, resize and migrate volume forms

* Fix resize volume operation to update the VM settings

* Fix migratevolume form to pick selected storage pool id in list diskofferings API
2022-01-27 15:08:42 +05:30
Daniel Augusto Veronezi Salvador
b4aabadc4d
Replace string libraries with org.apache.commons.lang3.StringUtils (#5386)
* Replace Google lib with lang3 and adjust method calls

* Replace string libs with lang3

* Prohibit other string libs

Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
2021-11-18 13:41:48 +05:30
Wei Zhou
669ab73efe
server: check service offering (storage) tags when reallocate a ROOT disk (#5501)
* server: check service offering (storage) tags when reallocate a ROOT disk

* server: resize volumes in Allocated state
2021-10-03 19:45:59 -03:00
DK101010
664a46a525
PR multi tags in compute offering [#4398] (#4399)
* [#4398] adapt code to handle multi tag string with commas

* [#4398] remove trailing spaces

* [#4398] add multi host tag support for ingest process

* [#4398] add test for multi tag support in offerings

* [#4398]  update multitag support for DeploymentPlanningManagerImpl

encapsulate the multi tag check from the Ingest feature and DeploymentPlanningManager into
HostDaoImpl to prevent code duplication

* [#4398] move logic to HostVO and add tests

* rename test method

* [#4398] Change string method to apaches StringUtils

* [#4398] modify test for multi tag support

* adapt sql for double tags

Co-authored-by: Dirk Klahre <Dirk.Klahre@Itelligence.de>
2021-08-16 12:08:40 -03:00
Rakesh
2a4c2c2506
Global setting to select preferred storage pool (#5249)
* Global setting to select preferred storage pool

Currently all the volumes are allocated on storage pools based on the capacity or the algorithm selected. Sometimes we need to deploy all volumes of a particular account in a specific storage pool, and in that case it's not possible.

With this change, we can specify the uuid of the preferred storage pool, so that all volumes of the account will be deployed in this pool

* code feedback

Co-authored-by: Rakesh Venkatesh <rakeshv@apache.org>
2021-08-12 00:01:15 -03:00
Abhishek Kumar
50a16979c5
refactor: migrate vm with storage (#5030)
* refactor: migrate with storage host capability check

Refactors Boolean HypervisorCapabilitiesDao::isStorageMotionSupported to boolean HypervisorCapabilitiesDao::isStorageMotionSupported to simplify callers; see the sketch below.
Refactors log messages.
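
Illustratively, the unboxing removes the null-check burden from call sites (a sketch with a stand-in DAO interface, not the real one):

```java
public class StorageMotionCheckSketch {
    interface CapabilitiesDao {
        // Returning a primitive boolean means a missing capability row
        // simply resolves to false inside the DAO.
        boolean isStorageMotionSupported(String hypervisorType, String version);
    }

    static boolean canMigrateWithStorage(CapabilitiesDao dao, String type, String version) {
        // No Boolean.TRUE.equals(...) dance needed at the call site.
        return dao.isStorageMotionSupported(type, version);
    }
}
```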

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* simplify

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* changes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* review comments addressed

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* var rename

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2021-07-15 12:57:13 +05:30
Gabriel Beims Bräscher
a3cdd1f836
Allow deploy Admin VMs and VRs in disabled zones/pods/clusters (#3600) 2021-05-28 10:45:30 +02:00
sureshanaparti
eba186aa40
storage: New Dell EMC PowerFlex Plugin (formerly ScaleIO, VxFlexOS) (#4304)
Added support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack (for KVM hypervisor) and enabled VM/Volume operations on that pool (using pool tag).
Please find more details in the FS here:
https://cwiki.apache.org/confluence/x/cDl4CQ

Documentation PR: apache/cloudstack-documentation#169

This enables support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack

Other improvements addressed in addition to PowerFlex/ScaleIO support:

- Added support for config drives in host cache for KVM
	=> Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
	=> Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
	=> Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
	=> Added new parameter "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path; config drives are created in the "/config" directory on the host cache path
	=> Maintain the config drive location and use it when required on any config drive operation (migrate, delete)

- Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates

- Updated full deployment destination for preparing the network(s) on VM start

- Propagate the direct download certificates uploaded to the newly added KVM hosts

- Discover the template size for direct download templates using any available host from the zones specified on template registration
	=> When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones

- Release the VM resources when the VM is synced to the Stopped state on PowerReportMissing (after the grace period)

- Retry VM deployment/start when the host cannot grant access to volume/template

- Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
	=> Do not trigger any DeleteCommand for never-used or downloaded templates, as these don't exist on the datastore and cannot be deleted from it

- Check whether the router filesystem is writable, before performing health checks
	=> Introduce a new test "filesystem.writable.test" to check whether the filesystem is writable
	=> The router health checks keep the config info at "/var/cache/cloud" and update the monitor results at "/root"; these are different partitions, so test at both locations
	=> Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check whether the filesystem is writable

- Fixed NPE issue, template is null for DATA disks. Copy template to target storage for ROOT disk (with template id), skip DATA disk(s)

* Addressed some issues for few operations on PowerFlex storage pool.

- Updated migration volume operation to sync the status and wait for migration to complete.

- Updated VM Snapshot naming, for uniqueness in ScaleIO volume name when more than one volume exists in the VM.

- Added sync lock while spooling managed storage template before volume creation from the template (non-direct download).

- Updated resize volume error message string.

- Blocked the below operations on PowerFlex storage pool:
  -> Extract Volume
  -> Create Snapshot for VMSnapshot

* Added the PowerFlex/ScaleIO client connection pool to manage the ScaleIO gateway clients, which uses a single gateway client per Powerflex/ScaleIO storage pool and renews it when the session token expires.

- The token is valid for 8 hours from the time it was created, unless there has been no activity for 10 minutes.
  Reference: https://cpsdocs.dellemc.com/bundle/PF_REST_API_RG/page/GUID-92430F19-9F44-42B6-B898-87D5307AE59B.html

Other fixes included:

- Fail the VM deployment when the host specified in the deployVirtualMachine cmd is not in the right state (i.e. either Resource State is not Enabled or Status is not Up)

- Use the physical file size of the template to check the free space availability on the host, while downloading the direct download templates.

- Perform basic tests (for connectivity and file system) on router before updating the health check config data
	=> Validate the basic tests (connectivity and file system check) on router
	=> Cleanup the health check results when router is destroyed

* Updated PowerFlex/ScaleIO storage plugin version to 4.16.0.0

* UI Changes to support storage plugin for PowerFlex/ScaleIO storage pool.
- PowerFlex pool URL generated from the UI inputs(Gateway, Username, Password, Storage Pool) when adding "PowerFlex" Primary Storage
- Updated protocol to "custom" for PowerFlex provider
- Allow VM Snapshot for stopped VM on KVM hypervisor and PowerFlex/ScaleIO storage pool

and Minor improvements in PowerFlex/ScaleIO storage plugin code

* Added support for PowerFlex/ScaleIO volume migration across different PowerFlex storage instances.

- findStoragePoolsForMigration API returns PowerFlex pool(s) of different instance as suitable pool(s), for volume(s) on PowerFlex storage pool.
- Volume(s) with snapshots are not allowed to migrate to different PowerFlex instance.
- Volume(s) of running VM are not allowed to migrate to other PowerFlex storage pools.
- Volume migration from PowerFlex pool to Non-PowerFlex pool, and vice versa are not supported.

* Fixed change service offering smoke tests in test_service_offerings.py, test_vm_snapshots.py

* Added the PowerFlex/ScaleIO volume/snapshot name to the paths of respective CloudStack resources (Templates, Volumes, Snapshots and VM Snapshots)

* Added new response parameter "supportsStorageSnapshot" (true/false) to the volume response, and updated the UI to hide the async backup option while taking a snapshot for volume(s) with storage snapshot support.

* Fix to remove the duplicate zone wide pools listed while finding storage pools for migration

* Updated PowerFlex/ScaleIO volume migration checks and rollback migration on failure

* Fixed the PowerFlex/ScaleIO volume name inconsistency issue in the volume path after migration, due to rename failure
2021-02-24 14:58:33 +05:30
nvazquez
950292dcb0 Ensure deploy as is disks get allocated to the same storage pool 2020-10-19 15:05:58 +05:30
nvazquez
94bebe8792 Revert back deploy as is column on templates but keep it as default for new templates 2020-10-19 15:05:57 +05:30
nvazquez
9b51a706db Set deploy-as-is to default on VMware 2020-10-19 15:05:57 +05:30
nvazquez
32d85b0fa2 Display storage on logging when not deploy-as-is and guest OS small refactor 2020-10-19 15:05:57 +05:30
Harikrishna Patnala
fb0a96e7fb Check if datastore is complaince with the storagepolicy provided in the disk offering.
Added corresponding manager objects from PBM sdk to do the job.
Made dao layer changes to read the storage policy in diskoffering
2020-10-19 14:57:15 +05:30
Rohit Yadav
b3bafffff3 Merge remote-tracking branch 'origin/4.14' 2020-09-29 14:33:58 +05:30
Wei Zhou
98c51a6d3d
server: check guest os preference of last host when start a vm (#4338)
If a VM has a last host_id specified, CloudStack will try to start the VM on it first.
However, the host tag is checked, but the guest OS preference is not.

For a new VM, it will be deployed to the preferred host as we expect.

Fixes: #3554 (comment)
2020-09-29 12:45:29 +05:30
Spaceman1984
b586eb22f1
Human readable sizes in logs (#4207)
This PR adds human readable byte sizes to the management server logs, agent logs, and usage records. A non-dynamic global setting (display.human.readable.sizes) is added to switch this feature on and off. This setting is sent to the agent on connection and is only read from the database when the management server starts up. The setting is kept in memory via a static field on the NumbersUtil class and is available throughout the codebase.

Instead of seeing things like:
2020-07-23 15:31:58,593 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) (logid:) Seq 8-1863645820801253428: Processing: { Ans: , MgmtId: 52238089807, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-224-VM","bytesSent":"106496","bytesReceived":"0","result":"true","details":"","wait":"0",}}] }

The KB, MB and GB values will be printed out:

2020-07-23 15:31:58,593 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) (logid:) Seq 8-1863645820801253428: Processing: { Ans: , MgmtId: 52238089807, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-224-VM","bytesSent":"(104.00 KB) 106496","bytesReceived":"(0 bytes) 0","result":"true","details":"","wait":"0",}}] }

FS: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Human+Readable+Byte+sizes
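
A sketch of the kind of formatting involved; the actual NumbersUtil helper may differ in rounding and units:

```java
public class HumanReadableSketch {
    // Converts a raw byte count into the "(104.00 KB) 106496" style shown above.
    static String toHumanReadable(long bytes) {
        if (bytes < 1024) {
            return String.format("(%d bytes) %d", bytes, bytes);
        }
        final String[] units = {"KB", "MB", "GB", "TB"};
        double value = bytes;
        int unit = -1;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format("(%.2f %s) %d", value, units[unit], bytes);
    }
}
```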
2020-08-13 15:55:16 +05:30
Rohit Yadav
b141b8e256 Merge remote-tracking branch 'origin/4.13' into 4.14 2020-07-07 12:51:46 +05:30
Wei Zhou
4da374b6b4
server: Dedicated hosts should be 'Not Suitable' while find hosts for vm migration (#4001)
While migrating a VM, in the popup, hosts dedicated to other accounts/domains are also shown as 'Suitable' for migration, which is obviously wrong.

The same issue happens with the findHostsForMigration API
2020-07-04 11:01:41 +05:30
pavanaravapalli
d4b537efa7
UEFI Implementation: Enabled UEFI support for guest VMs on KVM and VMware hypervisors. Enabled boot modes [Legacy, Secure] for UEFI boot, with known caveats. (#3638)
Co-authored-by: Pavan Kumar Aravapalli <pavan_aravapalli@accelerite.com>
Co-authored-by: dahn <daan.hoogland@shapeblue.com>
2020-03-13 20:56:26 +01:00
Nicolas Vazquez
efe00aa7e0
[KVM] Rolling maintenance (#3610) 2020-03-12 16:59:46 +01:00