1074 Commits

Author SHA1 Message Date
Abhishek Kumar
abbc61c01e
engine-orchestration: expunge destroyed system vm volume (#9197)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-06-13 10:00:22 +02:00
Vishesh
6b4955affe
Fix message publish in transaction (#8980)
* Fix message publish in transaction

* Resolve comments
2024-05-07 13:27:31 +05:30
Vishesh
80a8b80a9d
Update volume's passphrase to null if diskOffering doesn't support encryption (#8904) 2024-04-29 12:18:09 +05:30
SadiJr
96ae479000
[Usage] Create network billing (#7236)
Co-authored-by: Bryan Lima <bryan.lima@hotmail.com>
Co-authored-by: SadiJr <sadi@scclouds.com.br>
Co-authored-by: Bryan Lima <42067040+BryanMLima@users.noreply.github.com>
Co-authored-by: Henrique Sato <henriquesato2003@gmail.com>
2024-04-24 08:52:49 +02:00
Wei Zhou
0b857def68
New feature: Import/Unamange DATA volume from storage pool (#8808) 2024-04-23 16:05:59 +02:00
Vishesh
44aa08c02a
Fixup 4.19 build issue (#8905) 2024-04-12 16:37:25 +02:00
Vishesh
b998e7dbb6
Allow overriding root disk offering & size, and expunge old root disk while restoring a VM (#8800)
* Allow overriding root diskoffering id & size while restoring VM

* UI changes

* Allow expunging of old disk while restoring a VM

* Resolve comments

* Address comments

* Duplicate volume's details while duplicating volume

* Allow setting IOPS for the new volume

* minor cleanup

* fixup

* Add checks for template size

* Replace strings for IOPS with constants

* Fix saveVolumeDetails method

* Fixup

* Fixup UI styling
2024-04-12 17:47:52 +05:30
Vishesh
730cc5d5b8
Change iops on offering change (#8872)
* Change IOPS on disk offering change

* Remove iops & bandwidth limits before copying template

* minor refactor

* Handle diskOfferingDetails

* Fixup
2024-04-11 17:01:55 +05:30
Abhishek Kumar
ffd59720dd
storage,plugins: delegate allow zone-wide volume migration check and access grant check to storage drivers (#8762)
* storage,plugins: delegate allow zone-wide volume migration check and access grant to storage drivers

Following checks have been delegated to storage drivers,
- For volumes on zone-wide storage, whether they need storage migration when VM is migrated
- Whther volume required grant access

Apply fixes in resolving PrimaryDataStore

* add tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* unused import

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* Update engine/orchestration/src/test/java/org/apache/cloudstack/engine/orchestration/VolumeOrchestratorTest.java

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-03-18 17:28:14 +05:30
Harikrishna
c462be1412
New API "checkVolume" to check and repair any leaks or issues reported by qemu-img check (#8577)
* Introduced a new API checkVolumeAndRepair that allows users or admins to check and repair if any leaks observed.
Currently this is supported only for KVM

* some fixes

* Added unit tests

* addressed review comments

* add repair volume while granting access

* Changed repair parameter to accept both leaks/all

* Introduced new global setting volume.check.and.repair.before.use to do volume check and repair before VM start or volume attach operations

* Added volume check and repair changes only during VM start and volume attach operations

* Refactored the names to look similar across the code

* Some code fixes

* remove unused code

* Renamed repair values

* Fixed unit tests

* changed version

* Address review comments

* Code refactored

* used volume name in logs

* Changed the API to Async and the setting scope to storage pool

* Fixed exit value handling with check volume command

* Fixed storage scope to the setting

* Fix volume format issues

* Refactored the log messages

* Fix formatting
2024-02-29 14:41:49 +05:30
Daan Hoogland
f4987bf8ee Merge release branch 4.18 to 4.19
* 4.18:
  Storage plugin support to check if volume on datastore requires access for migration (#8655)
  CKS: fix /opt/bin/deploy-cloudstack-secret in CKS control nodes (#8697)
2024-02-26 15:53:11 +01:00
Suresh Kumar Anaparti
f731fe882c
Storage plugin support to check if volume on datastore requires access for migration (#8655)
* Check if volume on datastore requires access for migration, and grant/revoke volume access if requires

* Updated default implementation for requiresAccessForMigration method in PrimaryDataStoreDriver
2024-02-26 20:16:31 +05:30
Vishesh
1a1131154e
Fixup vm powerstate update (#8545)
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-02-19 13:56:21 +01:00
dahn
a0e592e945
prevent nic removal on out of bounds router stop (#8371)
Co-authored-by: Vishesh <vishesh92@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
2024-02-16 14:33:22 +01:00
Suresh Kumar Anaparti
f702f7f57c
Remove sensitive params (VmPassword, etc) from VMWork log (#8553) 2024-02-05 13:26:18 +05:30
Abhishek Kumar
a7b97ff3b0 Updating pom.xml version numbers for release 4.19.1.0-SNAPSHOT
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-02-02 18:06:04 +05:30
Abhishek Kumar
2746225b99 Updating pom.xml version numbers for release 4.19.0.0
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-01-29 10:21:52 +05:30
Vishesh
fedcf66de0
Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)
This PR fixes bug introduced in #8502. Timeout for script execution was set to 60 ms instead of 60s which resulted in host not getting UEFI enabled. This is a blocker for 4.19 release.

We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status.

We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds).

For ModifyStoragePoolCommand, we don't externalize the timeout to avoid confusion for the user. Since, the required timeout can vary depending on the provider in use and we are only setting the wait for default host listener for now. Instead, we reuse the global `wait` setting by dividing it by `5` making the default value of 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.

Note: the actual time, the MS waits is twice the wait set for a Command. Check reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)
2024-01-27 23:36:13 +05:30
kishankavala
80bbb29abf
CleanUp Async Jobs after mgmt server maintenance (#8394)
This PR fixes moves resources stuck in transition state during async job cleanup

Problem:
During maintenance of the management server, other servers in the cluster or the same server after a restart initiate async job cleanup. However, this process leaves resources in a transitional state. The only recovery option currently available is to make direct database changes.

Solution:
This PR introduces a resolution by changing Volume, Virtual Machine, and Network resources from their transitional states. This adjustment enables the reattempt of failed operations without the need for manual database modifications.
2024-01-19 13:26:25 +05:30
Vishesh
c3b77cb7b8
Fix host stuck in connecting state (#8502)
There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)

While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout).

On the agent side, it gets stuck waiting for a response from the Script execution.

To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query.

SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;

This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.
2024-01-15 13:56:34 +05:30
Abhishek Kumar
3936f7c2cf
vm-import: kvm import and fix volume size when lesser than 1GiB (#8500)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Daan Hoogland <daan@onecht.net>
2024-01-12 13:32:02 +01:00
Nicolas Vazquez
a3a4833c3e
Fixes for KVM unmanaged instances import on advanced network and VNC password (#8492)
This PR fixes a regression caused by #8465 on advanced zones, import fails with:

2024-01-10 12:13:33,234 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-991bbe9f job-128 ctx-f49517d4) (logid:d7b8e716) Allocating nic for vm 142272e8-9e2e-407b-9d7e-e9a03b81653c in network Network {"id": 204, "name": "Isolated", "uuid": "9679fac5-e3ac-4694-a57b-beb635340f39", "networkofferingid": 10} during import
2024-01-10 12:13:33,239 ERROR [o.a.c.v.UnmanagedVMsManagerImpl] (API-Job-Executor-3:ctx-991bbe9f job-128 ctx-f49517d4) (logid:d7b8e716) Failed to import NICs while importing vm: i-2-31-VM
com.cloud.exception.InsufficientVirtualNetworkCapacityException: Unable to acquire Guest IP  address for network Network {"id": 204, "name": "Isolated", "uuid": "9679fac5-e3ac-4694-a57b-beb635340f39", "networkofferingid": 10}Scope=interface com.cloud.dc.DataCenter; id=1
	at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.importNic(NetworkOrchestrator.java:4582)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importNic(UnmanagedVMsManagerImpl.java:859)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importVirtualMachineInternal(UnmanagedVMsManagerImpl.java:1198)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importUnmanagedInstanceFromHypervisor(UnmanagedVMsManagerImpl.java:1511)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.baseImportInstance(UnmanagedVMsManagerImpl.java:1342)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importUnmanagedInstance(UnmanagedVMsManagerImpl.java:1282)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Also, addresses the VNC password field set instead of a fixed string
2024-01-12 14:14:01 +05:30
Nicolas Vazquez
b8d3e342be
Fix KVM import unmanaged instances on basic zone (#8465)
This PR fixes import unmanaged instances on KVM basic zones, on top of #8433

Fixes: #8439: point 1
2024-01-10 13:21:00 +05:30
Abhishek Kumar
26214ea139 Merge remote-tracking branch 'apache/4.18' 2023-12-21 20:55:38 +05:30
Wei Zhou
9d3a7be4dd
server: fix debug message when expunge a vm (#8374)
This PR fixes the debug message when expunge a vm
2023-12-21 14:17:57 +05:30
John Bampton
dda672503f
Remove unneeded duplicate words (#8358)
This PR removes some unneeded duplicate words.
2023-12-15 17:13:32 +05:30
kishankavala
ab20b1220f
KVM Ingestion - Import Instance (#7976)
This PR adds new functionality to import KVM instances from an external host or from disk images in local or shared storage.
Doc PR: https://github.com/apache/cloudstack-documentation/pull/356
2023-12-14 13:08:56 +05:30
Abhishek Kumar
82f7abddb3 Merge remote-tracking branch 'apache/4.18' 2023-12-13 11:24:15 +05:30
Bryan Lima
3bb318bab9
kvm: Add support for cgroupv2 (#8252)
1. Problem description

In Apache CloudStack (ACS), when a VM is deployed in a host with the KVM hypervisor, an XML file is created in the assigned host, which has a property shares that defines the weight of the VM to access the host CPU. The value of this property has no unit, and it is a relative measure to calculate how much CPU a given VM will have in the host. However, this value has a limit, which depends on the version of cgroup utilized by the host's kernel. The problem lies at the range value of shares that varies between both versions: [2, 264144] for cgroups version 1; and [1, 10000] for cgroups version 2. Currently, ACS calculates the value of shares using Equation 1, presented below, where CPU is the number of cores and speed is the CPU frequency; both specified in the VM's compute offering. Therefore, if a compute offering has, for example, 6 cores at 2 GHz, the shares value will be 12000 and an exception will be thrown by libvirt if the host utilizes cgroup v2. The second version is becoming the default one in current Linux distributions; thus, it is necessary to address this limitation.

    Equation 1
    shares = CPU * speed

Fixes: #6744
2. Proposed changes

To address the problem described, we propose to apply a scale conversion considering the max shares of the host. Using the same formula currently utilized by ACS, it is possible to calculate the maximum shares of a VM for a given host. In other words, using the number of cores and the nominal speed of the host's CPU as the upper limit of shares allowed to a VM. Then, this value will be scaled to the allowed interval of [1, 10000] of cgroup v2 by using a linear scale conversion.

The VM shares would be calculated as Equation 2, presented below, where VM requested shares is the requested shares value calculated using Equation 1, cgroup upper limit is fixed with a value of 10000 (cgroups v2 upper limit), and host max shares is the maximum shares value of the host, calculated using Equation 1. Using Equation 2, the only case where a VM passes the cgroup v2 limit is when the user requests more resources than the host has, which is not possible with the current implementation of ACS.

    Equation 2
    shares = (VM requested shares * cgroup upper limit)/host max shares

To implement the proposal, the following APIs will be updated: deployVirtualMachine, migrateVirtualMachine and scaleVirtualMachine. When a VM is being deployed, a new verification will be added to find a suitable host. The max shares of each host will be calculated, and the VM calculated shares will be verified if it does not surpass the host's value. Likewise, the migration of VMs will have a similar new verification. Lastly, the scale of VMs will also have the same verification for the VM's host.

To determine the max shares of a given host, we will use the same equation currently used in ACS for calculating the shares of VMs, presented in Section 1. When Equation 1 is used to determine the maximum shares of a host, CPU is the number of cores of the host, and speed is the nominal CPU speed, i.e., considering the CPU's base frequency.

It is important to note that these changes are only for hosts with the KVM hypervisor using cgroup v2 for now.
2023-12-13 10:51:24 +05:30
Rene Glover
1031c31e6a
FiberChannel Multipath for KVM + Pure Flash Array and HPE-Primera Support (#7889)
This PR provides a new primary storage volume type called "FiberChannel" that allows access to volumes connected to hosts over fiber channel connections. It requires Multipath to provide path discovery and failover. Second, the PR adds an AdaptivePrimaryDatastoreProvider that abstracts how volumes are managed/orchestrated from the connector to communicate with the primary storage provider, using a ProviderAdapter interface, allowing the code interacting with the primary storage provider API's to be simpler and have no direct dependencies on Cloudstack code. Lastly, the PR provides an implementation of the ProviderAdapter classes for the HP Enterprise Primera line of storage solutions and the Pure Flash Array line of storage solutions.
2023-12-09 11:31:33 +05:30
Nicolas Vazquez
371ad9f55b
New Feature: Import VMware VMs into KVM (#7881)
This PR adds the capability in CloudStack to convert VMware Instances disk(s) to KVM using virt-v2v and import them as CloudStack instances. It enables CloudStack operators to import VMware instances from vSphere into a KVM cluster managed by CloudStack. vSphere/VMware setup might be managed by CloudStack or be a standalone setup.

    CloudStack will let the administrator select a VM from an existing VMware vCenter in the CloudStack environment or external vCenter requesting vCenter IP, Datacenter name and credentials.
    The migrated VM will be imported as a KVM instance
    The migration is done through virt-v2v: https://access.redhat.com/articles/1351473, https://www.ovirt.org/develop/release-management/features/virt/virt-v2v-integration.html
    The migration process timeout can be set by the setting convert.instance.process.timeout
    Before attempting the virt-v2v migration, CloudStack will create a clone of the source VM on VMware. The clone VM will be removed after the registration process finishes.
    CloudStack will delegate the migration action to a KVM host and the host will attempt to migrate the VM invoking virt-v2v. In case the guest OS is not supported then CloudStack will handle the error operation as a failure
    The migration process using virt-v2v may not be a fast process
    CloudStack will not perform any check about the guest OS compatibility for the virt-v2v library as indicated on: https://access.redhat.com/articles/1351473.
2023-12-07 12:59:56 +05:30
Vishesh
d3917ef8f6
Remove powermock form VM Manager test (#8199) 2023-11-08 14:21:32 -03:00
Daan Hoogland
3d7281d451 Merge branch '4.18' into main 2023-11-08 14:39:50 +01:00
Vishesh
e65c9ffe70
Fix: Select another pod if all hosts in the pod becomes unavailable (#8085) 2023-11-07 15:11:00 +01:00
Abhishek Kumar
256892b3f8 build: fix failure after forward merge
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2023-11-03 17:28:20 +05:30
Daan Hoogland
a15cb81c85 Merge remote-tracking branch 'apache/4.18' into main 2023-11-03 11:55:26 +01:00
Vishesh
5362bad442
Storage Management (#7949) 2023-11-01 10:46:22 +01:00
Vishesh
3b11663d87
Fix failure on agent reconnection (#8089) 2023-10-26 16:54:36 +02:00
Abhishek Kumar
543c54c718
api,server,ui: snapshot copy, multi-zone replica (#7873)
This PR adds new functionality to copy snapshots across zones and take snapshots for multiple zones.

Copy functionality is similar to template copy. The source zone acts as the web server from where the destination zone(s) can download the snapshot files. For this purpose, a new API - `copySnapshot` has been added. The response for copySnapshot will be returning zone and download details from the first destination zone of the request. This behaviour is similar to the `copyTemplate` API.

In a similar manner, multiple zones can be selected while taking the snapshots or creating snapshot policies. For this snapshot will be taken in the base zone(in which volume is present) and then copied to the additional zones. A new parameter - `zoneids` has been added to `createSnapshot` and `createSnapshotPolicy` APIs.

As snapshots can be present on multiple zones (secondary stores), a new parameter `zoneid` has been added to delete the snapshot copy on a specific zone.

`listSnapshots` API has been updated to allow listing snapshot entries for different zones/datastores. New parameters - `showUnique`, `locationType` have been added.

Events generated during snapshot operations will now be linked to the snapshot itself rather than the volume of the snapshot.

`listSnapshotPolicies` and `createSnapshotPolicy` APIs will return zone details of the zones in which backup will be scheduled for the policy.

----
New API added
`copySnapshot`

Request and response params updated for APIs
```
- listSnapshots
- deleteSnapshot
- createTemplate
- listZones
- listSnapshotPolicies
- createSnapshotPolicy
```
UI updated for
- Snapshot detail view
- Create snapshot form
- Create snapshot policy form
- Create volume (from snapshot) form
- Create template (from snapshot) form

Doc PR: https://github.com/apache/cloudstack-documentation/pull/344
PR: https://github.com/apache/cloudstack/pull/7873
2023-10-23 09:01:58 +02:00
Abhishek Kumar
99ded8169b Merge remote-tracking branch 'apache/4.18' into main 2023-10-20 17:40:19 +05:30
sato03
a8700bff7f
server: set Default NIC when VM has no default NIC (#7859)
Co-authored-by: Henrique Sato <henrique.sato@scclouds.com.br>
2023-10-20 11:40:10 +02:00
Marcus Sorensen
3694667f50
Trigger out of band VM state update via libvirt event when VM stops (#7963)
* Trigger out of band VM state update via libvirt event when VM stops

* Add License headers, refactor nested try

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-09-28 12:12:03 +05:30
Marcus Sorensen
155a30748c
Allow configkey to set 'cloud-name' cloud-init metadata (#7964)
* Allow configkey to set 'cloud-name' cloud-init metadata

* Update engine/api/src/main/java/com/cloud/vm/VirtualMachineManager.java

Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>

* Update server/src/main/java/com/cloud/network/NetworkModelImpl.java

Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>

* Update server/src/main/java/com/cloud/network/router/CommandSetupHelper.java

Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>

* Revert "Update server/src/main/java/com/cloud/network/router/CommandSetupHelper.java"

This reverts commit 8abc3e38c4f973227e77253a8bcf3c94827d187f.

* Revert "Update server/src/main/java/com/cloud/network/NetworkModelImpl.java"

This reverts commit 7f239be919112913d764b141e943acca4473f755.

* Rework/Fix review code suggestions

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>
2023-09-26 14:51:11 +05:30
Rohit Yadav
30e34ef310 Merge remote-tracking branch 'origin/4.18' 2023-09-26 14:43:23 +05:30
Pearl Dsilva
951ba04cf0
VR live patching: Allow live patch of VPC VRs even if networks are in allocated / shutdown state (#7958) 2023-09-25 20:34:23 +05:30
Daan Hoogland
f539c4b81a Merge release branch 4.18 to main
* 4.18:
  Publish event for VM.STOP when out of band stop is detected (#7878)
2023-09-25 10:20:57 +02:00
Marcus Sorensen
3071ad69f6
Publish event for VM.STOP when out of band stop is detected (#7878)
Signed-off-by: Marcus Sorensen <mls@apple.com>
2023-09-25 10:02:08 +02:00
John Bampton
4eb110af73
Remove unneeded duplicate words (#7850) 2023-09-18 13:16:33 +02:00
Wei Zhou
246bb24b0f Updating pom.xml version numbers for release 4.18.2.0-SNAPSHOT
Signed-off-by: Wei Zhou <weizhou@apache.org>
2023-09-12 17:26:53 +02:00
Wei Zhou
4bdff06acd Updating pom.xml version numbers for release 4.18.1.0
Signed-off-by: Wei Zhou <weizhou@apache.org>
2023-09-07 08:50:50 +02:00