595 Commits

Author SHA1 Message Date
Suresh Kumar Anaparti
6d16ac2113
ScaleIO/PowerFlex smoke tests improvements, and some fixes (#11554)
* ScaleIO/PowerFlex smoke tests improvements, and some fixes

* Fix test_volumes.py, encrypted volume size check (for powerflex volumes)

* Fix test_over_provisioning.py (over provisioning supported for powerflex)

* Update vm snapshot tests

* Update volume size delta in primary storage resource count for user vm volumes only
The VR volumes resource count for PowerFlex volumes is updated here, resulting in resource count discrepancy
(which is re-calculated through ResourceCountCheckTask later, and skips the VR volumes)

* Fix test_import_unmanage_volumes.py (unsupported for powerflex)

* Fix test_sharedfs_lifecycle.py (volume size check for powerflex)

* Update powerflex.connect.on.demand config default to true
2025-09-12 16:17:20 +02:00
shrikantjoshi-hpe
90681df1b5
Primera: Delete session after key expiration (#11487) 2025-09-08 09:44:33 +02:00
Suresh Kumar Anaparti
9111bbd8da
Merge branch '4.19' into 4.20 2025-08-15 19:49:59 +05:30
Rene Peinthor
25f93b1d6b
linstor: fix getVolumeStats if multiple Linstor primary storages are used (#11397)
The volume stats cache did not distinguish between Linstor clusters,
so the first Linstor cluster queried prevented caching
for all the others, and null was returned for them.

Each Linstor cluster now has its own invalidate counter, and
cached results are stored with the Linstor cluster address as a prefix.
2025-08-15 19:20:39 +05:30
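As an illustration of the fix described in #11397, here is a minimal sketch of a stats cache that is kept separately per Linstor cluster. It is not the plugin's code: the class name `PerClusterStatsCache`, the `fetch` callback, and the use of a per-cluster expiry time in place of the invalidate counters mentioned in the commit are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Optional


class PerClusterStatsCache:
    """Hypothetical cache: one freshness marker per Linstor cluster address."""

    def __init__(self, fetch: Callable[[str], Dict[str, int]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # assumed callback: address -> {volume: bytes used}
        self._ttl = ttl_seconds
        self._clock = clock
        self._stats: Dict[str, int] = {}         # "<address>/<volume>" -> bytes used
        self._expires_at: Dict[str, float] = {}  # address -> expiry time

    def get_volume_stats(self, address: str, volume: str) -> Optional[int]:
        now = self._clock()
        # Freshness is tracked per cluster, so refreshing one cluster
        # never marks another cluster's entries as up to date.
        if address not in self._expires_at or self._expires_at[address] <= now:
            self._refresh(address, now)
        return self._stats.get(f"{address}/{volume}")

    def _refresh(self, address: str, now: float) -> None:
        prefix = f"{address}/"
        # Drop stale entries of this cluster only.
        for key in [k for k in self._stats if k.startswith(prefix)]:
            del self._stats[key]
        # Store results with the cluster address as a key prefix.
        for volume, used in self._fetch(address).items():
            self._stats[prefix + volume] = used
        self._expires_at[address] = now + self._ttl
```

With a single shared freshness marker, the second cluster would look "already cached" and every lookup for it would miss, which matches the null result the commit describes.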
Suresh Kumar Anaparti
a2d35c8ac2
Fix imports 2025-08-04 17:49:38 +05:30
Suresh Kumar Anaparti
7acd5a3875
Merge branch '4.19' into 4.20 2025-08-04 16:42:49 +05:30
Abhishek Kumar
3134efb971
plugin-swift: handle null cache store (#11380)
Fixes https://github.com/apache/cloudstack/pull/11315#pullrequestreview-3074036751

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-08-04 16:21:20 +05:30
Nicolas Vazquez
ed0d606e98
Find system VM templates for CKS clusters and SharedFS honouring the preferred architecture (#10946)
* Find system VM templates for CKS cluster honouring the preferred architecture

* Fix unit tests

* Fix checkstyle

* Sort instead of filtering by preferred arch

* Remove unnecessary stubs

* Restore java version

* Address review comments

* Fail and display error message in case the CKS ISO arch doesn't match the selected template arch

* Prefer CKS ISO arch instead of the system VM setting
2025-07-31 16:42:47 +05:30
Daan Hoogland
609efcc231 Merge branch '4.19' into 4.20 2025-07-25 22:01:17 +02:00
ghernadi
a4263da8ae
linstor: Use template's uuid if pool's downloadPath is null as resource-name (#11053)
Also added an integration test for templates from snapshots
2025-07-25 07:51:11 -04:00
Abhisar Sinha
d72a05aa5a
Add special Icon to Shared FileSystem Instances (#10857)
* Use special icon for sharedfs instance and prefix for sharedfs volumes

* Give custom icon precedence over shared fs icon

* Fix sharedfsvm icon size

* Fix UT failure in StorageVmSharedFSLifeCycleTest
2025-07-23 11:21:59 +05:30
Suresh Kumar Anaparti
d5f6b7cd1d
Fix to create instances with smaller templates (< 1 GB) on PowerFlex/ScaleIO storage (#11211)
* Fix to create instances with smaller templates (< 1 GB) on PowerFlex/ScaleIO storage

* code improvements
2025-07-22 21:36:26 +05:30
Suresh Kumar Anaparti
c94f75c7ea
PowerFlex/ScaleIO - Wait after SDC service start/restart/stop, and retry to fetch SDC id/guid (#11099)
* [PowerFlex/ScaleIO] Added wait time after SDC service start/restart/stop, and retries to fetch SDC id/guid

* Added agent property 'powerflex.sdc.service.wait' for the time (in secs) to wait after SDC service start/restart/stop

* code improvements
2025-07-16 12:32:09 +05:30
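The wait-then-retry behaviour described in #11099 can be sketched roughly as follows. This is only an illustration: `fetch_sdc_id_with_retry`, its parameters, and the `fetch_sdc_id` callback are hypothetical names, and the actual agent code and the way it reads `powerflex.sdc.service.wait` are not shown here.

```python
import time
from typing import Callable, Optional


def fetch_sdc_id_with_retry(fetch_sdc_id: Callable[[], Optional[str]],
                            wait_seconds: float,
                            max_attempts: int = 3,
                            sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Wait, then try to read the SDC id; repeat up to max_attempts times."""
    for _ in range(max_attempts):
        # Give the SDC service time to settle after start/restart/stop.
        sleep(wait_seconds)
        sdc_id = fetch_sdc_id()
        if sdc_id:  # None or an empty string counts as "not ready yet"
            return sdc_id
    return None
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.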
Pearl Dsilva
379ee07d88 Updating pom.xml version numbers for release 4.19.4.0-SNAPSHOT
Signed-off-by: Pearl Dsilva <pearl1594@gmail.com>
2025-06-06 18:00:09 +05:30
Pearl Dsilva
b5e2c181f9 Updating pom.xml version numbers for release 4.20.2.0-SNAPSHOT
Signed-off-by: Pearl Dsilva <pearl1594@gmail.com>
2025-06-06 15:38:12 +05:30
Pearl Dsilva
c61a5eb430 Updating pom.xml version numbers for release 4.20.1.0
Signed-off-by: Pearl Dsilva <pearl1594@gmail.com>
2025-05-30 12:43:00 +05:30
Daan Hoogland
0c7d47138d Updating pom.xml version numbers for release 4.19.3.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-05-30 09:08:58 +02:00
Rene Peinthor
4259e0b51b
linstor: fix host connect recursion regression (#10878) 2025-05-16 12:37:37 +02:00
Suresh Kumar Anaparti
112dfddd40
Reset the pool id when create volume fails on the allocated pool, and update the resize error when no endpoint exists (#10777)
* Reset the pool id when create volume fails on the allocated pool

- The pool id is persisted while creating the volume, but it is not reverted when creation fails. On the next create volume attempt, CloudStack couldn't find any suitable primary storage even though pools with enough capacity are available, because the pool is already assigned to the volume, which is in Allocated state (and the storage pool compatibility check fails). Ensure the volume is not assigned to any pool if create volume fails (so the next creation job picks a suitable pool).

* endpoint check for resize

* update the resize error through callback result instead of exception
2025-05-16 10:26:28 +02:00
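The pool id reset described in #10777 follows a common "undo the assignment on failure" pattern. A minimal sketch, assuming a hypothetical `Volume` record and a `create` callback that raises on failure (neither is CloudStack's actual API):

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Volume:
    """Hypothetical stand-in for a volume record in Allocated state."""
    name: str
    pool_id: Optional[int] = None


def create_volume_on_pool(volume: Volume, pool_id: int,
                          create: Callable[[Volume], None]) -> None:
    """Assign the pool, attempt creation, and clear the pool on failure."""
    volume.pool_id = pool_id  # persisted before the create attempt
    try:
        create(volume)
    except Exception:
        # Without this reset, the volume stays tied to the failed pool and
        # later attempts cannot pick another pool with enough capacity.
        volume.pool_id = None
        raise
```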
Suresh Kumar Anaparti
52d986081b
Updated Endpoint Selector to pick the Cluster in Enabled state (in addition to Host state) (#10757)
* Consider the clusters with allocation state 'Enabled' for EndPoint selection (in addition to Host state)

* Reset the pool id when create volume fails on the allocated pool

- The pool id is persisted while creating the volume, but it is not reverted when creation fails. On the next create volume attempt, CloudStack couldn't find any suitable primary storage even though pools with enough capacity are available, because the pool is already assigned to the volume, which is in Allocated state (and the storage pool compatibility check fails). Ensure the volume is not assigned to any pool if create volume fails (so the next creation job picks a suitable pool).

* endpoint check for resize

* update the resize error through callback result instead of exception

* logger fix
2025-05-13 17:48:49 +05:30
Daan Hoogland
dd84c74e82 Merge branch '4.19' into 4.20 2025-05-13 11:41:36 +02:00
Rene Peinthor
88ce639255
Linstor: implement volume and storage stats (#10850) 2025-05-13 10:06:35 +02:00
slavkap
17e062a381
StorPool notify libvirt when volume is resized (#10775) 2025-05-09 11:34:52 +02:00
Abhisar Sinha
dfd64b1a67
Ceph object store: Fix LocationConstraint error (#10772)
* Don't set signingRegion as auto for creating the s3 client in ceph object store provider.

* replace getBucketAcl with doesBucketExistV2 in CephObjectStoreDriverImplTest
2025-05-01 11:47:18 +05:30
slavkap
f6f33c6add
Fix the size of a template downloaded from secondary storage (#10662)
Fixing the size of a template that is downloaded from secondary storage
to StorPool
2025-04-23 16:07:47 +05:30
Daan Hoogland
5f93ce71bb Merge branch '4.19' into 4.20 2025-03-27 16:44:42 +01:00
Rene Peinthor
f4a7c8ab89
linstor: implement missing deleteDatastore (#10561)
deleteDatastore was never implemented, which meant that
templates were not cleaned up on datastore delete and
agents were never informed about storage pool removal.
2025-03-18 08:50:19 -04:00
Daan Hoogland
f8adedc280 Merge release branch 4.19 to 4.20
* 4.19:
  linstor: try to delete -rst resource before snapshot backup (#10443)
2025-03-12 11:31:16 +01:00
Rene Peinthor
95c24810ab
linstor: try to delete -rst resource before snapshot backup (#10443)
If a -rst resource wasn't deleted because of a failed copy,
a recurring snapshot attempt would fail, because the
"old" -rst resource still existed. To prevent this, always
try to remove the -rst resource first; if it doesn't exist, this is a no-op.
2025-03-10 16:23:01 +01:00
slavkap
9b8c862f9f
removing the usage of volumeFreeze StorPool API call (#8575) 2025-03-03 16:03:15 +01:00
Daan Hoogland
4a3686297d Updating pom.xml version numbers for release 4.19.3.0-SNAPSHOT
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-02-25 10:43:11 +01:00
Daan Hoogland
4e321d4356 Updating pom.xml version numbers for release 4.19.2.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-02-20 09:32:07 +01:00
Wei Zhou
42a77c7646
LinstorStorageAdaptor: fix lint error (#10378)
This is found in some PRs

plugins/storage/volume/linstor/src/main/java/com/cloud/hypervisor/kvm/storage/LinstorStorageAdaptor.java:510: poperties ==> properties
2025-02-13 09:12:05 +01:00
Daan Hoogland
4f3e8e8c5a Merge branch '4.19' into 4.20 2025-02-12 15:00:51 +01:00
Rene Glover
3337f425ff
Primera pure patches & various small fixes (#10132)
Co-authored-by: GLOVER RENE <rg9975@cs419-mgmtserver.rg9975nprd.app.ecp.att.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-02-07 13:19:34 +01:00
Rene Peinthor
df99a29483
linstor: Fix using multiple primary storage with same linstor-controller (#10280) 2025-02-06 10:18:04 +01:00
Rene Peinthor
55e8eaab89
Linstor: encryption support (#10126)
This introduces a new encryption mode instead of a simple bool.
Storage drivers can now also provide encrypted volumes to CloudStack directly.
2025-02-04 15:18:49 +01:00
Abhishek Kumar
0b5a5e8043
api,agent,server,engine-schema: scalability improvements (#9840)
* api,agent,server,engine-schema: scalability improvements

Following changes and improvements have been added:

- Improvements in handling of PingRoutingCommand

    1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
    2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
    3. Optimized scanning stalled VMs

- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`

- Added caching framework based on Caffeine in-memory caching library, https://github.com/ben-manes/caffeine

- Added caching for account/use role API access with expiration after write, configurable using config - `dynamic.apichecker.cache.period`. If set to zero then there will be no caching. Default is 0.

- Added caching for account/use role API access with expiration after write set to 60 seconds.

- Added caching for some recurring DB retrievals

    1. CapacityManager - listing service offerings - beneficial in host capacity calculation
    2. LibvirtServerDiscoverer existing host for the cluster - beneficial for host joins
    3. DownloadListener - hypervisors for zone - beneficial for host joins
    4. VirtualMachineManagerImpl - VMs in progress - beneficial for processing stalled VMs during PingRoutingCommands

- Optimized MS list retrieval for agent connect

- Optimize finding ready systemvm template for zone

- Database retrieval optimisations - fix and refactor for cases where only IDs or counts are used mainly for hosts and other infra entities. Also similar cases for VMs and other entities related to host concerning background tasks

- Changes in agent-agentmanager connection with NIO client-server classes

    1. Optimized the use of the executor service
    2. Refactored Agent class to better handle connections.
    3. Do SSL handshakes within worker threads
    4. Added global configs to control the behaviour depending on the infra. SSL handshake could be a bottleneck during agent connections. Configs - `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` can be used to control number of new connections management server handles at a time. `agent.ssl.handshake.timeout` can be used to set number of seconds after which SSL handshake times out at MS end.
    5. On agent side backoff and sslhandshake timeout can be controlled by agent properties. `backoff.seconds` and `ssl.handshake.timeout` properties can be used.

- Improvements in StatsCollection - minimize DB retrievals.

- Improvements in DeploymentPlanner allow for the retrieval of only desired host fields and fewer retrievals.

- Improvements in hosts connection for a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads that can be used to connect hosts to a storage pool. Worker thread approach is followed currently only for NFS and ScaleIO pools.

- Minor improvements in resource limit calculations wrt DB retrievals

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* test1, domaindetails, capacitymanager fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* test2 - agent tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* capacitymanagertest fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* change

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix missing changes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert marvin/setup.py

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix indent

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* use space in sql

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address duplicate

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* update host logs

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert e36c6a5d07

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix npe in capacity calculation

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* move schema changes to 4.20.1 upgrade

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix build

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add some more tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* checkstyle fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* remove unnecessary mocks

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* replace statics

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* engine/orchestration,utils: limit number of concurrent new agent
connections

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor - remove unused

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* unregister closed connections, monitor & cleanup

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add check for outdated vm filter in power sync

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* agent: synchronize sendRequest wait

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-02-01 12:28:41 +05:30
Daan Hoogland
5167c3b613 Merge branch '4.19' into 4.20 2025-01-21 11:59:43 +01:00
Rene Peinthor
1ff68cf9b1
linstor: Fix ZFS snapshot backup (#10219)
Linstor plugin used the wrong zfs dataset path to hide/unhide
the snapshot device.
Also don't use the full path to the zfs binary.
2025-01-21 15:40:17 +05:30
Vishesh
a4224e58cc
Improve logging to include more identifiable information (#9873)
* Improve logging to include more identifiable information for kvm plugin

* Update logging for scaleio plugin

* Improve logging to include more identifiable information for default volume storage plugin

* Improve logging to include more identifiable information for agent managers

* Improve logging to include more identifiable information for Listeners

* Replace ids with objects or uuids


* Improve logging to include more identifiable information for engine

* Improve logging to include more identifiable information for server

* Fixups in engine

* Improve logging to include more identifiable information for plugins

* Improve logging to include more identifiable information for Cmd classes

* Fix toString method for StorageFilterTO.java
2025-01-06 16:42:37 +05:30
Daan Hoogland
b7f0aac519 Merge branch '4.19' into 4.20 2024-12-20 14:34:39 +01:00
Rene Peinthor
a9587bfd2e
kvm-storage: provide isVMMigrate information to storage plugins (#10093)
Linstor in particular can use this information to allow
dual volume access only for live migration instead of enabling it in general,
which can and will lead to data corruption if for some reason
2 VMs get started on 2 different hosts.
2024-12-18 09:13:41 +01:00
Rene Peinthor
a2f2e87c12
linstor: improve heartbeat check with also asking linstor (#10105)
If a node doesn't have a DRBD connection to another node,
additionally ask Linstor-Controller if the node is alive.
Otherwise we would have simply said no and the node might still be alive.
This is always the case in a non hyperconverged setup.
2024-12-16 09:59:57 +01:00
Daan Hoogland
da54234585 Merge branch '4.19' into 4.20.merge 2024-12-03 16:32:15 +01:00
Rene Peinthor
d54b105a03
Linstor: add support for ISO block devices and direct download (#9792) 2024-11-28 17:47:47 +01:00
João Jandre
c63c7ee63e Updating pom.xml version numbers for release 4.20.1.0-SNAPSHOT
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-27 11:40:45 -03:00
João Jandre
2fe3fcef7c Updating pom.xml version numbers for release 4.20.0.0
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-19 08:54:07 -03:00
João Jandre
b38ee63c48 Merge branch '4.19' 2024-11-13 10:47:24 -03:00
Rene Peinthor
dfe4a67859
kvm: ref-count secondary storage pool usage (#9498)
If a secondary storage pool was used by e.g.
2 concurrent snapshot->template actions,
the first action to finish removed the netfs mount
point for the other action.
Now the storage pools are usage ref-counted and will only
be deleted if there are no more users.
2024-11-13 10:32:46 -03:00
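The usage ref-counting described in #9498 can be illustrated with a small sketch. The class `RefCountedPools` and the `mount`/`unmount` callbacks are hypothetical names for this example and do not reflect the KVM plugin's actual classes.

```python
import threading
from typing import Callable, Dict


class RefCountedPools:
    """Mounts a pool on first use and unmounts it when the last user releases it."""

    def __init__(self, mount: Callable[[str], None],
                 unmount: Callable[[str], None]) -> None:
        self._mount = mount
        self._unmount = unmount
        self._counts: Dict[str, int] = {}  # pool uuid -> number of active users
        self._lock = threading.Lock()

    def acquire(self, uuid: str) -> None:
        with self._lock:
            count = self._counts.get(uuid, 0)
            if count == 0:
                self._mount(uuid)  # only the first user triggers the mount
            self._counts[uuid] = count + 1

    def release(self, uuid: str) -> None:
        with self._lock:
            count = self._counts.get(uuid, 0)
            if count == 0:
                return  # nothing to release
            if count == 1:
                del self._counts[uuid]
                self._unmount(uuid)  # last user gone, safe to remove the mount
            else:
                self._counts[uuid] = count - 1
```

With two concurrent users of the same pool, the first `release` only decrements the counter, so the mount point stays available for the second action.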