610 Commits

Suresh Kumar Anaparti
3220eb442a
PowerFlex/ScaleIO - MDM and host SDC connection enhancements (#11047)
* Cumulative enhancement fixes for ScaleIO: MDM add/remove, host prepare/unprepare, and validation that the Storage Pool can be created in the Agent.

- Implemented validation to fail Host disconnect from a Storage Pool if there are Volumes attached; SDC client MDM removal requires the scini service to be restarted
- Implemented Storage Pool validation by checking whether the MDM addresses from the configuration file match those in memory (via CLI); otherwise, fail the ModifyStoragePool command
- Introduced a configuration key to apply a timeout after making MDM changes for ScaleIO: powerflex.mdm.change.apply.timeout.ms (default 1000 ms)
- Implemented logic to apply this timeout after making MDM changes in the prepare and unprepare logic
- Added detection of MDM removal support via CLI
- If MDM removal via CLI is supported, use the CLI; otherwise fall back to editing drv_cfg.txt and restarting scini
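
A minimal sketch of the CLI-first removal flow described above, assuming hypothetical helper names (MdmRemovalSketch, supportsCliMdmRemoval, editDrvCfgFile and restartScini are illustrative stand-ins, not the plugin's actual code):

```
import java.util.List;

// Illustrative sketch only; the helpers below are hypothetical stand-ins for
// the behavior this commit describes, not the actual plugin code.
public class MdmRemovalSketch {
    // powerflex.mdm.change.apply.timeout.ms (default 1000 ms)
    private static final long MDM_CHANGE_APPLY_TIMEOUT_MS = 1000;

    void removeMdms(List<String> mdmIps) throws InterruptedException {
        if (supportsCliMdmRemoval()) {
            // Preferred path: remove the MDMs through the SDC CLI
            runCli("drv_cfg", "--remove_mdm", String.join(",", mdmIps));
        } else {
            // Fallback: edit drv_cfg.txt and restart the scini service
            editDrvCfgFile(mdmIps);
            restartScini();
        }
        // Give the SDC time to apply the MDM change before proceeding
        Thread.sleep(MDM_CHANGE_APPLY_TIMEOUT_MS);
    }

    boolean supportsCliMdmRemoval() { return false; } // probed via the CLI
    void runCli(String... args) { /* exec the SDC CLI */ }
    void editDrvCfgFile(List<String> mdmIps) { /* rewrite drv_cfg.txt */ }
    void restartScini() { /* restart the scini service */ }
}
```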

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Co-authored-by: mprokopchuk <mprokopchuk@apple.com>
2025-07-16 08:25:28 +02:00
slavkap
54b44cc316
KVM: Option to deploy a VM with existing volume/snapshot (#10503)
* Option to deploy a VM with existing volume/snapshot

* smoke test changes

check if the hypervisor is KVM
check if the primary storage's scope is ZONE wide

* skip all tests if the storage isn't zone-wide or the hypervisor isn't KVM

* support StorPool tags

add StorPool tags to a volume created from snapshot or to a volume which
will be attached as a ROOT to a new VM

* Add StorPool tags on the new ROOT volume

* Add the StorPool's tags when volume is created from a snapshot or a
volume is attached as a ROOT to a VM

* Addressed review
2025-07-14 15:10:45 +05:30
shrikantjoshi-hpe
4d46bece4a
fix priority for volume copy operation (#11109) 2025-07-14 07:50:58 +02:00
Nicolas Vazquez
6adfda2818
CKS Enhancements (#9102)
CKS Enhancements:

* Ability to specify different compute or service offerings for different types of CKS cluster nodes – worker, master or etcd

* Ability to use CKS ready custom templates for CKS cluster nodes

* Add and Remove external nodes to and from a kubernetes cluster

Co-authored-by: nvazquez <nicovazquez90@gmail.com>

* Update remove node timeout global setting

* CKS/NSX : Missing variables in worker nodes

* CKS: Fix ISO attach logic

* CKS: Fix ISO attach logic

* address comment

* Fix Port - Node mapping when cluster is scaled in the presence of external node(s)

* CKS: Externalize control and worker node setup wait time and installation attempts

* Fix logger

* Add missing headers and fix end of line on files

* CKS Mark Nodes for Manual Upgrade and Filter Nodes to add to CKS cluster from the same network

* Add support to deploy CKS cluster nodes on hosts dedicated to a domain

---------

Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>

* Support unstacked ETCD

---------

Co-authored-by: nvazquez <nicovazquez90@gmail.com>

* Fix CKS cluster scaling and minor UI improvement

* Reuse k8s cluster public IP for etcd nodes and rename etcd nodes

* Fix DNS resolver issue

* Update UDP active monitor to ICMP

* Add hypervisor type to CKS cluster creation to fix cluster creation when external hosts are added

* Fix build

* Fix logger

* Modify hypervisor param description in the create CKS cluster API

* CKS delete fails when external nodes are present

* CKS delete fails when external nodes are present

* address comment

* Improve network rules cleanup on failure adding external nodes to CKS cluster

* UI: Fix etcd template was not honoured

* UI: Fix etcd template was not honoured

* Refactor

* CKS: Exclude etcd nodes when calculating port numbers

* Fix network cleanup in case of CKS cluster failure

* Externalize retries and interval for NSX segment deletion

* Fix CKS scaling when external node(s) present in the cluster

* CKS: Fix port numbers displayed against ETCD nodes

* Add node version details to every node of k8s cluster - as we now support manual upgrade

* Add node version details to every node of k8s cluster - as we now support manual upgrade

* update column name

* CKS: Exclude etcd nodes when calculating port numbers

* update param name

* update param

* UI: Fix CKS cluster creation templates listing for non admins

* CKS: Prevent the etcd node start port number from coinciding with the k8s cluster start port numbers

* CKS: Set default kubernetes cluster node version to the kubernetes cluster version on upgrade

* CKS: Set default kubernetes cluster node version to the kubernetes cluster version on upgrade

* consolidate query

* Fix upgrade logic

---------

Co-authored-by: nvazquez <nicovazquez90@gmail.com>

* Fix CKS cluster version upgrade

* CKS: Fix etcd port numbers being skipped

* Fix CKS cluster with etcd nodes on VPC

* Move schema and upgrade for 4.20

* Fix logger

* Fix after rebasing

* Add support for using different CNI plugins with CKS

* Add support for using different CNI plugins with CKS

* remove unused import

* Add UI support and list cni config API

* necessary UI changes

* add license

* changes to support external cni

* UI changes

* Fix NPE on restarting VPC with additional public IPs

* fix merge conflict

* add asnumber to create k8s svc layer

* support cni framework to use as-numbers

* update code

* condition to ignore undefined jinja template variables

* CKS: Do not pass AS number when network ID is passed

* Fix deletion of Userdata / CNI Configuration in projects

* CKS: Add CNI configuration details to the response and UI

* Explicit events for registering cni configuration

* Add Delete cni configuration API

* Fix CKS deployment when using VPC tiers with custom ACLs

* Fix DNS list on VR

* CKS: Use Network offering of the network passed during CKS cluster creation to get the AS number

* CKS cluster with guest IP

* Fix: Use control node guest IP as join IP for external nodes addition

* Fix DNS resolver issue

* Improve etcd indexing - start from 1

* CKS: Add external node to a CKS cluster deployed with etcd node(s) successfully

* CKS: Add external node to a CKS cluster deployed with etcd node(s) successfully

* simplify logic

* Tweak setup-kube-system script for baremetal external nodes

* Consider cordoned nodes while getting ready nodes

* Fix CKS cluster scale calculations

* Set token TTL to 0 (no expire) for external etcd

* Fix missing quotes

* Fix build

* Revert PR 9133

* Add calico commands for ens35 interface

* Address review comments: plan CKS cluster deployment based on the node type

* Add qemu-guest-agent dependency for kvm based templates

* Add marvin test for CKS clusters with different offerings per node type

* Remove test tag

* Add marvin test and fix update template for cks and since annotations

* Fix marvin test for adding and removing external nodes

* Fix since version on API params

* Address review comments

* Fix unit test

* Address review comments

* UI: Make CKS public templates visible to non-admins on CKS cluster creation

* Fix linter

* Fix merge error

* Fix positional parameters on the create kubernetes ISO script and make the ETCD version optional

* fix etcd port displayed

* Further improvements to CKS  (#118)

* Multiple nics support on Ubuntu template

* Multiple nics support on Ubuntu template

* supports allocating IP to the nic when VM is added to another network - no delay

* Add option to select DNS or VR IP as resolver on VPC creation

* Add API param and UI to select option

* Add column on vpc and pass the value on the databags for CsDhcp.py to fix accordingly

* Externalize the CKS Configuration, so that end users can tweak the configuration before deploying the cluster

* Add new directory to c8 packaging for CKS config

* Remove k8s configuration from resources and make it configurable

* Revert "Remove k8s configuration from resources and make it configurable"

This reverts commit d5997033ebe4ba559e6478a64578b894f8e7d3db.

* copy conf to mgmt server and consume them from there

* Remove node from cluster

* Add missing /opt/bin directory required by external nodes

* Login to a specific Project view

* add indents

* Fix CKS HA clusters

* Fix build

---------

Co-authored-by: Nicolas Vazquez <nicovazquez90@gmail.com>

* Add missing headers

* Fix linter

* Address more review comments

* Fix unit test

* Fix scaling case for the same offering

* Revert "Login to a specific Project view"

This reverts commit 95e37563f48573780b07a038a7f48c0bc04e9b64.

* Revert "Fix CKS HA clusters" (#120)

This reverts commit 8dac16aa359faa6500ea1e1ce548169cfd08331a.

* Apply suggestions from code review about user data

Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>

* Update api/src/main/java/org/apache/cloudstack/api/command/user/userdata/BaseRegisterUserDataCmd.java

Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>

* Refactor column names and schema path

* Fix scaling for non existing previous offering per node type

* Update node offering entry if there was an existing offering but a global service offering has been provided on scale

---------

Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>
Co-authored-by: Daan Hoogland <daan@onecht.net>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-06-19 11:00:42 +05:30
slavkap
28ff19b751
enabled discard option (#10077)
Enable the discard option for virtio-blk and virtio-scsi devices for
volumes on StorPool storage
2025-06-14 11:20:47 +02:00
slavkap
685ee9e78f
StorPool: support for direct download (#9833) 2025-06-14 11:19:37 +02:00
slavkap
5c0346ea86
Adding device ID to a StorPool volume (#10587) 2025-06-11 19:39:51 +02:00
Wei Zhou
842b2f8c24
Merge remote-tracking branch 'apache/4.20' 2025-05-19 21:25:37 +02:00
Harikrishna
b17808bfba
Introducing Storage Access Groups for better management for host and storage connections (#10381)
* Introducing Storage Access Groups to define the host and storage pool connections

In CloudStack, when a primary storage is added at the Zone or Cluster scope, it is by default connected to all hosts within that scope. This default behavior can be refined using storage access groups, which allow operators to control and limit which hosts can access specific storage pools.

Storage access groups can be assigned to hosts, clusters, pods, zones, and primary storage pools. When a storage access group is set on a cluster/pod/zone, all hosts within that scope inherit the group. Connectivity between a host and a storage pool is then governed by whether they share the same storage access group.

A storage pool with a storage access group will connect only to hosts that have the same storage access group. A storage pool without a storage access group will connect to all hosts, including those with or without a storage access group.
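
The connectivity rule above reduces to a set-intersection check; a minimal sketch, assuming groups are modeled as string sets (the class and method names are illustrative, not CloudStack's actual code):

```
import java.util.Collections;
import java.util.Set;

// Hypothetical sketch of the host/pool connectivity rule described above.
public class StorageAccessGroupRule {
    static boolean canConnect(Set<String> hostGroups, Set<String> poolGroups) {
        if (poolGroups.isEmpty()) {
            // A pool without storage access groups connects to all hosts
            return true;
        }
        // A pool with groups connects only to hosts that share at least one group
        return !Collections.disjoint(hostGroups, poolGroups);
    }
}
```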
2025-05-19 11:33:29 +05:30
Rene Peinthor
4259e0b51b
linstor: fix host connect recursion regression (#10878) 2025-05-16 12:37:37 +02:00
Suresh Kumar Anaparti
572fc11a64
[PowerFlex] Add & Remove PowerFlex/ScaleIO MDMs for the storage SDC connections (#9903)
* Add & Remove PowerFlex/ScaleIO MDMs while preparing & unpreparing the storage SDC connections (instead of starting & stopping scini)

* Add/Remove MDM IP addresses during Host connection/disconnection to/from storage pool when powerflex.connect.on.demand is false

* unit test fixes

* Don't remove MDM IPs from the SDC when any volumes are mapped to it

* Don't remove MDM IPs when other pools of the same ScaleIO/PowerFlex cluster are connected

* rebase fixes

* update changes, to not remove/disconnect MDMs on maintenance

* import fixes after rebase
2025-05-15 12:42:13 +05:30
Suresh Kumar Anaparti
52d986081b
Updated Endpoint Selector to pick the Cluster in Enabled state (in addition to Host state) (#10757)
* Consider the clusters with allocation state 'Enabled' for EndPoint selection (in addition to Host state)

* Reset the pool id when create volume fails on the allocated pool

- The pool id is persisted while creating the volume, but when creation fails it is not reverted. On the next create-volume attempt, CloudStack couldn't find any suitable primary storage, even though pools with enough capacity were available, because the pool was already assigned to a volume in the Allocated state (so the storage pool compatibility check fails). Ensure the volume is not assigned to any pool if create volume fails, so the next creation job can pick a suitable pool (see the sketch after this list).

* endpoint check for resize

* update the resize error through callback result instead of exception

* logger fix
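
A minimal sketch of the pool-id reset described above; the method shape and DAO names are illustrative, only the pattern of clearing the stale assignment is taken from the commit:

```
// Hypothetical sketch: when volume creation fails on the allocated pool, clear
// the volume's pool assignment so the next attempt can pick any suitable pool.
void onCreateVolumeFailure(VolumeVO volume, VolumeDao volumeDao) {
    volume.setPoolId(null);                    // drop the stale allocation
    volumeDao.update(volume.getId(), volume);  // persist; next job reselects a pool
}
```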
2025-05-13 17:48:49 +05:30
Daan Hoogland
64828f66e8 Merge branch '4.20' 2025-05-13 13:34:23 +02:00
Daan Hoogland
dd84c74e82 Merge branch '4.19' into 4.20 2025-05-13 11:41:36 +02:00
Rene Peinthor
88ce639255
Linstor: implement volume and storage stats (#10850) 2025-05-13 10:06:35 +02:00
João Jandre
6fdaf51ddc
KVM incremental snapshot feature (#9270)
* KVM incremental snapshot feature

* fix log

* fix merge issues

* fix creation of folder

* fix snapshot update

* Check for hypervisor type during parent search

* fix some small bugs

* fix tests

* Address reviews

* do not remove storPool snapshots

* add support for downloading diff snaps

* Add multiple zones support

* make copied snapshots have normal names

* address reviews

* Fix in progress

* continue fix

* Fix bulk delete

* change log to trace

* Start fix on multiple secondary storages for a single zone

* Fix multiple secondary storages for a single zone

* Fix tests

* fix log

* remove bitmaps when deleting snapshots

* minor fixes

* update sql to new file

* Fix merge issues

* Create new snap chain when changing configuration

* add verification

* Fix snapshot operation selector

* fix bitmap removal

* fix chain on different storages

* address reviews

* fix small issue

* fix test

---------

Co-authored-by: João Jandre <joao@scclouds.com.br>
2025-05-12 10:50:30 -03:00
Pearl Dsilva
1e5d133033 Merge branch '4.20' of https://github.com/apache/cloudstack 2025-05-12 13:12:09 +05:30
slavkap
17e062a381
StorPool notify libvirt when volume is resized (#10775) 2025-05-09 11:34:52 +02:00
Wei Zhou
fd74895ad0
New feature: Reconcile commands (CopyCommand, MigrateCommand, MigrateVolumeCommand) (#10514) 2025-05-02 09:15:03 +02:00
Daan Hoogland
d7d9d131b2 Merge branch '4.20' 2025-05-01 15:44:09 +02:00
Abhisar Sinha
dfd64b1a67
Ceph object store: Fix LocationConstraint error (#10772)
* Don't set signingRegion to 'auto' when creating the S3 client in the Ceph object store provider.

* replace getBucketAcl with doesBucketExistV2 in CephObjectStoreDriverImplTest
2025-05-01 11:47:18 +05:30
Pearl Dsilva
576b97ba22 Merge branch '4.20' of https://github.com/apache/cloudstack 2025-04-24 09:22:40 +05:30
slavkap
f6f33c6add
Fix the size of a template downloaded from secondary storage (#10662)
Fixing the size of a template that is downloaded from secondary storage
to StorPool
2025-04-23 16:07:47 +05:30
Eric Kalendra
55c811547a
Update Mockito to 5.16.1 (#10686)
Dependency name changed from mockito-inline to mockito-core: inline mocking is now the default, and the last released version of mockito-inline is 5.2.0.

assertj-core in user-authenticators/saml2 pulls in an incompatible version of byte-buddy and required an exclusion. Updating the version of assertj is left for a future PR.
The upgrade requires Java 11+, dropping support for Java 8. CloudStack documentation already says to use Java 11 and does not indicate that Java 8 is supported.

Test classes using @RunWith(MockitoJUnitRunner.class) now run in strict mode. Changes were made to tests where the stubbing intention was clear. In ManagementServerMaintenanceManagerImplTest there are 5 tests where the intention is unclear; each of those statements now uses Mockito.lenient() to avoid the exception. Other cases in the tests follow a similar pattern.
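
For illustration, a minimal test showing the lenient() pattern (Mockito.lenient() is the real API; the test class and DAO below are hypothetical):

```
import static org.mockito.Mockito.lenient;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.mockito.Mock;
import org.mockito.junit.MockitoJUnitRunner;

@RunWith(MockitoJUnitRunner.class)
public class LenientStubbingExampleTest {
    // Hypothetical collaborator; stands in for whatever the test mocks.
    interface SomeDao { String findNameById(long id); }

    @Mock
    SomeDao someDao;

    @Test
    public void exerciseCodeThatMayNotUseTheStub() {
        // In strict mode an unused stubbing fails the test with
        // UnnecessaryStubbingException; lenient() exempts this one stubbing.
        lenient().when(someDao.findNameById(1L)).thenReturn("example");
        // ... exercise code whose use of someDao depends on the scenario ...
    }
}
```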
Minor clean up.

@Spy and Mockito.spy() should not both be used; favored the annotation.
@RunWith(MockitoJUnitRunner.class) and MockitoAnnotations.openMocks(this) should not both be used; favored the annotation.
Removed unnecessary `extends TestCase`.
Combining @InjectMocks with an explicit `new` in the field declaration is unnecessary; removed `new` where it caused issues.
Some of the Cmd classes like UpdateNetworkCmd have a type tree that includes fields of type Object. This appears to cause issues with injection, requiring that @Mock fields be available. This is where the following fields were added in multiple places:
Object job;
ResponseGenerator _responseGenerator;
Fixed the wrong number of parameters for Mockito.when in LibvirtRevertSnapshotCommandWrapperTest.java.
2025-04-16 18:10:28 +05:30
John Bampton
f206137f83
docs: fixes grammar and spelling in Markdown files only (#10656) 2025-04-08 12:44:14 +02:00
Bryan Lima
cb4848bc1a
Add support to RBD erasure code pools (#9808)
* Readd filename string on qemuimg create

* Remove empty object on the data pool details of storage pools with no data pool

* Only use the method createPhysicalDiskByLibVirt with RBD when the pool is of erasure code type. Also added javadoc for createPhysicalDisk method

* Change literal '/' string to File.separator

* Add support for erasure code pools

* Fix null on putAll
2025-04-02 08:19:00 -03:00
Daan Hoogland
8af021c6f6 Merge branch '4.20' 2025-03-27 17:03:13 +01:00
Daan Hoogland
5f93ce71bb Merge branch '4.19' into 4.20 2025-03-27 16:44:42 +01:00
Rene Peinthor
f4a7c8ab89
linstor: implement missing deleteDatastore (#10561)
Somehow deleteDatastore was never implemented, which meant templates were never cleaned up on datastore delete and agents were never informed about storage pool removal.
2025-03-18 08:50:19 -04:00
Daan Hoogland
9c6f2a9e14 Merge release branch 4.20 to main
* 4.20:
  Fix Stats Collector to not divide by zero (#10492)
  linstor: try to delete -rst resource before snapshot backup (#10443)
2025-03-12 11:31:56 +01:00
Daan Hoogland
f8adedc280 Merge release branch 4.19 to 4.20
* 4.19:
  linstor: try to delete -rst resource before snapshot backup (#10443)
2025-03-12 11:31:16 +01:00
Rene Peinthor
95c24810ab
linstor: try to delete -rst resource before snapshot backup (#10443)
If a -rst resource wasn't deleted because of a failed copy, a recurring snapshot attempt couldn't proceed, because the "old" -rst resource was still present. To prevent this, always try to remove the -rst resource first; if it doesn't exist, this is a no-op.
2025-03-10 16:23:01 +01:00
Thomas O'Dowd
d94aaa8b59
Add Cloudian HyperStore Object Storage (#9748) 2025-03-10 07:38:40 +01:00
Abhishek Kumar
1c1dad977e Merge remote-tracking branch 'apache/4.20' 2025-03-06 09:55:27 +05:30
slavkap
9b8c862f9f
removing the usage of volumeFreeze StorPool API call (#8575) 2025-03-03 16:03:15 +01:00
Daan Hoogland
4a3686297d Updating pom.xml version numbers for release 4.19.3.0-SNAPSHOT
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-02-25 10:43:11 +01:00
Daan Hoogland
4e321d4356 Updating pom.xml version numbers for release 4.19.2.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-02-20 09:32:07 +01:00
Abhisar Sinha
2a4a1f73d0
Support multi-scope configuration settings (#10300)
This PR introduces the concept of multi-scope configuration settings. Currently, besides the Global level, each configuration can be set at only a single scope level. It is useful for a configuration to be settable at multiple scopes: for example, a configuration set at the domain level applies to all of its accounts, but it can also be set for an individual account, in which case the account-level setting overrides the domain-level one.

This is done by changing the `scope` column of the `configuration` table from a string (single scope) to a bitmask (multiple scopes).

```
public enum Scope {
    Global(null, 1),
    Zone(Global, 1 << 1),
    Cluster(Zone, 1 << 2),
    StoragePool(Cluster, 1 << 3),
    ManagementServer(Global, 1 << 4),
    ImageStore(Zone, 1 << 5),
    Domain(Global, 1 << 6),
    Account(Domain, 1 << 7);

    // Members below are assumed for completeness of this excerpt, matching
    // the semantics described in this PR (parent scope + bitmask value).
    private final Scope parent;  // scope consulted next when no value is found here
    private final int bitValue;  // bit this scope occupies in the configuration bitmask

    Scope(Scope parent, int bitValue) { this.parent = parent; this.bitValue = bitValue; }
    public Scope getParent() { return parent; }
    public int getBitValue() { return bitValue; }
}
```
Each scope is also assigned a parent scope. When a configuration is not defined at a given scope but is available for multiple scope types, the value is retrieved from the parent scope. If there is no parent scope, or if the configuration is defined for a single scope only, the value falls back to the Global level (see the resolution sketch after the hierarchy below).

Hierarchy for different scopes is defined as below :
- Global
    - Zone
        - Cluster
            - Storage Pool
        - Image Store
    - Management Server
    - Domain
        - Account
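
For instance, a configuration available at both the Zone and StoragePool scopes carries the bitmask (1 << 1) | (1 << 3) = 10. A sketch of the fallback walk follows; ConfigRecord, getScopeBitmask, lookupValue and getGlobalValue are illustrative stand-ins, not the PR's actual names:

```
// Hypothetical sketch of multi-scope resolution: walk up the parent chain,
// consulting only scopes enabled in the configuration's bitmask, then fall
// back to the Global value.
String resolve(ConfigRecord config, Scope scope, long entityId) {
    for (Scope s = scope; s != null; s = s.getParent()) {
        if ((config.getScopeBitmask() & s.getBitValue()) != 0) {
            String value = lookupValue(config, s, entityId); // per-scope stored value
            if (value != null) {
                return value;
            }
        }
    }
    return config.getGlobalValue(); // final fallback to the Global level
}
```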

This PR also updates the scope of the following configurations (Storage Pool scope is added in addition to the existing Zone scope):
- pool.storage.allocated.capacity.disablethreshold
- pool.storage.allocated.resize.capacity.disablethreshold
- pool.storage.capacity.disablethreshold

Doc PR : https://github.com/apache/cloudstack-documentation/pull/476

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-02-14 11:25:01 +05:30
Wei Zhou
42a77c7646
LinstorStorageAdaptor: fix lint error (#10378)
This was found in some PRs:

plugins/storage/volume/linstor/src/main/java/com/cloud/hypervisor/kvm/storage/LinstorStorageAdaptor.java:510: poperties ==> properties
2025-02-13 09:12:05 +01:00
Daan Hoogland
0dcb8da03a Merge branch '4.20' 2025-02-12 16:54:05 +01:00
Daan Hoogland
4f3e8e8c5a Merge branch '4.19' into 4.20 2025-02-12 15:00:51 +01:00
Rene Glover
3337f425ff
Primera pure patches & various small fixes (#10132)
Co-authored-by: GLOVER RENE <rg9975@cs419-mgmtserver.rg9975nprd.app.ecp.att.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-02-07 13:19:34 +01:00
Abhisar Sinha
a7beaaf73b
Add Resource Limits to Backups and Object Storage (#10017)
Doc PR : https://github.com/apache/cloudstack-documentation/pull/461
This PR fixes https://github.com/apache/cloudstack/issues/8638

== Description

Four new Resource Types have been added. Admins can configure corresponding resource limits for tenants at different levels (domain, account, project).
User dashboard's Storage section will show the new resources, their limits and current usage.

1. backup - No. of backups used by the account
2. backup_storage - Backup storage allocated for the account
3. bucket - No. of buckets used by the account
4. object_storage - Object storage allocated for the account.

Some other related changes to the BnR framework:

1. The maximum number of backups to retain can be specified while creating backup schedules, similar to scheduled snapshots.
2. The oldest scheduled backup of the same interval type is deleted once the count reaches the configured max backups value (see the sketch after this list).
3. Code refactor: moved the syncBackups method from BackupProvider to the framework's BackupManagerImpl, as it is common functionality and all providers were using duplicated code.
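
A minimal sketch of the retention rule in item 2; listByScheduleOldestFirst, getMaxBackups and deleteBackup are illustrative names, not the PR's actual methods:

```
// Hypothetical sketch: before creating a new scheduled backup, prune the
// oldest backups of this schedule's interval type down to the configured max.
void enforceRetention(BackupSchedule schedule) {
    List<Backup> backups = backupDao.listByScheduleOldestFirst(schedule.getId());
    while (backups.size() >= schedule.getMaxBackups()) {
        deleteBackup(backups.remove(0)); // oldest goes first
    }
}
```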

Changes done to the Object Storage Framework

1. The quota parameter is now mandatory when creating a bucket. The bucket quota is considered the allocated space and is used to enforce resource limits.

== Schema Changes:

1. New column `max_backups` added to the `backup_schedule` table
2. New column `backup_interval_type` added to the `backups` table

== Api Changes:

1. createBackup: new parameter `scheduleid`. It should be specified whenever a scheduled backup is created; this translates to the `backup_interval_type` in the `backups` table.
2. createBackupSchedule: new parameter `max_backups`, to specify the maximum number of backups to retain for the given schedule.

== Configurations:

|Setting |Scope |Default Value |Description|
|-------|--------|--------------|-----------|
|backup.max.hourly |Global |8 |Maximum recurring hourly backups to be retained for an instance|
|backup.max.daily |Global |8 |Maximum recurring daily backups to be retained for an instance|
|backup.max.weekly |Global |8 |Maximum recurring weekly backups to be retained for an instance|
|backup.max.monthly |Global |8 |Maximum recurring monthly backups to be retained for an instance|
|max.account.backups| Global| 20 | The default maximum number of backups that can be created for an account|
|max.account.backup.storage| Global| 400 | The default maximum backup storage space (in GiB) that can be used for an account|
|max.domain.backups| Global| 40 | The default maximum number of backups that can be created for a domain|
|max.domain.backup.storage| Global| 800 | The default maximum backup storage space (in GiB) that can be used for a domain|
|max.project.backups| Global| 20 | The default maximum number of backups that can be created for a project|
|max.project.backup.storage| Global| 400 | The default maximum backup storage space (in GiB) that can be used for a project|

|Setting |Scope |Default Value |Description|
|-------|--------|--------------|-----------|
|max.account.buckets| Global| 20 | The default maximum number of buckets that can be created for an account|
|max.account.object.storage| Global| 400 | The default maximum object storage space (in GiB) that can be used for an account|
|max.domain.buckets| Global| 40 | The default maximum number of buckets that can be created for a domain|
|max.domain.object.storage| Global| 800 | The default maximum object storage space (in GiB) that can be used for a domain|
|max.project.buckets| Global| 20 | The default maximum number of buckets that can be created for a project|
|max.project.object.storage| Global| 400 | The default maximum object storage space (in GiB) that can be used for a project|


Co-authored-by: Daan Hoogland <daan@onecht.net>
Co-authored-by: Lucas Martins <56271185+lucas-a-martins@users.noreply.github.com>
Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-02-07 16:56:20 +05:30
Rene Peinthor
df99a29483
linstor: Fix using multiple primary storage with same linstor-controller (#10280) 2025-02-06 10:18:04 +01:00
Rene Peinthor
55e8eaab89
Linstor: encryption support (#10126)
This introduces a new encryption mode instead of a simple boolean. A storage driver can now also provide already-encrypted volumes to CloudStack.
2025-02-04 15:18:49 +01:00
Daan Hoogland
2654890e86 Merge branch '4.20' 2025-02-01 21:20:08 +01:00
Abhishek Kumar
0b5a5e8043
api,agent,server,engine-schema: scalability improvements (#9840)
* api,agent,server,engine-schema: scalability improvements

Following changes and improvements have been added:

- Improvements in handling of PingRoutingCommand

    1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
    2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
    3. Optimized scanning stalled VMs

- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`

- Added a caching framework based on the Caffeine in-memory caching library: https://github.com/ben-manes/caffeine

- Added caching for account/user role API access; expiration after write can be configured using the config `dynamic.apichecker.cache.period`. If set to zero there is no caching. Default is 0.

- Added caching for account/user role API access with expiration after write set to 60 seconds (see the sketch after the list below).

- Added caching for some recurring DB retrievals

    1. CapacityManager - listing service offerings - beneficial in host capacity calculation
    2. LibvirtServerDiscoverer - existing hosts for the cluster - beneficial for host joins
    3. DownloadListener - hypervisors for zone - beneficial for host joins
    4. VirtualMachineManagerImpl - VMs in progress - beneficial for processing stalled VMs during PingRoutingCommands
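
A minimal sketch of the Caffeine usage behind the API-access cache above, with entries expiring 60 seconds after write (Caffeine's builder API is real; the key/value types and the ApiAccessCacheSketch class are illustrative):

```
import java.time.Duration;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class ApiAccessCacheSketch {
    private final Cache<String, Boolean> cache = Caffeine.newBuilder()
            .maximumSize(10_000)                      // bound memory use
            .expireAfterWrite(Duration.ofSeconds(60)) // expire entries after write
            .build();

    public boolean isAllowed(String roleAndApi) {
        // Compute on a miss and cache the result until it expires
        return cache.get(roleAndApi, key -> checkAccessAgainstDb(key));
    }

    private boolean checkAccessAgainstDb(String key) {
        return true; // stand-in for the real role-permission lookup
    }
}
```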

- Optimized MS list retrieval for agent connect

- Optimize finding ready systemvm template for zone

- Database retrieval optimisations: fixes and refactors for cases where only IDs or counts are needed, mainly for hosts and other infra entities, plus similar cases for VMs and other host-related entities used by background tasks

- Changes in agent-agentmanager connection with NIO client-server classes

    1. Optimized the use of the executor service
    2. Refactored the Agent class to better handle connections
    3. Do SSL handshakes within worker threads
    4. Added global configs to control the behaviour depending on the infra, as SSL handshakes could be a bottleneck during agent connections. The configs `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` control how many new connections the management server handles at a time; `agent.ssl.handshake.timeout` sets the number of seconds after which an SSL handshake times out at the MS end (see the sketch after this list).
    5. On the agent side, backoff and SSL-handshake timeout can be controlled via the agent properties `backoff.seconds` and `ssl.handshake.timeout`.
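
A sketch of the bounded-handshake pattern those settings describe, using plain java.util.concurrent (the pool sizes, class and method names are illustrative, not the actual implementation):

```
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: run SSL handshakes in a bounded worker pool so a burst
// of agent connections cannot monopolize the management server.
public class HandshakeWorkerSketch {
    // Illustrative mapping of agent.ssl.handshake.min.workers / max.workers
    private final ExecutorService pool = new ThreadPoolExecutor(
            2, 50, 30, TimeUnit.SECONDS,
            new SynchronousQueue<>(),                   // grow threads up to max
            new ThreadPoolExecutor.CallerRunsPolicy()); // then push back on callers

    public void handshake(Runnable doHandshake, long timeoutSeconds) throws Exception {
        Future<?> f = pool.submit(doHandshake);
        try {
            // agent.ssl.handshake.timeout: give up if the handshake stalls
            f.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            f.cancel(true); // abandon the stalled handshake
            throw e;
        }
    }
}
```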

- Improvements in StatsCollection - minimize DB retrievals.

- Improvements in DeploymentPlanner to retrieve only the desired host fields and perform fewer retrievals.

- Improvements in connecting hosts to a storage pool. Added config `storage.pool.host.connect.workers` to control the number of worker threads used to connect hosts to a storage pool. The worker-thread approach is currently followed only for NFS and ScaleIO pools.

- Minor improvements in resource limit calculations wrt DB retrievals

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* test1, domaindetails, capacitymanager fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* test2 - agent tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* capacitymanagertest fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* change

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix missing changes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert marvin/setup.py

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix indent

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* use space in sql

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address duplicate

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* update host logs

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert e36c6a5d07

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix npe in capacity calculation

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* move schema changes to 4.20.1 upgrade

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix build

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add some more tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* checkstyle fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* remove unnecessary mocks

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* replace statics

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* engine/orchestration,utils: limit number of concurrent new agent
connections

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor - remove unused

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* unregister closed connections, monitor & cleanup

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add check for outdated vm filter in power sync

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* agent: synchronize sendRequest wait

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-02-01 12:28:41 +05:30
Daan Hoogland
81e052cfeb Merge release branch 4.20 to main
* 4.20:
  linstor: Fix ZFS snapshot backup (#10219)
  fix listing of VMs by network (#10204)
  Configure org.eclipse.jetty.server.Request.maxFormKeys from server.properties and increase the default value (#10214)
  api: fix access for listSystemVmUsageHistory (#10032)
  Fix NPE issues during host rolling maintenance, due to host tags and custom constrained/unconstrained service offering (#9844)
2025-01-21 12:00:19 +01:00
Daan Hoogland
5167c3b613 Merge branch '4.19' into 4.20 2025-01-21 11:59:43 +01:00
Rene Peinthor
1ff68cf9b1
linstor: Fix ZFS snapshot backup (#10219)
Linstor plugin used the wrong zfs dataset path to hide/unhide
the snapshot device.
Also don't use the full path to the zfs binary.
2025-01-21 15:40:17 +05:30