405 Commits

Author SHA1 Message Date
dahn
aca8732102
[router] make a distinction between fatal errors, warnings and unknown as healthcheck result (#10710)
* [routers] distinction between fatal failure and warning or unknown on healthchecks

* UI status for router health checks

* status from scripts varied

* automation signalled errors

* revert removal of update sql

* upgradeversion

* move config item and further cleanup

* handling services better

* backwards compatible response

---------

Co-authored-by: Daan Hoogland <dahn@apache.org>
2025-09-22 11:39:05 +05:30
Suresh Kumar Anaparti
1033be4b31
Updating pom.xml version numbers for release 4.22.0.0-SNAPSHOT
Signed-off-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-08-28 12:00:42 +05:30
Suresh Kumar Anaparti
f9513b47bf
Updating pom.xml version numbers for release 4.21.0.0
Signed-off-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-08-22 11:42:37 +05:30
slavkap
e5f61164b3
Support of snapshot copy to primary storage in different zones. (#9478)
* Support of snapshot copy to different StorPool primary storage between zones
2025-08-04 16:35:16 +05:30
Abhisar Sinha
a87c5c2b3a
Create new Instance from VM backup (#10140)
This feature adds the ability to create a new instance from a VM backup for dummy, NAS and Veeam backup providers. It works even if the original instance used to create the backup was expunged or unmanaged. There are two parts to this functionality:
- Saving all configuration details that the VM had at the time of taking the backup, and using them to create an instance from the backup.
- Enabling a user to expunge/unmanage an instance that has backups.
2025-07-31 15:47:22 +05:30
Vishesh
f6ad184ea2
Feature: Add support for GPU with KVM hosts (#11143)
This PR allows attaching of GPU devices via PCI, mdev or VF to an Instance for KVM.

It allows the operator to discover the GPU devices on the KVM host and create a Compute Offering with GPU support based on the available GPU devices on the host. Once the operator has created the Compute offering, it can be used by users to launch Instances with GPU devices.
2025-07-29 13:46:24 +05:30
Harikrishna
cca8b2fef9
Extensions Framework & Orchestrate Anything (#9752)
The Extensions Framework in Apache CloudStack is designed to provide a flexible and standardised mechanism for integrating external systems and custom workflows into CloudStack’s orchestration process. By defining structured hook points during key operations—such as virtual machine deployment, resource preparation, and lifecycle events—the framework allows administrators and developers to extend CloudStack’s behaviour without modifying its core codebase.
2025-07-28 10:41:17 +05:30
Pearl Dsilva
0d4147f3f6
Netris Network Plugin Integration with CloudStack (#10458)
The Netris Plugin introduces Netris as a network service provider in CloudStack for creating and managing Virtual Private Clouds (VPCs). It can orchestrate the following network functionalities:

- Network segmentation with Netris-VXLAN isolation method
- Routing between "public" IP and network segments with an ACS ROUTED mode offering
- SourceNAT, DNAT, 1:1 NAT between "public" IP and network segments with an ACS NATTED mode offering
- Routing between VPC network segments (tiers in ACS nomenclature)
- Access Lists (ACLs) between VPC tiers and "public" network (TCP, UDP, ICMP) both as global egress rules and "public" IP specific ingress rules.
- ACLs between VPC network tiers (TCP, UDP, ICMP)
- External load balancing – between VPC network tiers and "public" IP
- Internal load balancing – between VPC network tiers
- CloudStack Virtual Router services (DHCP, DNS, UserData, Password Injection, etc…)
2025-07-25 15:26:42 +05:30
Abhishek Kumar
83bccead3d
schema, refactor: rename cloud.user_vm_details to cloud.vm_instance_details (#10736)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
Co-authored-by: dahn <daan@onecht.net>
2025-07-24 12:08:29 +02:00
Manoj Kumar
e8ab0ae70a
CPU to Memory weight based algorithm to order cluster (#10997)
* CPU to Memory weight based algorithm to order cluster
The host.capacityType.to.order.clusters config supports a new algorithm, COMBINED,
which works together with host.capacityType.to.order.clusters.cputomemoryweight: capacity is
computed from both CPU and memory, using the weight factor

* minor changes

* add unit tests

* update desc and add validation

* handle copilot review comments

* add log indicating chosen capacityType for ordering
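
A minimal sketch of how a COMBINED ordering could work. The commit message does not give the formula, so this assumes the weight is a value between 0 and 1 applied to the CPU usage ratio, with the remainder applied to memory. All class and method names here are hypothetical and are not CloudStack's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: order clusters by a weighted mix of CPU and memory usage.
// The formula is an assumption; the commit message only says a weight factor is used.
final class CombinedCapacityOrdering {

    // Used fraction of CPU and memory for one cluster, each in the range 0.0 to 1.0.
    static final class ClusterUsage {
        final String name;
        final double cpuUsedRatio;
        final double memoryUsedRatio;

        ClusterUsage(String name, double cpuUsedRatio, double memoryUsedRatio) {
            this.name = name;
            this.cpuUsedRatio = cpuUsedRatio;
            this.memoryUsedRatio = memoryUsedRatio;
        }
    }

    // Assumed formula: weight * cpu + (1 - weight) * memory.
    static double combinedUsage(ClusterUsage c, double cpuToMemoryWeight) {
        if (cpuToMemoryWeight < 0.0 || cpuToMemoryWeight > 1.0) {
            throw new IllegalArgumentException("weight must be between 0 and 1");
        }
        return cpuToMemoryWeight * c.cpuUsedRatio
                + (1.0 - cpuToMemoryWeight) * c.memoryUsedRatio;
    }

    // Returns a new list with the least-used clusters first.
    static List<ClusterUsage> order(List<ClusterUsage> clusters, double cpuToMemoryWeight) {
        List<ClusterUsage> sorted = new ArrayList<>(clusters);
        sorted.sort(Comparator.comparingDouble(
                (ClusterUsage c) -> combinedUsage(c, cpuToMemoryWeight)));
        return sorted;
    }

    public static void main(String[] args) {
        List<ClusterUsage> clusters = List.of(
                new ClusterUsage("c1", 0.9, 0.1),
                new ClusterUsage("c2", 0.2, 0.6));
        for (ClusterUsage c : order(clusters, 0.5)) {
            System.out.println(c.name + " " + combinedUsage(c, 0.5));
        }
    }
}
```

With a weight of 1.0 this sketch degenerates to CPU-only ordering, and with 0.0 to memory-only ordering.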

---------

Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-07-15 16:40:53 +05:30
Suresh Kumar Anaparti
be22bfe2c9
Management Server - Prepare for Maintenance and Cancel Maintenance improvements (#10995)
* Management Server - Prepare for Maintenance and Cancel Maintenance improvements:
- Added new setting 'management.server.maintenance.ignore.maintenance.hosts' to ignore hosts in maintenance states while preparing management server for maintenance. This skips agent transfer and agents count check for hosts in maintenance.
- Rebalance indirect agents after cancel maintenance, using rebalance parameter in cancelMaintenance API
- Force maintenance after maintenance window timeout, using forced parameter in prepareForMaintenance API.
- Propagate 'indirect.agent.lb.check.interval' setting change to the host agents.

* rebases fixes

* code improvements, cleanup

* [UI] Set rebalance true by default in cancel maintenance dialog

* Update MS state after executing cluster cmd in the target MS, and some code improvements

* code improvements

* Ensure the host lb algorithm 'shuffle' is applied once before disabling the indirect agent lb check background task
2025-07-03 12:17:04 +05:30
Manoj Kumar
fa85a75bc8
Log previous and new value of configuration when reset/update API is called (#10769) 2025-06-04 12:06:25 +02:00
Harikrishna
b17808bfba
Introducing Storage Access Groups for better management for host and storage connections (#10381)
* Introducing Storage Access Groups to define the host and storage pool connections

In CloudStack, when a primary storage is added at the Zone or Cluster scope, it is by default connected to all hosts within that scope. This default behavior can be refined using storage access groups, which allow operators to control and limit which hosts can access specific storage pools.

Storage access groups can be assigned to hosts, clusters, pods, zones, and primary storage pools. When a storage access group is set on a cluster/pod/zone, all hosts within that scope inherit the group. Connectivity between a host and a storage pool is then governed by whether they share the same storage access group.

A storage pool with a storage access group will connect only to hosts that have the same storage access group. A storage pool without a storage access group will connect to all hosts, including those with or without a storage access group.
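
The connectivity rule described above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and "same storage access group" is read here as "at least one group in common", which is an assumption.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the host-to-pool connectivity rule; not CloudStack's code.
final class StorageAccessGroupSketch {

    // A host's effective groups: its own plus those inherited from its cluster, pod and zone.
    static Set<String> effectiveHostGroups(Set<String> host, Set<String> cluster,
                                           Set<String> pod, Set<String> zone) {
        Set<String> all = new HashSet<>(host);
        all.addAll(cluster);
        all.addAll(pod);
        all.addAll(zone);
        return all;
    }

    // A pool without groups connects to every host; otherwise a shared group is required
    // (assumption: one common group is enough).
    static boolean canConnect(Set<String> hostGroups, Set<String> poolGroups) {
        if (poolGroups.isEmpty()) {
            return true;
        }
        return !Collections.disjoint(hostGroups, poolGroups);
    }

    public static void main(String[] args) {
        Set<String> host = effectiveHostGroups(
                Set.of(), Set.of("fast-ssd"), Set.of(), Set.of());
        System.out.println(canConnect(host, Set.of("fast-ssd")));
        System.out.println(canConnect(host, Set.of("archive")));
        System.out.println(canConnect(Set.of(), Set.of()));
    }
}
```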
2025-05-19 11:33:29 +05:30
Suresh Kumar Anaparti
572fc11a64
[PowerFlex] Add & Remove PowerFlex/ScaleIO MDMs for the storage SDC connections (#9903)
* Add & Remove PowerFlex/ScaleIO MDMs while preparing & unpreparing the storage SDC connections (instead of start & stop scini)

* Add/Remove MDM IP addresses during Host connection/disconnection to/from storage pool when powerflex.connect.on.demand is false

* unit test fixes

* Don't remove MDM IPs from SDC when any volumes mapped to SDC

* Don't remove MDM IPs when other pools of same ScaleIO/PowerFlex cluster are connected

* rebase fixes

* update changes, to not remove/disconnect MDMs on maintenance

* import fixes after rebase
2025-05-15 12:42:13 +05:30
Daan Hoogland
d7d9d131b2 Merge branch '4.20' 2025-05-01 15:44:09 +02:00
Suresh Kumar Anaparti
9f229600e6
Add new config (non-dynamic) for agent connections monitor thread, and keep timeunit to secs (in sync with the earlier Wait config) (#10525) 2025-04-28 15:32:03 +02:00
Pearl Dsilva
2df1ac5106 Merge branch '4.20' of https://github.com/apache/cloudstack 2025-04-28 12:15:48 +05:30
Pearl Dsilva
0785ba046e Merge branch '4.19' of https://github.com/apache/cloudstack into 4.20 2025-04-28 11:10:08 +05:30
Fabricio Duarte
9d263cd71b
Network Usage event model adjustments (#10755) 2025-04-26 17:35:28 +02:00
Abhishek Kumar
12c077d704
api,ui: multi arch improvements (#10289) 2025-04-25 11:02:27 +02:00
Suresh Kumar Anaparti
9dceae4614
MS maintenance improvements (#10417)
* Update last agents during ms maintenance, and some code improvements

* Send 503 (Service Unavailable) response status when maintenance or shutdown is initiated
[Any load balancer in the clustered environment can avoid routing requests to this MS node]

* Migrate systemvm agents before routing host agents, and some code improvements

* Added events for ms maintenance and shutdown operations

* Added the following ms maintenance and shutdown improvements

- block new agent connections during prepare for maintenance of ms

- maintain avoids ms list

- propagate updated management servers list and lb algorithm in host and indirect.agent.lb.algorithm settings respectively, to systemvm (non-routing) agents

- updated setup ms list and migrate agent connections to executor service

- migrate agent connection through executor, and send the answer to the ms host that initiated the migration

- re-initialize ssl handshake executor if it is shutdown

- don't allow prepare for maintenance or shutdown when other management server nodes are in preparing states

- don't allow trigger shutdown when management server is up and other management server nodes are in preparing states

- stop agent connections monitor on ms maintenance

- update avoid ms list in ready command

- updated connected host from the client connection

- update last agents in ms metrics from the database

- updated some agent config descriptions

- update last management server in the hosts during shutdown

- added agents and lastagents in management server response

- updated management server maintenance & shutdown unit tests

- some code improvements

* refactored code / addressed comments

* removed shutdown testcase (maybe, calling System.exit)

* Revert "removed shutdown testcase (maybe, calling System.exit)"

This reverts commit e14b0717152ef6c8be102d61c80f42803a53172e.

* avoid system.exit during shutdown test

* code improvements

* testcase fix

* Fix cutoff time in agent connections monitor thread
2025-03-19 14:18:05 +05:30
Daan Hoogland
4a3686297d Updating pom.xml version numbers for release 4.19.3.0-SNAPSHOT
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-02-25 10:43:11 +01:00
Daan Hoogland
4e321d4356 Updating pom.xml version numbers for release 4.19.2.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-02-20 09:32:07 +01:00
Abhisar Sinha
2a4a1f73d0
Support multi-scope configuration settings (#10300)
This PR introduces the concept of multi-scope configuration settings. In addition to the Global level, currently all configurations can be set at a single scope level.
It will be useful if a configuration can be set at multiple scopes. For example, a configuration set at the domain level
will apply to all accounts, but it can be set for an account as well, in which case the account level setting will override the domain level setting.

This is done by changing the column `scope` of table `configuration` from string (single scope) to bitmask (multiple scopes).

```
public enum Scope {
    Global(null, 1),
    Zone(Global, 1 << 1),
    Cluster(Zone, 1 << 2),
    StoragePool(Cluster, 1 << 3),
    ManagementServer(Global, 1 << 4),
    ImageStore(Zone, 1 << 5),
    Domain(Global, 1 << 6),
    Account(Domain, 1 << 7);
```
Each scope is also assigned a parent scope. When a configuration for a given scope is not defined but is available for multiple scope types, the value will be retrieved from the parent scope. If there is no parent scope or if the configuration is defined for a single scope only, the value will fall back to the global level.

Hierarchy for different scopes is defined as below :
- Global
    - Zone
        - Cluster
            - Storage Pool
        - Image Store
    - Management Server
    - Domain
        - Account
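
A rough sketch of how the bitmask and the parent fallback could fit together. The enum values are copied from the excerpt above; the field names, the `supports` and `resolve` helpers, and the simplification of one override value per scope are assumptions made for illustration, not the PR's actual code.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of multi-scope lookup with parent fallback.
final class MultiScopeSketch {

    enum Scope {
        Global(null, 1),
        Zone(Global, 1 << 1),
        Cluster(Zone, 1 << 2),
        StoragePool(Cluster, 1 << 3),
        ManagementServer(Global, 1 << 4),
        ImageStore(Zone, 1 << 5),
        Domain(Global, 1 << 6),
        Account(Domain, 1 << 7);

        final Scope parent; // assumed field name
        final int bit;      // assumed field name

        Scope(Scope parent, int bit) {
            this.parent = parent;
            this.bit = bit;
        }
    }

    // True if the configuration's scope bitmask includes the given scope.
    static boolean supports(int scopeMask, Scope scope) {
        return (scopeMask & scope.bit) != 0;
    }

    // Walks from the requested scope up through its parents and returns the first
    // override found at a supported scope; otherwise falls back to the global value.
    static String resolve(int scopeMask, Scope start,
                          Map<Scope, String> overrides, String globalValue) {
        for (Scope s = start; s != null && s != Scope.Global; s = s.parent) {
            if (supports(scopeMask, s) && overrides.containsKey(s)) {
                return overrides.get(s);
            }
        }
        return globalValue;
    }

    public static void main(String[] args) {
        int mask = Scope.Domain.bit | Scope.Account.bit;
        Map<Scope, String> overrides = new EnumMap<>(Scope.class);
        overrides.put(Scope.Domain, "10");
        // No account-level override, so the domain-level value is used.
        System.out.println(resolve(mask, Scope.Account, overrides, "5"));
    }
}
```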

This PR also updates the scope of the following configurations (Storage Pool scope is added in addition to the existing Zone scope):
- pool.storage.allocated.capacity.disablethreshold
- pool.storage.allocated.resize.capacity.disablethreshold
- pool.storage.capacity.disablethreshold

Doc PR : https://github.com/apache/cloudstack-documentation/pull/476

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-02-14 11:25:01 +05:30
Daan Hoogland
2654890e86 Merge branch '4.20' 2025-02-01 21:20:08 +01:00
Abhishek Kumar
0b5a5e8043
api,agent,server,engine-schema: scalability improvements (#9840)
* api,agent,server,engine-schema: scalability improvements

Following changes and improvements have been added:

- Improvements in handling of PingRoutingCommand

    1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
    2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
    3. Optimized scanning stalled VMs

- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`

- Added caching framework based on Caffeine in-memory caching library, https://github.com/ben-manes/caffeine

- Added caching for account/user role API access; the expiration-after-write period can be configured using config - `dynamic.apichecker.cache.period`. If set to zero then there will be no caching. Default is 0.

- Added caching for account/user role API access with expiration after write set to 60 seconds.

- Added caching for some recurring DB retrievals

    1. CapacityManager - listing service offerings - beneficial in host capacity calculation
    2. LibvirtServerDiscoverer existing host for the cluster - beneficial for host joins
    3. DownloadListener - hypervisors for zone - beneficial for host joins
    4. VirtualMachineManagerImpl - VMs in progress - beneficial for processing stalled VMs during PingRoutingCommands

- Optimized MS list retrieval for agent connect

- Optimize finding ready systemvm template for zone

- Database retrieval optimisations - fix and refactor for cases where only IDs or counts are used mainly for hosts and other infra entities. Also similar cases for VMs and other entities related to host concerning background tasks

- Changes in agent-agentmanager connection with NIO client-server classes

    1. Optimized the use of the executor service
    2. Refactored Agent class to better handle connections.
    3. Do SSL handshakes within worker threads
    4. Added global configs to control the behaviour depending on the infra. SSL handshake could be a bottleneck during agent connections. Configs - `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` can be used to control number of new connections management server handles at a time. `agent.ssl.handshake.timeout` can be used to set number of seconds after which SSL handshake times out at MS end.
    5. On agent side backoff and sslhandshake timeout can be controlled by agent properties. `backoff.seconds` and `ssl.handshake.timeout` properties can be used.

- Improvements in StatsCollection - minimize DB retrievals.

- Improvements in DeploymentPlanner allow for the retrieval of only desired host fields and fewer retrievals.

- Improvements in hosts connection for a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads that can be used to connect hosts to a storage pool. Worker thread approach is followed currently only for NFS and ScaleIO pools.

- Minor improvements in resource limit calculations wrt DB retrievals
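
To illustrate the expire-after-write behaviour mentioned for the API access cache, here is a small standard-library sketch. It is not Caffeine and not the code from this PR; the class name and the injected clock are hypothetical. The only behaviour taken from the text is that a period of zero disables caching.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical stand-in for an expire-after-write cache; the PR itself uses Caffeine.
final class ExpireAfterWriteCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long writtenAtMillis;

        Entry(T value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long periodMillis;
    private final LongSupplier clockMillis; // injected so the expiry can be tested

    ExpireAfterWriteCache(long periodSeconds, LongSupplier clockMillis) {
        this.periodMillis = periodSeconds * 1000L;
        this.clockMillis = clockMillis;
    }

    // Returns the cached value if it was written less than one period ago,
    // otherwise reloads it. A period of zero means every call goes to the loader.
    synchronized V get(K key, Function<K, V> loader) {
        if (periodMillis <= 0) {
            return loader.apply(key);
        }
        long now = clockMillis.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry != null && now - entry.writtenAtMillis < periodMillis) {
            return entry.value;
        }
        V value = loader.apply(key);
        entries.put(key, new Entry<>(value, now));
        return value;
    }

    public static void main(String[] args) {
        ExpireAfterWriteCache<String, Boolean> cache =
                new ExpireAfterWriteCache<>(60, System::currentTimeMillis);
        // The loader stands in for an expensive role/permission lookup.
        System.out.println(cache.get("role-1:listHosts", key -> Boolean.TRUE));
    }
}
```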

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* test1, domaindetails, capacitymanager fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* test2 - agent tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* capacitymanagertest fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* change

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix missing changes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert marvin/setup.py

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix indent

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* use space in sql

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address duplicate

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* update host logs

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert e36c6a5d07

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix npe in capacity calculation

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* move schema changes to 4.20.1 upgrade

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix build

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add some more tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* checkstyle fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* remove unnecessary mocks

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* replace statics

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* engine/orchestration,utils: limit number of concurrent new agent
connections

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor - remove unused

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* unregister closed connections, monitor & cleanup

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add check for outdated vm filter in power sync

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* agent: synchronize sendRequest wait

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-02-01 12:28:41 +05:30
Suresh Kumar Anaparti
3b108b968f
Support for Management Server Maintenance Mode (#9854)
* Support for Management Server Maintenance

- New APIs: prepareForMaintenance and cancelMaintenance, with required parameter - managementserverid.

- New management server states for maintenance: PreparingForMaintenance, Maintenance.

- listHosts API with optional parameter – managementserverid, to list the hosts connected to the management server.

- Support management server maintenance when more than one active management server is available.

- Triggers transfer of agents to other available management servers for maintenance; the new agent command MigrateAgentConnectionCommand initiates the transfer of indirect agents.

- New global config 'management.server.maintenance.timeout', to set the timeout (in mins) for the management server maintenance window, default: 60 mins.

- UI changes: Prepare and Cancel Maintenance in Management Server section, Connected Agents tab, New fields for hosts and management servers.

* Updated pending jobs check timer task with ScheduledExecutorService

* keep maintenance state on trigger shutdown call when ms is in maintenance

* add pending jobs count to ms response

* during ms heartbeat, update state to up only when it's down

* allow vm work jobs of async job created before prepare for maintenance

* Revert "keep maintenance state on trigger shutdown call when ms is in maintenance"

This reverts commit 607e13364679eac897f4d146bb3325ea7a61ba17.

* skip maintenance test when multiple management servers are not available, and not configured in host setting for kvm
2025-01-29 13:31:15 +05:30
Daan Hoogland
048649d351 Merge release branch 4.20 to main
* 4.20:
  server: investigate pending HA work when executing in new MS session (#10167)
  extra null guard (#10264)
2025-01-28 14:34:19 +01:00
Abhishek Kumar
33a37da9ec
server: investigate pending HA work when executing in new MS session (#10167)
For HA work items that are created for host state change, checks must be
done when execution is called in a new management server session.

A new column, reason, has been added in cloud.op_ha_work table to track
the reason for HA work.

When HighAvailabilityManager starts it finds and puts all pending HA
work items in Investigating state. During execution of the HA work if it
is found in investigating state, checks are done to verify if the work
is still valid. If the job is found to be invalid it is cancelled.
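
The flow described above might look roughly like this. Only the Investigating state, the cancellation, and the reason field come from the commit message; every other state name, class name, and the validity predicate are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of re-validating HA work picked up in a new management server session.
final class HaWorkInvestigationSketch {

    // "Investigating" comes from the commit message; the other states are assumed.
    enum Step { Scheduled, Investigating, Cancelled, Done }

    static final class HaWork {
        final String reason; // mirrors the new "reason" column in cloud.op_ha_work
        Step step = Step.Scheduled;

        HaWork(String reason) {
            this.reason = reason;
        }
    }

    // On manager start, every pending item is marked for investigation.
    static void onManagerStart(List<HaWork> pending) {
        for (HaWork work : pending) {
            if (work.step == Step.Scheduled) {
                work.step = Step.Investigating;
            }
        }
    }

    // Items under investigation are checked first and cancelled if no longer valid.
    static void execute(HaWork work, Predicate<HaWork> stillValid) {
        if (work.step == Step.Investigating && !stillValid.test(work)) {
            work.step = Step.Cancelled;
            return;
        }
        work.step = Step.Done; // placeholder for the real HA action
    }

    public static void main(String[] args) {
        List<HaWork> pending = new ArrayList<>();
        pending.add(new HaWork("HostDown"));
        pending.add(new HaWork("HostMaintenance"));
        onManagerStart(pending);
        // Assumed check: only work created for a host that is still down remains valid.
        for (HaWork work : pending) {
            execute(work, w -> w.reason.equals("HostDown"));
            System.out.println(work.reason + " -> " + work.step);
        }
    }
}
```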

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-01-28 14:39:31 +05:30
Daan Hoogland
fadb39ece7 Merge release branch 4.20 to main
* 4.20:
  merge errors fixed
  Restrict the migration of volumes attached to VMs in Starting state (#9725)
  server, plugin: enhance storage stats for IOPS (#10034)
  Introducing granular command timeouts global setting (#9659)
  Improve logging to include more identifiable information (#9873)
2025-01-08 14:01:19 +01:00
Harikrishna
9bc283e5c2
Introducing granular command timeouts global setting (#9659)
* Introducing granular command timeouts global setting

* fix marvin tests

* Fixed log messages

* some more log message fix

* Fix empty value setting

* Converted the global setting to non-dynamic

* set wait on command only when granular wait is defined. This is to keep the backward compatibility

* Improve error logging
2025-01-07 17:06:32 +05:30
Vishesh
a4224e58cc
Improve logging to include more identifiable information (#9873)
* Improve logging to include more identifiable information for kvm plugin

* Update logging for scaleio plugin

* Improve logging to include more identifiable information for default volume storage plugin

* Improve logging to include more identifiable information for agent managers

* Improve logging to include more identifiable information for Listeners

* Replace ids with objects or uuids


* Improve logging to include more identifiable information for engine

* Improve logging to include more identifiable information for server

* Fixups in engine

* Improve logging to include more identifiable information for plugins

* Improve logging to include more identifiable information for Cmd classes

* Fix toString method for StorageFilterTO.java
2025-01-06 16:42:37 +05:30
Daan Hoogland
2daffa34f2 Merge release branch 4.20 to main
* 4.20:
  VR: fix site-2-site VPN if split connections is enabled (#10067)
  UI: fix cannot open 'Edit tags' modal for static routes (#10065)
  Update ownership selection component to be language independent (#10052)
  Support to enable/disable VM High Availability manager and related alerts (#10118)
2024-12-30 13:35:30 +01:00
Suresh Kumar Anaparti
330ed25a6c
Support to enable/disable VM High Availability manager and related alerts (#10118)
- Adds new config 'vm.ha.enabled' with Zone scope, to enable/disable VM High Availability manager. This is enabled by default (for backward compatibility).
  When enabled, the VM HA WorkItems (for VM Stop, Restart, Migration, Destroy) can be created and the scheduled items are executed.
  When disabled, new VM HA WorkItems are not allowed and the scheduled items are retried until max retries configured at 'vm.ha.migration.max.retries' (executed in case HA is re-enabled during retry attempts), and then purged after 'time.between.failures' by the cleanup thread that runs regularly at 'time.between.cleanup'.
- Adds new config 'vm.ha.alerts.enabled' with Zone scope, to enable/disable alerts for the VM HA operations. This is enabled by default.
2024-12-26 17:45:32 +05:30
Wei Zhou
da94ae2c1c
Merge remote-tracking branch 'apache/4.20' 2024-12-03 09:44:42 +01:00
Abhisar Sinha
ef6c0c443d
Prepend VPC name to VPC network tier name (#9780)
* Fix `updateTemplatePermission` when the UI is set to a language other than English (#9766)

* Fix updateTemplatePermission UI in non-english language

* Improve fix

---------

Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>

* Prepend vpc name to vpc tier network name based on global setting

* Added UT for createVpcGuestNetwork

* rename connector to delimiter and add configKey.Category.Network

* Move setting the name to a new method

---------

Co-authored-by: Daan Hoogland <daan@onecht.net>
Co-authored-by: Lucas Martins <56271185+lucas-a-martins@users.noreply.github.com>
Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
2024-12-03 12:06:00 +05:30
Wei Zhou
8a1da3804c
Resize volume: add pool capacity disablethreshold for resize and allow volume auto migration (#9761)
* server: add global settings for volume resize

* resizeVolume: support automigrate

* Address Suresh's comments

* Update api/src/main/java/org/apache/cloudstack/api/command/user/volume/ResizeVolumeCmd.java

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* address Suresh's comments

* UI: add autoMigrate to resizeVolume

* resizevolume: add unit tests

* resizevolume: add unit test for Allocated volume

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-12-02 10:28:14 +05:30
Lucas Martins
5886780240
Change vmsnapshot.max config to be dynamic (#9883)
Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
2024-11-28 14:49:05 -03:00
João Jandre
d9774a8462 Updating pom.xml version numbers for release 4.21.0.0-SNAPSHOT
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-27 11:47:06 -03:00
João Jandre
c63c7ee63e Updating pom.xml version numbers for release 4.20.1.0-SNAPSHOT
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-27 11:40:45 -03:00
João Jandre
2fe3fcef7c Updating pom.xml version numbers for release 4.20.0.0
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-19 08:54:07 -03:00
Pearl Dsilva
f8d8a9c7b3
NSX Integration fixes (#8906)
* Prevent addition of duplicate PF rules on scale up and no rules left behind on scale down (#32)

* fix missing dependency injection

* NSX: Fix concurrency issues on port forwarding rules deletion (#37)

* Fix concurrency issues on port forwarding rules deletion

* Refactor objectExists

* Fix unit test

* Fix test

* Small fixes

* CKS: Externalize control and worker node setup wait time and installation attempts (#38)

* NSX: Add shared network support (#41)

* NSX: Fix number of physical networks for Guest traffic checks and leftover rules on CKS cluster deletion (#45)

* Fix pf rules removal on CKS cluster deletion

* Fix check for number of physical networks for guest traffic

* Fix unit test

* fix logger

* NSX: Handle CheckHealthCommand to avoid host disconnection and errors on APIs

* NSX: Handle CheckHealthCommand to avoid host disconnection and errors on APIs

* Remove unused string

* fix logger

* Update UDP active monitor to ICMP

* Fix NPE on restarting VPC with additional public IPs

* NSX / VPC: Reuse Source NAT IP from systemVM range on restarts

* CKS: Public IP not found for VPC networks

* Externalize retries and interval for NSX segment deletion (#67)

* remove unused import

* remove duplicate imports

* remove unused import

* revert externalizing cks settings

* fix test

* Refactor log messages

* Address comments

* Fix issue caused due to forward merge: 90fe1d

---------

Co-authored-by: Nicolas Vazquez <nicovazquez90@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-09-06 16:56:50 -03:00
Wei Zhou
679ce1a639
feature: Dynamic and Static Routing (#9470)
This PR contains 3 features

- IPv4 Static Routing (Routed mode) #9346
Design document: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=306153967

- AS Numbers Management #9410
Design Document: https://cwiki.apache.org/confluence/display/CLOUDSTACK/BGP+AS+Numbers+Management


- Dynamic routing
Design Document: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=315492858

- Document: https://github.com/apache/cloudstack-documentation/pull/419

Rename nsx mode to routing mode

by
```
git grep -l nsx_mode  |xargs sed -i "s/nsx_mode/routing_mode/g"
git grep -l nsxmode  |xargs sed -i "s/nsxmode/routingmode/g"
git grep -l nsxMode  |xargs sed -i "s/nsxMode/routingMode/g"
git grep -l NsxMode  |xargs sed -i "s/NsxMode/RoutingMode/g"
```
- re-organize sql changes
- fix NPE as rules do not have public ip
- fix missing destination cidr in ingress rules
- disable network usage for routed network
- fix DB exception as network_id is -1 during network creation
- apply ingress/egress routing rules
- VR changes to configure nft rules for isolated network
- VR: setup nft rule for control network
- VR: flush all iptables rules
- fix NPE which is because ingress rules do not have public ip associated
- fix dest cidr is missing in nft tables
- add ip4 routing and ip4 routes to list network and list vpc response
- fix ingress rule is missing when vr is restarted
- fix icmp types in nft rules
- add tab to manage routing firewall rules
- fix ingress rules are not applied when VR is restarted
- add default rules in FORWARD chain
- fix create vpc offerings
- fix public ip is not assigned to vpc
- fix network offering is not listed when create vpc tier
- add is_routing to boot args of vpc vr
- remove table ip4_firewall in vpc vr
- release or remove subnet when remove a network
- implement fw_vpcrouter_routing
- fix wrong ip family when flush ipv4 rules
- fix acl rules are not applied due to wrong version (should be 6 which means ip6 rules are removed)
- add default rules for vpc tiers so that tcp connections (e.g. ssh) work
- append policy rules after default rules
- remove /usr/local/cloud/systemvm/ in routers
- throw an exception when allocate subnet with cidrsize
- fix some TODOs
- add new parameters to update API
- return type Ipv4GuestSubnetNetworkMap when get or create subnet
- fix firewall rules are broken
- add domain_id and account_id to db
- add domain/account/project to ipv4 subnet response
- create ipv4 subnet for domain/account/project
- check conflict when update ipv4 subnet
- ui changes
- add parent subnet to response
- add list for ipv4 subnet
- implement some methods
- fix list subnets for guest networks by zoneid
- UI changes
- fix delete ipv4 subnet for network
- fix ipv4 subnet is set to zone guest network cidr if cidrsize is specified
- add zone info to response if parent subnet is null but network is not
- fix gateway/cidr is not set when create network with cidrsize
- fix order of nft rules in the VRs

* Routed v24

- add classes in marvin base.py

* Routed v25

- add test_01_subnet_zone
- fix dedicate to domain/account failure
- list subnets for network by keyword and subnet

* Routed v26: implement subnet auto-allocation

- add utils for split ip ranges into small subnets
- add utils to get start/end ip of a cidr
- implement subnet auto-generation
- add global settings
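
The two utilities named in this step (start/end IP of a CIDR, and splitting a range into smaller subnets) can be sketched as below. This is an independent illustration with hypothetical names, not the code added by the commit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical IPv4 helpers: start/end address of a CIDR and splitting into smaller subnets.
final class Ipv4CidrSketch {

    // Converts dotted-quad notation to an unsigned 32-bit value held in a long.
    static long toLong(String ip) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("bad octet in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    static String toIp(long value) {
        return ((value >> 24) & 0xFF) + "." + ((value >> 16) & 0xFF) + "."
                + ((value >> 8) & 0xFF) + "." + (value & 0xFF);
    }

    static int prefixOf(String cidr) {
        int prefix = Integer.parseInt(cidr.split("/")[1]);
        if (prefix < 0 || prefix > 32) {
            throw new IllegalArgumentException("bad prefix in: " + cidr);
        }
        return prefix;
    }

    // Returns {first address, last address} of the CIDR block as numeric values.
    static long[] range(String cidr) {
        long size = 1L << (32 - prefixOf(cidr));
        long start = toLong(cidr.split("/")[0]) & ~(size - 1);
        return new long[] {start, start + size - 1};
    }

    static String startIp(String cidr) {
        return toIp(range(cidr)[0]);
    }

    static String endIp(String cidr) {
        return toIp(range(cidr)[1]);
    }

    // Splits a CIDR block into consecutive subnets of the given, longer prefix.
    static List<String> split(String cidr, int newPrefix) {
        if (newPrefix < prefixOf(cidr) || newPrefix > 32) {
            throw new IllegalArgumentException("new prefix must be between the current prefix and 32");
        }
        long[] bounds = range(cidr);
        long step = 1L << (32 - newPrefix);
        List<String> subnets = new ArrayList<>();
        for (long start = bounds[0]; start <= bounds[1]; start += step) {
            subnets.add(toIp(start) + "/" + newPrefix);
        }
        return subnets;
    }

    public static void main(String[] args) {
        System.out.println(startIp("10.1.2.3/24") + " - " + endIp("10.1.2.3/24"));
        System.out.println(split("10.0.0.0/24", 26));
    }
}
```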

* Routed 27: add subnet for VPC

- add db column for vpc_id
- add db record for vpc
- remove db record when delete a vpc
- add checkConflicts methods
- remove duplicated settings
- check ipv4 cidr when create subnet

* Routed v28: update smoke tests

- update test_ipv4_routing.py
- search subnets by networkid

* Routed 29: fix vpc and add more tests

- fix createnetwork in vpc
- add vpc id/name to response
- fix zone id/name are not displayed in some cases
- add smoke test for vpc
- add smoke tests for failed cases
- add smoke test for connectivity checks
- marvin: add "-q" to ssh command

* Routed 31: ui and smoke tests

- UI: add link to network in list view
- add nftables rules check in VRs

* Routed 32: add chain OUTPUT and more rules

- fix the issue 80/443/8080 is not reachable from VR itself

```
2024-06-27 10:21:52,121 INFO     Executing: systemctl start cloud-password-server@172.31.1.1
2024-06-27 10:21:52,128 INFO     Service cloud-password-server@172.31.1.1 start
2024-06-27 10:21:52,129 INFO     Executing: ps aux
2024-06-27 10:24:02,175 ERROR    Failed to update password server due to: <urlopen error [Errno 110] Connection timed out>
```

* Routed: fix dns search from VMs in Isolated networks

* Routed: fix VPC dns issue due to gateway IP is missing in cloud.conf

This is caused by NSX integration, and fixed by
https://github.com/apache/cloudstack/pull/9102/

* Routed: rename routing_mode to network_mode

* Routed: replace centos5.5 template in smoke test as dhclient does not work in the vms

// this does not work
refer to https://dominikrys.com/posts/disable-udp-checksum-validation/#ignoring-udp-checksums-with-nftables
and
https://forum.openwrt.org/t/udp-checksum-with-nftables/161522/11

the vm should have checksum offloading disabled

* Routed: fix smoke test due to wrong cidrlist of egress rules and missing ingress rule from VR

* PR 9346: fix lint error schema-41910to42000.sql

* PR 9346: ui polish v1

* PR 9346: create VPC with cidrsize

* Routed: fix test failures with test_network_ipv6 and test_vpc_ipv6 due to 'ssh -q'

* Routed: fix /usr/local/cloud/systemvm/ are removed after SSVM/CPVM reboot

* Routed: fix IP of additional nics of VPC VR is not gateway

* PR 9346: fix cidrsize check when create VPC with cidrsize

* Routed: fix test/integration/smoke/test_ipv4_routing.py:279:16: E713 test for membership should be 'not in'
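
The E713 lint rule asks for `x not in y` instead of `not x in y`. A minimal sketch of the corrected form follows; the function name and data are made up for illustration and are not taken from `test_ipv4_routing.py`.

```python
def missing_routes(expected, actual):
    """Return the expected routes that are absent from the actual list.

    Hypothetical helper; E713 would flag `if not route in actual`,
    while `route not in actual` is the preferred spelling.
    """
    return [route for route in expected if route not in actual]
```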

* PR9346: fix/Update api

* PR 9346: set response object name

* PR9346: UI refactor and small fixes

* PR9346: change return type of getNetworkMode

* PR9346: move IPv4 subnet to separate tab

* PR9346: revert IpRangesTabGuest.vue back to original

* PR9346: fix remove ipv4 subnet on UI

* PR9346: fix test_ipv4_routing.py

* AS Number Range Management

* Create AS Number Range for a Zone

* Fix build

* Add ListASNRange and fix create ASN range

* Add List AS numbers

* Add UI for AS Numbers

* Fix UI and filter AS Numbers

* Add AS Number on Isolated network creation and refactor UI and response

* Release AS Number

* Add network offering new columns

* Add UI support to view and add AS number and configure network offering

* Automatically assign AS Number if not specify AS number

* update variable name

* Fix routing mode check

* UI: Only allow selecting AS number when routing mode is Dynamic and specifyAsNumber is true

* UI: Only pass AS number when supported by the network offering

* Release AS number on network deletion

* Add deleteASNRange command (#81)

* API: List ASNumbers by asnumber (#83)

---------

Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>

* AS number management extensions

* Support AS number on VPC tier creation based on the offering

* Fix delete AS Range

* Fix UI values

* UI: Minor fix for releasing AS number

* UI: Move management of AS Range to Zone details view

* Fix specify_as_number column in network_offering table to set the default false

* Add events for AS number operations

* Allow users to list AS Numbers and fix network form for Normal users

* Add AS number details to list networks response

* Fix Allocated time format

* Fix Allocated time format

* support in details view too

* Fix: Do not release AS number if acquired network requires AS number

* Fix: Do not release AS number if acquired network requires AS number

* Fix typo

* Fix allocated release

* Fix event type

* UI: Add Routing mode and Specify AS to the network offering details

* UI: Add Routing mode and Specify AS to the network offering details

* Address comment

* Fix release AS number of network deletion

* Fix release AS number of network deletion

* Fix

* Restore release to its place based on the boolean

* Rename boolean

* API: Add networkId as listASNumber parameter

* Add Network name to the search view filter for AS numbers

* Present allocated time in human readable format - Public IP / AS Numbers

* Add account / domain filter for AS numbers

* Add support for AS numbers on VPC offerings

* Refactor AS number allocation to VPC and non VPC isolated networks

* Checkstyle

* Add support for AS numbers on VPC offerings

* extend vpc offering view and vpcoffering response

* merge https://github.com/shapeblue/cloudstack-playtika/pull/115 and change network_id of as_numbers to include vpc_id

* Display AS number of VPC tiers as the AS number of the VPC

* extend asnumber response and ui support

* improve UI and as number response to view VPC details

* List only dynamic offerings for vpc tiers with specify as numbers

* Fix release AS number

* Fix AS number displayed as 0 when no AS number assigned

* Fix VPC offering creation without specify AS

---------

Co-authored-by: nvazquez <nicovazquez90@gmail.com>

* Fix release AS number on VPC deletion

* Update server/src/main/java/com/cloud/dc/BGPServiceImpl.java

* Update server/src/main/java/com/cloud/dc/BGPServiceImpl.java

* Fix missing column on asnumber table

* Fix listASNumbers API to support vpcid and obtain AS number from vpc for tiers

* Prevent listing 0 AS number for VPC

* Fix create Isolated Network form

* Update server/src/main/java/com/cloud/network/vpc/VpcManagerImpl.java

* Update server/src/main/java/com/cloud/network/vpc/VpcManagerImpl.java

* Dynamic: move routingmode/specifyasn after networkmode in AddNetworkOffering.vue on UI

* Dynamic: fix ip4routing in network response

* Dynamic/systemvm: add FRR to systemvm template

* Dynamic: BGP peers (DB,VO,Dao)

* Dynamic: BGP peers (VR/server)

* Dynamic: v3

- remove BgpPeer class
- fix vpc vr has bgp peers of only 1 tier
- rename ip4_cidr to guest_ip4_cidr
- rename ip6_cidr to guest_ip6_cidr
- generate /etc/frr/frr.conf
- apply BGP peers on Dynamic-Routed network even if there are no BGP peers

* Dynamic v4: fix vpc vr

- fix duplicated guest cidr in frr.conf in vpc vr

todo
- restart frr / reload frr (reload will cause bgp session to Policy state)
- apis for bgp peers
- assign/release bgp peer from/to network

* Dynamic v5: add apis for bgp peers

* Dynamic v6: fix bugs

- set response object name
- remove required as number when update
- fix checks when update
- allow regular users to list bgp peers

* Dynamic v7: move apis to bgp sub-dir

* Dynamic v8: add tab for manage BGP peers on UI

* Dynamic v9: fix update bgp with same config

* Dynamic v10: add changeBgpPeersForNetworkCmd

* Dynamic v11: create network with bgppeerids

- create network with bgppeerids
- add marvin classes
- add smoke tests
- remove uuid from bgp_peer_network_map
- fix created/removed in bgp_peer_network_map
- remove bgppeers when remove a network
- UI: fix delete bgp peer

* Dynamic v12: add test for vpc tiers

* Dynamic v13: bug fixes

- fix change BGP peers for network in Allocated state
- fix listing network returns removed record
- fix all vpc tiers have the same settings
- remove BGP peers as part of network removal
- remove FRR settings for vpc tiers without any BGP peers
- UI: fix no error msg when change BGP peers

* Dynamic v14: assign BGP Peers for VPC instead of VPC tiers

- create vpc with bgppeerids
- do not allow create/update vpc tier with bgppeerids
- apply all bgp peers when create/delete a vpc tier
- UI: change bgp peers for vpc
- test: update tests on vpc

* Dynamic: fix build errors after merging as number PR

* Dynamic: fix TODOs

* Dynamic: fix smoke test on VPC

* Allow creation of networks by users with as numbers

* Address review comments

* Move BGPService to bgp package and inject it on BaseCmd

* Revert changes for CKS and address more comments

* Display left side menu option for AS number only for root admin

* Dynamic: create/update BGP peer with details

refer to https://docs.frrouting.org/en/latest/bgp.html

* Dynamic: fix build error and remove access to ListBgpPeers cmd for regular users

* Dynamic: assign all zone BGP peers to user networks

* Dynamic: show BGP peer info of networks only for root admin

* AS number: disable specifyasnumber for non-NSX offerings

* Dynamic: pass bgppeer details to command and fix typo with ip6 addr

* Dynamic: list BGP peers by isdedicated, and fix change bgppeers for network/vpc

* Dynamic: add UI labels

* Dynamic: add bgp peers to vpc response

* Dynamic: list bgp peers by keyword, fix list by asnumber

* Dynamic: fix list bgppeers by keyword and db schema

* Dynamic: fix list bgppeers do not return dedicated peers

* Dynamic: update UI when create network/vpc offering

* Update server/src/main/java/com/cloud/configuration/ConfigurationManagerImpl.java

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* Update tools/marvin/setup.py

* Dynamic: network mode must be same when update a network with new offering

* Dynamic: add method networkModel.isAnyServiceSupportedInNetwork

* Dynamic: rename APIs and classes

* Dynamic: fix unit tests due to previous changes

* Dynamic: validateNetworkCidrSize when auto-create subnet

* Dynamic: check AS number overlap
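
An overlap check between two inclusive AS number ranges can be sketched as below. This is an assumption about what such a check looks like, not the logic from the commit; `ranges_overlap` is a hypothetical name.

```python
def ranges_overlap(start_a, end_a, start_b, end_b):
    """Return True if two inclusive integer ranges share at least one value.

    Hypothetical helper for illustration; assumes start <= end for each range.
    """
    return start_a <= end_b and start_b <= end_a
```

Two ranges that only touch at an endpoint, such as 65000-65010 and 65010-65020, count as overlapping because both bounds are inclusive.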

* Dynamic: add ActionEvent

* Dynamic: small code optimization

* Dynamic: fix ui bugs after api rename

* Dynamic: add marvin and test for ASN ranges and AS numbers

* Dynamic: add account setting use.system.bgp.peers

also
- change the default value of routed.ipv4.vpc.max.cidr.size and routed.ipv4.vpc.min.cidr.size
- change the category of settings

* static: fix ui error when delete zone ipv4 subnets

* static: small UI polish

* Dynamic: throw exception when as number is required but not passed

* Dynamic: fix typo when create FRR directory which causes network deletion failures

* Dynamic: connect to ALL (or ALL dedicated) BGP peers if no BGP peer mapping for the network/vpc

* Dynamic: throw exception when as number is required for VPC but not passed

* Dynamic: list bgp peers by useSystemBgpPeers

* Dynamic: fix frr config in VPC VR when change bgp peers

* Dynamic: create frr config even if there are no VPC tiers

* Dynamic: list bgp peers by zoneid (required for account) and account

* Dynamic: only apply FRR config for vpc tiers with dynamic routing

* Dynamic: do not send commands to router if commands size is 0

* Dynamic: fix 'new IPv6 address is not valid' when update bgp peer without IPv6

* Dynamic: throw exception if fail to allocate AS number when create network/vpc with dynamic routing

* Dynamic: enable ipv6 unicast and 'ip nht resolve-via-default'

* Dynamic: delete network/vpc if fail to allocate AS number when create network/vpc with dynamic routing

* test: add unit tests for ASN APIs

* test: add unit tests for core module

* test: add unit tests for API responses

* test: add unit tests for BgpPeerTO

* test: add minor changes

* test: add tests for create/delete/update/list RoutingFirewallRuleCmd

* Static: show ip4 routes for vpc tiers

* test: fix smoke test failure caused by type change of as number

* test: add test for Ipv4SubnetForZoneCmd

* test: add test for Ipv4SubnetForGuestNetworkCmd and BgpPeerCmd

* UI: do not show redundant router when network mode is ROUTED as RVR is not supported

* UI: hide 'Conserve mode' when networkmode is ROUTED

* test: add unit tests for ListASNumbersCmdTest

* Static: remove allocated IPv4 subnet when delete a network or vpc

* test: add unit tests for BgpPeersRules

* Dynamic: set ipv4routing from network offering

* server: list as numbers and ipv4 subnets by keyword

* server: remove dedicated bgp peers and ipv4 subnets when delete an account or domain

* server: fix dedicated ipv4 subnet is allocated to other accounts

* UI: fix allocated time format

* server: ignore project if projectid is -1 so bgppeers/ipv4subnets works in project view

* UI: add project column to bgp peers and ipv4 subnets

* server: fix list AS numbers by domain admin or normal user

* server: fix network creation when ipv4 subnet is dedicated

* UI: polish network.js

* Dynamic: fix frr config for ipv6 routing

* Static routing: support cks cluster

* Static: get/create IPv4 subnet from dedicated subnets at first

* Dynamic: add BGP peers tab

* Static: remove redundant loops

* api: add since to api and response

* server: add unit tests

---------

Co-authored-by: Nicolas Vazquez <nicovazquez90@gmail.com>
Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-09-06 08:55:17 +05:30
Fabricio Duarte
ede39d8edc
Configuration to disable URL validation when registering templates/ISOs (#8751) 2024-08-27 16:12:31 -03:00
Suresh Kumar Anaparti
3faf7cd2f1
Updating pom.xml version numbers for release 4.19.2.0-SNAPSHOT
Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-07-19 10:29:26 +05:30
Suresh Kumar Anaparti
9f4c895974
Updating pom.xml version numbers for release 4.19.1.0
Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-07-15 17:19:29 +05:30
Vishesh
0ec7c72875
Merge branch '4.19' 2024-07-01 12:41:45 +05:30
Abhisar Sinha
063dc60114
Change storage pool scope from Cluster to Zone and vice versa (#8875)
* New feature: Change storage pool scope

* Added checks for Ceph/RBD

* Update op_host_capacity table on primary storage scope change

* Storage pool scope change integration test

* pull 8875 : Addressed review comments

* Pull 8875: remove storage checks, AbstractPrimayStorageLifeCycleImpl class

* Pull 8875: Fixed integration test failure

* Pull 8875: Review comments

* Pull 8875: review comments + broke changeStoragePoolScope into smaller functions

* Added UT for changeStoragePoolScope

* Rename AbstractPrimaryDataStoreLifeCycleImpl to BasePrimaryDataStoreLifeCycleImpl

* Pull 8875: Dao review comments

* Pull 8875: Rename changeStoragePoolScope.vue to ChangeStoragePoolScope.vue

* Pull 8875: Created a new smoke test file + A single warning msg in ui

* Pull 8875: Added cleanup in test_primary_storage_scope.py

* Pull 8875: Typo in en.json

* Pull 8875: cleanup array in test_primary_storage_scope.py

* Pull:8875 Removing extra whitespace at eof of StorageManagerImplTest

* Pull 8875: Added UT for PrimaryDataStoreHelper and BasePrimaryDataStoreLifeCycleImpl

* Pull 8875: Added license header

* Pull 8875: Fixed sql query for vmstates

* Pull 8875: Changed icon plus info on disabled mode in apidoc

* Pull 8875: Change scope should not work for local storage

* Pull 8875: Change scope completion event

* Pull 8875: Added api findAffectedVmsForStorageScopeChange

* Pull 8875: Added UT for findAffectedVmsForStorageScopeChange and removed listByPoolIdVMStatesNotInCluster

* Pull 8875: Review comments + Vm name in response

* Pull 8875: listByVmsNotInClusterUsingPool was returning duplicate VM entries because of multiple volumes in the VM satisfying the criteria
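
The duplication described above arises when a VM is matched once per qualifying volume. A minimal sketch of collapsing such rows to one entry per VM follows; the row shape (`vm_id`, `volume_id` keys) and the function name are assumptions for illustration, and the actual fix may well have been made in the query itself.

```python
def unique_vms(rows):
    """Collapse (VM, volume) rows to one row per VM, keeping first-seen order.

    Hypothetical helper; each row is assumed to be a dict with a "vm_id" key.
    """
    seen = set()
    result = []
    for row in rows:
        if row["vm_id"] not in seen:
            seen.add(row["vm_id"])
            result.append(row)
    return result
```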

* Pull 8875: fixed listAffectedVmsForStorageScopeChange UT

* listAffectedVmsForStorageScopeChange should work if the pool is not disabled

* Fix listAffectedVmsForStorageScopeChangeTest UT

* Pull 8875: add volume.removed not null check in VmsNotInClusterUsingPool query

* Pull 8875: minor refactoring in changeStoragePoolScopeToCluster

* Update server/src/main/java/com/cloud/storage/StorageManagerImpl.java

* fix eof

* changeStoragePoolScopeToZone should connect pool to all Up hosts

Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2024-06-29 10:03:34 +05:30
Suresh Kumar Anaparti
2ca1b474bd
PowerFlex/ScaleIO SDC client connection improvements (#9268)
* Mitigation for non-scalable Powerflex/ScaleIO clients
- Added ScaleIOSDCManager to manage SDC connections, checks clients limit, prepare and unprepare SDC on the hosts.
- Added commands for prepare and unprepare storage clients to prepare/start and stop SDC service respectively on the hosts.
- Introduced config 'storage.pool.connected.clients.limit' at storage level for client limits, currently support for Powerflex only.

* tests issue fixed

* refactor / improvements

* lock with powerflex systemid while checking connections limit

* updated powerflex systemid lock to hold till sdc preparation

* Added custom stats support for storage pool, through listStoragePools API

* code improvements, and unit tests

* unit tests fixes

* Update config 'storage.pool.connected.clients.limit' to dynamic, and some improvements

* Stop SDC on host after migration if no volumes mapped to host

* Wait for SDC to connect after scini service start, and some log improvements

* Do not throw exception (log it) when SDC is not connected while revoking access for the powerflex volume

* some log improvements
2024-06-29 10:01:50 +05:30
Vishesh
90fe1d5fdc
Merge branch '4.19' 2024-06-29 03:35:24 +05:30