* Update last agents during ms maintenance, and some code improvements
* Send 503 (Service Unavailable) response status when maintenance or shutdown is initiated
[Any load balancer in the clustered environment can avoid routing requests to this MS node]
* Migrate systemvm agents before routing host agents, and some code improvements
* Added events for ms maintenance and shutdown operations
* Added the following ms maintenance and shutdown improvements
- block new agent connections during prepare for maintenance of ms
- maintain avoids ms list
- propagate updated management servers list and lb algorithm in host and indirect.agent.lb.algorithm settings respectively, to systemvm (non-routing) agents
- updated setup ms list and migrate agent connections to executor service
- migrate agent connection through executor, and send the answer to the ms host that initiated the migration
- re-initialize ssl handshake executor if it is shutdown
- don't allow prepare for maintenance or shutdown when other management server nodes are in preparing states
- don't allow trigger shutdown when management server is up and other management server nodes are in preparing states
- stop agent connections monitor on ms maintenance
- update avoid ms list in ready command
- updated connected host from the client connection
- update last agents in ms metrics from the database
- updated some agent config descriptions
- update last management server in the hosts during shutdown
- added agents and lastagents in management server response
- updated management server maintenance & shutdown unit tests
- some code improvements
* refactored code / addressed comments
* removed shutdown testcase (maybe, calling System.exit)
* Revert "removed shutdown testcase (maybe, calling System.exit)"
This reverts commit e14b0717152ef6c8be102d61c80f42803a53172e.
* avoid system.exit during shutdown test
* code improvements
* testcase fix
* Fix cutoff time in agent connections monitor thread
* NAS B&R Plugin enhancements
* Prevent printing mount opts which may include password by removing from response
* revert marvin change
* add sanity checks to validate minimum qemu and libvirt versions
* check is user running script is part of libvirt group
* revert changes of retore expunged VM
* add code coverage ignore file
* remove check
* issue with listing schedules and add defensive checks
* redirect logs to agent log file
* add some more debugging
* remove test file
* prevent deletion of cks cluster when vms associated to a backup offering
* delete all snapshot policies when bkp offering is disassociated from a VM
* Fix `updateTemplatePermission` when the UI is set to a language other than English (#9766)
* Fix updateTemplatePermission UI in non-english language
* Improve fix
---------
* Add nobrl in the mountopts for cifs file system
* Fix restoration of VM / volumes with cifs
* add cifs utils for el8
* add cifs-utils for ubuntu cloudstack-agent
* syntax error
* remove required constraint on both vmid and id params for the delete bkp schedule command
Doc PR : https://github.com/apache/cloudstack-documentation/pull/461
This PR fixes https://github.com/apache/cloudstack/issues/8638
== Description
Four new Resource Types have been added. Admin can configure corresponding resource limits for the tenants at different levels (domain, account, project)
User dashboard's Storage section will show the new resources, their limits and current usage.
1. backup - No. of backups used by the account
2. backup_storage - Backup storage allocated for the account
3. bucket - No. of buckets used by the accounts
4. object_storage - Object storage allocated for the account.
Some other related changes done to BnR framework:
1. Maximum number of Backups to retain can be specified while creating Backup schedules, similar to Scheduled snapshots.
2. Oldest Scheduled backup of the same interval type will be deleted once the number reaches the configured max Backups value.
3. Code refactor: Moved syncBackups method from BackupProvider to the framework BackupManagerImpl, as it is a common functionality and all providers were using duplicated code.
Changes done to the Object Storage Framework
1. Quota parameter is made mandatory while creating a bucket. Bucket quota is considered to be the allocated space and will be used to enforce Resource limits.
== Schema Changes:
1. New Column `max_backups` added to `backup_schedule` table
4. New Column `backup_interval_type` added to `backups` table
== Api Changes:
1. createBackup: new Parameter `scheduleid`. It should be specified whenever a scheduled backup is created. This will translate to the `backup_interval_type` in the `backups` table.
3. createBackupScheduke: new Parameter `max_backups`. To specify maximum number of backups to retain for the given schedule.
== Configurations:
|Setting |Scope |Default Value |Description|
|-------|--------|--------------|-----------|
|backup.max.hourly |Global |8 |Maximum recurring hourly backups to be retained for an instance|
|backup.max.daily |Global |8 |Maximum recurring daily backups to be retained for an instance|
|backup.max.weekly |Global |8 |Maximum recurring weekly backups to be retained for an instance|
|backup.max.monthly |Global |8 |Maximum recurring monthly backups to be retained for an instance|
|max.account.backups| Global| 20 | The default maximum number of backups that can be created for an account|
|max.account.backup.storage| Global| 400 | The default maximum backup storage space (in GiB) that can be used for an account|
|max.domain.backups| Global| 40 | The default maximum number of backups that can be created for an domain|
|max.domain.backup.storage| Global| 800 | The default maximum backup storage space (in GiB) that can be used for an domain|
|max.project.backups| Global| 20 | The default maximum number of backups that can be created for an project|
|max.project.backup.storage| Global| 400 | The default maximum backup storage space (in GiB) that can be used for an project|
|Setting |Scope |Default Value |Description|
|-------|--------|--------------|-----------|
|max.account.buckets| Global| 20 | The default maximum number of buckets that can be created for an account|
|max.account.object.storage| Global| 400 | The default maximum object storage space (in GiB) that can be used for an account|
|max.domain.buckets| Global| 40 | The default maximum number of buckets that can be created for an domain|
|max.domain.object.storage| Global| 800 | The default maximum object storage space (in GiB) that can be used for an domain|
|max.project.buckets| Global| 20 | The default maximum number of buckets that can be created for an project|
|max.project.object.storage| Global| 400 | The default maximum object storage space (in GiB) that can be used for an project|
Co-authored-by: Daan Hoogland <daan@onecht.net>
Co-authored-by: Lucas Martins <56271185+lucas-a-martins@users.noreply.github.com>
Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Show Usage Server configuration in a separate pane
* UI: Option to attach volume to an instance during create volume
* Show service ip in management server details tab
* change Schedule Snapshots to Recurring Snapshots
* Change the hypervisor order so that kvm, vmware, xenserver show up first
* Remove extra space in hypervisor names in config.java
* Fix `updateTemplatePermission` when the UI is set to a language other than English (#9766)
* Fix updateTemplatePermission UI in non-english language
* Improve fix
---------
Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
* Autofill vcenter details in add cluster form
* UI: condition to display create vm-vol-snapshots to same as create vol-snapshots
* Fix alignment on wrapping in global settings tabs
* rename Autofill vCenter credentials to Autofill vCenter credentials from Zone
* Rename Service Ip to Ip Address in management server response
* Change description of kvm.snapshot.enabled to say that it applies to volume snapshots
* Return error when kvm vm snapshot is taken withoutsnapshot memory
* Minor naming changes and grammar
* Fix tooltip for attach volume to instance button
* Show Usage Server configuration in a separate pane
* UI: Option to attach volume to an instance during create volume
* Show service ip in management server details tab
* change Schedule Snapshots to Recurring Snapshots
* Change the hypervisor order so that kvm, vmware, xenserver show up first
* Remove extra space in hypervisor names in config.java
* Autofill vcenter details in add cluster form
* UI: condition to display create vm-vol-snapshots to same as create vol-snapshots
* Fix alignment on wrapping in global settings tabs
* rename Autofill vCenter credentials to Autofill vCenter credentials from Zone
* Rename Service Ip to Ip Address in management server response
* Change description of kvm.snapshot.enabled to say that it applies to volume snapshots
* Return error when kvm vm snapshot is taken withoutsnapshot memory
* Minor naming changes and grammar
* Fix tooltip for attach volume to instance button
* Show Usage Server configuration in a separate pane
* UI: Option to attach volume to an instance during create volume
* Show service ip in management server details tab
* change Schedule Snapshots to Recurring Snapshots
* Change the hypervisor order so that kvm, vmware, xenserver show up first
* Remove extra space in hypervisor names in config.java
* Autofill vcenter details in add cluster form
* UI: condition to display create vm-vol-snapshots to same as create vol-snapshots
* Fix alignment on wrapping in global settings tabs
* rename Autofill vCenter credentials to Autofill vCenter credentials from Zone
* Rename Service Ip to Ip Address in management server response
* Change description of kvm.snapshot.enabled to say that it applies to volume snapshots
* Return error when kvm vm snapshot is taken withoutsnapshot memory
* Minor naming changes and grammar
* Fix tooltip for attach volume to instance button
* UI: Option to attach volume to an instance during create volume
* UI: condition to display create vm-vol-snapshots to same as create vol-snapshots
* moved db changes from 41900to42000 to 42000to42010
* Update group_id in already present usage configuration settings
* remove "schedule" from message in create Recurring Snapshots form
* Update server/src/main/java/com/cloud/vm/snapshot/VMSnapshotManagerImpl.java
---------
Co-authored-by: Daan Hoogland <daan@onecht.net>
Co-authored-by: Lucas Martins <56271185+lucas-a-martins@users.noreply.github.com>
Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
Co-authored-by: Boris Stoyanov - a.k.a Bobby <bss.stoyanov@gmail.com>
Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
* api,agent,server,engine-schema: scalability improvements
Following changes and improvements have been added:
- Improvements in handling of PingRoutingCommand
1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
3. Optimized scanning stalled VMs
- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`
- Added caching framework based on Caffeine in-memory caching library, https://github.com/ben-manes/caffeine
- Added caching for account/use role API access with expiration after write can be configured using config - `dynamic.apichecker.cache.period`. If set to zero then there will be no caching. Default is 0.
- Added caching for account/use role API access with expiration after write set to 60 seconds.
- Added caching for some recurring DB retrievals
1. CapacityManager - listing service offerings - beneficial in host capacity calculation
2. LibvirtServerDiscoverer existing host for the cluster - beneficial for host joins
3. DownloadListener - hypervisors for zone - beneficial for host joins
5. VirtualMachineManagerImpl - VMs in progress- beneficial for processing stalled VMs during PingRoutingCommands
- Optimized MS list retrieval for agent connect
- Optimize finding ready systemvm template for zone
- Database retrieval optimisations - fix and refactor for cases where only IDs or counts are used mainly for hosts and other infra entities. Also similar cases for VMs and other entities related to host concerning background tasks
- Changes in agent-agentmanager connection with NIO client-server classes
1. Optimized the use of the executor service
2. Refactore Agent class to better handle connections.
3. Do SSL handshakes within worker threads
5. Added global configs to control the behaviour depending on the infra. SSL handshake could be a bottleneck during agent connections. Configs - `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` can be used to control number of new connections management server handles at a time. `agent.ssl.handshake.timeout` can be used to set number of seconds after which SSL handshake times out at MS end.
6. On agent side backoff and sslhandshake timeout can be controlled by agent properties. `backoff.seconds` and `ssl.handshake.timeout` properties can be used.
- Improvements in StatsCollection - minimize DB retrievals.
- Improvements in DeploymentPlanner allow for the retrieval of only desired host fields and fewer retrievals.
- Improvements in hosts connection for a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads that can be used to connect hosts to a storage pool. Worker thread approach is followed currently only for NFS and ScaleIO pools.
- Minor improvements in resource limit calculations wrt DB retrievals
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* test1, domaindetails, capacitymanager fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* test2 - agent tests
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* capacitymanagertest fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* change
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix missing changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address comments
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* revert marvin/setup.py
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix indent
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* use space in sql
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address duplicate
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* update host logs
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* revert e36c6a5d07
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix npe in capacity calculation
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* move schema changes to 4.20.1 upgrade
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* build fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address comments
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix build
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* add some more tests
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* checkstyle fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* remove unnecessary mocks
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* build fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* replace statics
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* engine/orchestration,utils: limit number of concurrent new agent
connections
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor - remove unused
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* unregister closed connections, monitor & cleanup
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* add check for outdated vm filter in power sync
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* agent: synchronize sendRequest wait
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
---------
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Support for Management Server Maintenance
- New APIs: prepareForMaintenance and cancelMaintenance, with required parameter - managementserverid.
- New management server states for maintenance: PreparingForMaintenance, Maintenance.
- listHosts API with optional parameter – managementserverid, to list the hosts connected to the management server.
- Support management server maintenance when more than one active management servers available.
- Triggers transfer agents to other available management servers for maintenance, new agent command MigrateAgentConnectionCommand to initiate transfer of indirect agents.
- New global config 'management.server.maintenance.timeout', to set the timeout (in mins) for the management server maintenance window, default: 60 mins.
- UI changes: Prepare and Cancel Maintenance in Management Server section, Connected Agents tab, New fields for hosts and management servers.
* Updated pending jobs check timer task with ScheduledExecutorService
* keep maintenance state on trigger shutdown call when ms is in maintenance
* add pending jobs count to ms response
* during ms heartbeat, update state to up only when it's down
* allow vm work jobs of async job created before prepare for maintenance
* Revert "keep maintenance state on trigger shutdown call when ms is in maintenance"
This reverts commit 607e13364679eac897f4d146bb3325ea7a61ba17.
* skip maintenance test when multiple management servers are not available, and not configured in host setting for kvm
* 4.20:
Rollback of changes with errors during the VM assign (#7061)
[VMware] Consider CD/DVD drive when calculating next free unit number for volume attachment over IDE controller (#9644)
consider a valid ipv4 address as a validish ipv4 /32 cidr (#10174)
Co-authored-by: Stephan Krug <stephan.krug@scclouds.com.br>
Co-authored-by: Gabriel <gabriel.fernandes@scclouds.com.br>
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
* 4.20:
merge errors fixed
Restrict the migration of volumes attached to VMs in Starting state (#9725)
server, plugin: enhance storage stats for IOPS (#10034)
Introducing granular command timeouts global setting (#9659)
Improve logging to include more identifiable information (#9873)
Adds framework layer change to allow retrieving and storing IOPS stats for storage pools. Custom `PrimaryStoreDriver` can implement method - `getStorageIopsStats` for returning IOPS stats. Existing method `getUsedIops` can also be overridden by such plugins when only used IOPS is returned.
For testing purpose, implementation has been added for simulator hypervisor plugin to return capacity and used IOPS for a pool.
For local storage pool, implementation has been added using iostat to return currently used IOPS.
StoragePoolResponse class has been updated to return IOPS values which allows showing IOPS values in UI for different storage pool related views and APIs.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Improve logging to include more identifiable information for kvm plugin
* Update logging for scaleio plugin
* Improve logging to include more identifiable information for default volume storage plugin
* Improve logging to include more identifiable information for agent managers
* Improve logging to include more identifiable information for Listeners
* Replace ids with objects or uuids
* Improve logging to include more identifiable information for engine
* Improve logging to include more identifiable information for server
* Fixups in engine
* Improve logging to include more identifiable information for plugins
* Improve logging to include more identifiable information for Cmd classes
* Fix toString method for StorageFilterTO.java
* 4.20:
VR: apply iptables rules when add/remove static routes (#10064)
Certificate and VM hostname validation improvements (#10051)
set ulimit for server according to redhat spec (#10040)
kvm-storage: provide isVMMigrate information to storage plugins (#10093)
Allow config drive deletion of migrated VM, on host maintenance (#10045)
linstor: improve heartbeat check with also asking linstor (#10105)
server: simplify role change validation (#9173)
UI: create VPC network offering with conserve mode (#10082)
server: fix typo removeaccessvpn in VirtualRouterElement (#10086)
UI: remove duplicated Instance Name in Public IP details page (#10087)
UI: Fixes in the Usage UI (#10000)
SAML2: add cookie with HttpOnly too #10013 (#10047)
ui: Allow font-awesome icon usage and optimise icon size inconsistency (#9744)
* 4.20:
UI: Fix userdata and load balancer selection (#10016)
Prevent password updates for SAML and LDAP users (#9999)
cloudstack-migrate-databases: sql AND added (#10033)
engine/schema: move SQLs to 4.20.0 to 4.20.1 upgrade (#10018)
Remove user from project before deletion (#10008)
Simplify validation for creating volume templates via UI (#9828)