* api,agent,server,engine-schema: scalability improvements
Following changes and improvements have been added:
- Improvements in handling of PingRoutingCommand
1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
3. Optimized scanning stalled VMs
- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`
- Added caching framework based on Caffeine in-memory caching library, https://github.com/ben-manes/caffeine
- Added caching for account/use role API access with expiration after write can be configured using config - `dynamic.apichecker.cache.period`. If set to zero then there will be no caching. Default is 0.
- Added caching for account/use role API access with expiration after write set to 60 seconds.
- Added caching for some recurring DB retrievals
1. CapacityManager - listing service offerings - beneficial in host capacity calculation
2. LibvirtServerDiscoverer existing host for the cluster - beneficial for host joins
3. DownloadListener - hypervisors for zone - beneficial for host joins
5. VirtualMachineManagerImpl - VMs in progress- beneficial for processing stalled VMs during PingRoutingCommands
- Optimized MS list retrieval for agent connect
- Optimize finding ready systemvm template for zone
- Database retrieval optimisations - fix and refactor for cases where only IDs or counts are used mainly for hosts and other infra entities. Also similar cases for VMs and other entities related to host concerning background tasks
- Changes in agent-agentmanager connection with NIO client-server classes
1. Optimized the use of the executor service
2. Refactore Agent class to better handle connections.
3. Do SSL handshakes within worker threads
5. Added global configs to control the behaviour depending on the infra. SSL handshake could be a bottleneck during agent connections. Configs - `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` can be used to control number of new connections management server handles at a time. `agent.ssl.handshake.timeout` can be used to set number of seconds after which SSL handshake times out at MS end.
6. On agent side backoff and sslhandshake timeout can be controlled by agent properties. `backoff.seconds` and `ssl.handshake.timeout` properties can be used.
- Improvements in StatsCollection - minimize DB retrievals.
- Improvements in DeploymentPlanner allow for the retrieval of only desired host fields and fewer retrievals.
- Improvements in hosts connection for a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads that can be used to connect hosts to a storage pool. Worker thread approach is followed currently only for NFS and ScaleIO pools.
- Minor improvements in resource limit calculations wrt DB retrievals
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* test1, domaindetails, capacitymanager fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* test2 - agent tests
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* capacitymanagertest fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* change
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix missing changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address comments
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* revert marvin/setup.py
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix indent
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* use space in sql
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address duplicate
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* update host logs
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* revert e36c6a5d07
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix npe in capacity calculation
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* move schema changes to 4.20.1 upgrade
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* build fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address comments
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix build
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* add some more tests
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* checkstyle fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* remove unnecessary mocks
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* build fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* replace statics
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* engine/orchestration,utils: limit number of concurrent new agent
connections
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor - remove unused
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* unregister closed connections, monitor & cleanup
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* add check for outdated vm filter in power sync
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* agent: synchronize sendRequest wait
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
---------
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Improve logging to include more identifiable information for kvm plugin
* Update logging for scaleio plugin
* Improve logging to include more identifiable information for default volume storage plugin
* Improve logging to include more identifiable information for agent managers
* Improve logging to include more identifiable information for Listeners
* Replace ids with objects or uuids
* Improve logging to include more identifiable information for engine
* Improve logging to include more identifiable information for server
* Fixups in engine
* Improve logging to include more identifiable information for plugins
* Improve logging to include more identifiable information for Cmd classes
* Fix toString method for StorageFilterTO.java
* cli changes to update user/account, list by apikeyaccess, domain level setting
* UI changes for updating user/account and searchfilter in listview
* make the api parameters and setting accessible only to root admin
* revert changes to ui/package-lock.json
* minor changes to description strings
* UT for ApiServer and AccountManagerImpl classes
* fix pre-commit failure
* Added a constant for the string System
* UT for searchForUsers and searchForAccounts
* Fix marvin test error
* Update schema to use idempotent add column
* Fix `updateTemplatePermission` when the UI is set to a language other than English (#9766)
* Fix updateTemplatePermission UI in non-english language
* Improve fix
---------
Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
* Added user name uuid to logging
* Add events when api key access is changed via api or config setting
* fix the userid for api key access update event
* Fix ut failure after event logging
* Convert drop down to radio-button in edit user and account
* Add ApiKeyAccess status in User InfoCard for Users if Api key is generated
* Return apiKeyAccess in user and account response only for Root Admin
* fixed noredist build failure
* Show apikeyaccess on the left panel in the user view for root admins as well
* don't show divider if apiKeyAccess is not shown to user
* Fix events generated to set Username, Account and Domain of the caller correctly
* cli changes to update user/account, list by apikeyaccess, domain level setting
* UI changes for updating user/account and searchfilter in listview
* make the api parameters and setting accessible only to root admin
* revert changes to ui/package-lock.json
* minor changes to description strings
* UT for ApiServer and AccountManagerImpl classes
* fix pre-commit failure
* Added a constant for the string System
* UT for searchForUsers and searchForAccounts
* Fix marvin test error
* Update schema to use idempotent add column
* Added user name uuid to logging
* Add events when api key access is changed via api or config setting
* fix the userid for api key access update event
* Fix ut failure after event logging
* Convert drop down to radio-button in edit user and account
* Add ApiKeyAccess status in User InfoCard for Users if Api key is generated
* Return apiKeyAccess in user and account response only for Root Admin
* fixed noredist build failure
* Show apikeyaccess on the left panel in the user view for root admins as well
* don't show divider if apiKeyAccess is not shown to user
* Fix events generated to set Username, Account and Domain of the caller correctly
* Added DB upgrade path from 42000 to 42010
---------
Co-authored-by: Daan Hoogland <daan@onecht.net>
Co-authored-by: Lucas Martins <56271185+lucas-a-martins@users.noreply.github.com>
Co-authored-by: Lucas Martins <lucas.martins@scclouds.com.br>
* Improvement: management server peer states
* Update pr9885: consider new mgmt server node which has msId=managementServerNodeId
* Update pr9885: update global config description
* Update pr9885: update label on UI
* framework: Do not update mshost_peer when mgmt server is Up as it will be updated by status update
* mgmt: Update state to Up when mgmt server writes heartbeat to db
* mgmt: change Service IP to Management IP
---------
Co-authored-by: Boris Stoyanov - a.k.a Bobby <bss.stoyanov@gmail.com>
Per docs, if the mysql connector is JDBC2 compliant then it should use
the Connection.isValid API to test a connection.
(https://docs.oracle.com/javase/8/docs/api/java/sql/Connection.html#isValid-int-)
This would significantly reduce query lags and API throughput, as for
every SQL query one or two SELECT 1 are performed everytime a Connection
is given to application logic.
This should only be accepted when the driver is JDBC4 complaint.
As per the docs, the connector-j can use /* ping */ before calling
SELECT 1 to have light weight application pings to the server:
https://dev.mysql.com/doc/connector-j/en/connector-j-usagenotes-j2ee-concepts-connection-pooling.html
Replaces dbcp2 connection pool library with more performant HikariCP.
With this unit tests are failing but build is passing.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohityadav89@gmail.com>
Added caching for ConfigKey value retrievals based on the Caffeine
in-memory caching library.
https://github.com/ben-manes/caffeine
Currently, expire time for a cache is 30s and each update of the
config key invalidates the cache. On any update or reset of the
configuration, cache automatically invalidates for it.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* reface quotaTariffList process and add listOnlyRemoved parameter
* add unit tests for createQuotaTariffResponse and isUserAllowedToSeeActivationRules methods
* update QuotaTariffListCmdTest
* refactor quota tariffs creation
* refactor quota tariffs update
* fix unit test in JsInterpreter
* remove unused import
* refactor quota listing and add quota deletion
* add functionality to create tariff from UI, not working when specifying dates
* fix date parsing
* add labels
* fix details view of tariffs
* new update tariff view
* fix filter placeholder
* remove debug html
* add labels
* make value field to be required when updating a tariff
* add labels
* add portuguese labels
* remove unused label
* fix updating tariff when there was no enddate specified
* refactor dates
* refactor dates
* clear code
* update disabled dates in date picker
* clear ListView component
* fix unnecessary updates when the new end date was equal to the exising end date
* fix when today was selected to start date
* add keyword to filter
* change usage type response
* add keyword and usagetype filter on UI
* fix disabled end dates in date picker
* modify datepickers to use datetime
* small fixes
* make value an unrequired field on update form
* remove duplicate import
* remove unused css classes
* add UI support for position parameter
* resize input fields to fill all available horizontal space
* remove console.log()
* remove unnecessary fully qualified names
* replace `usagetypeid` property name to `id` on `listUsageTypes` API call
* replace `usagetypeid` property name to `id` on `listUsageTypes` API call
* Added API arg validator for RFC compliance domain name, to validate VM's host name
* Added unit tests for vm host/domain name validation
* Don't send sql exception/query from dao to upper layer, log it and send only the error message
* Updated user resources name / displayname(/text) column's charset to utf8mb4 to support emojis / unicode chars
* Check and update char set for affinity group name to utf8mb4, from the data migration in upgrade path
* Added smoke test to check resource name for vm, volume, service & disk offering, template, iso, account(first/lastname)
* Updated resource annotation charset to utf8mb4
* Updated some resources description charset to utf8mb4
* Updated sql stmt with constant
* Updated modify columns char set with idempotent procedure
* Removed delimiter (for creating procedures)
- mTLS implementation for cluster service communication
- Listen only on the specified cluster node IP address instead of all interfaces
- Validate incoming cluster service requests are from peer management servers based on the server's certificate dns name which can be through global config - ca.framework.cert.management.custom.san
- Hardening of KVM command wrapper script exeicution
- Improve API server integration port check
- cloudstack-management.default: don't have JMX configuration if not needed. JMX is used for instrumentation; users who need to use it should enable it explicitly
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Updates to change PUre and Primera to host-centric vlun assignments; various small bug fixes
* update to add timestamp when deleting pure volumes to avoid future conflicts
* update to migrate to properly check disk offering is valid for the target storage pool
* Updates to change PUre and Primera to host-centric vlun assignments; various small bug fixes
* update to add timestamp when deleting pure volumes to avoid future conflicts
* update to migrate to properly check disk offering is valid for the target storage pool
* improve error handling when copying volumes to add precision to which step failed
* rename pure volume before delete to avoid conflicts if the same name is used before its expunged on the array
* remove dead code in AdaptiveDataStoreLifeCycleImpl.java
* Fix issues found in PR checks
* fix session refresh TTL logic
* updates from PR comments
* logic to delete by path ONLY on supported OUI
* fix to StorageSystemDataMotionStrategy compile error
* change noisy debug message to trace message
* fix double callback call in handleVolumeMigrationFromNonManagedStorageToManagedStorage
* fix for flash array delete error
* fix typo in StorageSystemDataMotionStrategy
* change copyVolume to use writeback to speed up copy ops
* remove returning PrimaryStorageDownloadAnswer when connectPhysicalDisk returns false during KVMStorageProcessor template copy
* remove change to only set UUID on snapshot if it is a vmSnapshot
* reverting change to UserVmManagerImpl.configureCustomRootDiskSize
* add error checking/simplification per comments from @slavkap
* Update engine/storage/datamotion/src/main/java/org/apache/cloudstack/storage/motion/StorageSystemDataMotionStrategy.java
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
* address PR comments from @sureshanaparti
---------
Co-authored-by: GLOVER RENE <rg9975@cs419-mgmtserver.rg9975nprd.app.ecp.att.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
This PR introduces the functionality of purging removed DB entries for CloudStack entities (currently only for VirtualMachine). There would be three mechanisms for purging removed resources:
Background task - CloudStack will run a background task which runs at a defined interval. Other parameters for this task can be controlled with new global settings.
API - New admin-only API purgeExpungedResources. It will allow passing the following parameters - resourcetype, batchsize, startdate, enddate. Currently, API is not supported in the UI.
Config for service offering. Service offerings can be created with purgeresources parameter which would allow purging resources immediately on expunge.
Following new global settings have been added:
expunged.resources.purge.enabled: Default: false. Whether to run a background task to purge the expunged resources
expunged.resources.purge.resources: Default: (empty). A comma-separated list of resource types that will be considered by the background task to purge the expunged resources. Currently only VirtualMachine is supported. An empty "value will result in considering all resource types for purging
expunged.resources.purge.interval: Default: 86400. Interval (in seconds) for the background task to purge the expunged resources
expunged.resources.purge.delay: Default: 300. Initial delay (in seconds) to start the background task to purge the expunged resources task.
expunged.resources.purge.batch.size: Default: 50. Batch size to be used during expunged resources purging.
expunged.resources.purge.start.time: Default: (empty). Start time to be used by the background task to purge the expunged resources. Use format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.
expunged.resources.purge.keep.past.days: Default: 30. The number of days in the past from the execution time of the background task to purge the expunged resources for which the expunged resources must not be purged. To enable purging expunged resource till the execution of the background task, set the value to zero.
expunged.resource.purge.job.delay: Default: 180. Delay (in seconds) to execute the purging of an expunged resource initiated by the configuration in the offering. Minimum value should be 180 seconds and if a lower value is set then the minimum value will be used.
Documentation PR: apache/cloudstack-documentation#397
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
* Add API for listing Quota preset variables
* Add new line at EOF
* Address review
* Remove usage types
* Remove usage types from quotatypes
* Remove unused imports
* Add space for preset variable definition description
Co-authored-by: Bernardo De Marco Gonçalves <bernardomg2004@gmail.com>
---------
Co-authored-by: Bernardo De Marco Gonçalves <bernardomg2004@gmail.com>
This adds a NPE check on the s_depot.global() which can cause NPE in
case of unit tests, where s_depot is not null but the underlying config
dao is null (not mocked or initialised) via `s_depot.global()` becomes
null.
This reverts commit 5f73172bcbe975e4ef416e525dc95bad63fa6d3a.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Co-authored-by: Bryan Lima <bryan.lima@hotmail.com>
Co-authored-by: SadiJr <sadi@scclouds.com.br>
Co-authored-by: Bryan Lima <42067040+BryanMLima@users.noreply.github.com>
Co-authored-by: Henrique Sato <henriquesato2003@gmail.com>
Add a global setting to control whether redirection is allowed while
downloading templates and volumes
core: some changes on SimpleHttpMultiFileDownloader
similar as HttpTemplateDownloader
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit b1642bc3bf58ccde9f56f632b5a9fe46a3eb5356)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>