cloudstack

mirror of https://github.com/apache/cloudstack.git synced 2025-11-03 04:12:31 +01:00

Author	SHA1	Message	Date
Abhishek Kumar	47333a7077	Merge remote-tracking branch 'apple/apple-base418' into scalability-improvements	2024-10-11 13:37:12 +05:30
Wei Zhou	85076cb0f8	Resize volume: add pool capacity disablethreshold for resize and allow volume auto migration (#492 ) * server: add global settings for volume resize * resizeVolume: support automigrate * server: fix build errors as it is backported from 4.20/main * Address Suresh's comments * Update api/src/main/java/org/apache/cloudstack/api/command/user/volume/ResizeVolumeCmd.java Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com> * Apple issue-299: address Suresh's comments * Update api/src/main/java/org/apache/cloudstack/api/command/user/volume/ResizeVolumeCmd.java * UI: add autoMigrate to resizeVolume --------- Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>	2024-10-09 16:36:33 +05:30
Abhishek Kumar	a5d02665b4	changes for host retrieval from db Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2024-10-07 16:21:06 +05:30
Abhishek Kumar	0ca8722c38	Merge remote-tracking branch 'apple/scalability-improvements' into scalability-improvements-fixes	2024-09-23 14:47:25 +05:30
Abhishek Kumar	1d0b90f984	Merge remote-tracking branch 'apple/apple-base418' into scalability-improvements	2024-09-23 14:45:21 +05:30
Abhishek Kumar	df137fc387	refactor Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2024-09-10 10:22:59 +05:30
mprokopchuk	e0d6066935	Bumped pom version to 4.18.1.2 (to add migration SQL script)	2024-08-15 17:55:00 -07:00
Abhishek Kumar	5e98405b38	Merge remote-tracking branch 'apple/apple-base418' into scalability-improvements	2024-07-22 16:12:19 +05:30
Suresh Kumar Anaparti	be87b1a668	FR74: Mitigation for non-scalable ScaleIO clients (#447 ) * Mitigation for non-scalable Powerflex/ScaleIO clients - Added ScaleIOSDCManager to manage SDC connections, checks clients limit, prepare and unprepare SDC on the hosts. - Added commands for prepare and unprepare storage clients to prepare/start and stop SDC service respectively on the hosts. - Introduced config 'storage.pool.connected.clients.limit' at storage level for client limits, currently support for Powerflex only. * tests issue fixed * refactor / improvements * lock with powerflex systemid while checking connections limit * updated powerflex systemid lock to hold till sdc preparation * Added custom stats support for storage pool, through listStoragePools API * code improvements, and unit tests * Update config 'storage.pool.connected.clients.limit' to dynamic, and some improvements * Stop SDC on host after migration if no volumes mapped to host * Wait for SDC to connect after scini service start, and some log improvements * Do not throw exception (log it) when SDC is not connected while revoking access for the powerflex volume * some log improvements	2024-06-27 18:47:50 +05:30
Vishesh	c2de75744e	kvm: Add support for cgroupv2 (#8252 ) (#459 ) * kvm: Add support for cgroupv2 (#8252) 1. Problem description In Apache CloudStack (ACS), when a VM is deployed in a host with the KVM hypervisor, an XML file is created in the assigned host, which has a property shares that defines the weight of the VM to access the host CPU. The value of this property has no unit, and it is a relative measure to calculate how much CPU a given VM will have in the host. However, this value has a limit, which depends on the version of cgroup utilized by the host's kernel. The problem lies at the range value of shares that varies between both versions: [2, 262144] for cgroups version 1; and [1, 10000] for cgroups version 2. Currently, ACS calculates the value of shares using Equation 1, presented below, where CPU is the number of cores and speed is the CPU frequency; both specified in the VM's compute offering. Therefore, if a compute offering has, for example, 6 cores at 2 GHz, the shares value will be 12000 and an exception will be thrown by libvirt if the host utilizes cgroup v2. The second version is becoming the default one in current Linux distributions; thus, it is necessary to address this limitation. Equation 1 shares = CPU * speed Fixes: #6744 2. Proposed changes To address the problem described, we propose to apply a scale conversion considering the max shares of the host. Using the same formula currently utilized by ACS, it is possible to calculate the maximum shares of a VM for a given host. In other words, using the number of cores and the nominal speed of the host's CPU as the upper limit of shares allowed to a VM. Then, this value will be scaled to the allowed interval of [1, 10000] of cgroup v2 by using a linear scale conversion. 
The VM shares would be calculated as Equation 2, presented below, where VM requested shares is the requested shares value calculated using Equation 1, cgroup upper limit is fixed with a value of 10000 (cgroups v2 upper limit), and host max shares is the maximum shares value of the host, calculated using Equation 1. Using Equation 2, the only case where a VM passes the cgroup v2 limit is when the user requests more resources than the host has, which is not possible with the current implementation of ACS. Equation 2 shares = (VM requested shares * cgroup upper limit)/host max shares To implement the proposal, the following APIs will be updated: deployVirtualMachine, migrateVirtualMachine and scaleVirtualMachine. When a VM is being deployed, a new verification will be added to find a suitable host. The max shares of each host will be calculated, and the VM calculated shares will be verified if it does not surpass the host's value. Likewise, the migration of VMs will have a similar new verification. Lastly, the scale of VMs will also have the same verification for the VM's host. To determine the max shares of a given host, we will use the same equation currently used in ACS for calculating the shares of VMs, presented in Section 1. When Equation 1 is used to determine the maximum shares of a host, CPU is the number of cores of the host, and speed is the nominal CPU speed, i.e., considering the CPU's base frequency. It is important to note that these changes are only for hosts with the KVM hypervisor using cgroup v2 for now. * Update overcommit ratio during live VM migration * minor refactoring --------- Co-authored-by: Bryan Lima <42067040+BryanMLima@users.noreply.github.com>	2024-06-27 12:22:17 +05:30
Abhishek Kumar	8f88103a29	FR72 - api,server: purge expunged resources (#405 ) This PR introduces the functionality of purging removed DB entries for CloudStack entities (currently only for VirtualMachine). There would be three mechanisms for purging removed resources: - Background task - CloudStack will run a background task which runs at a defined interval. Other parameters for this task can be controlled with new global settings. - API - New API `purgeExpungedResources`. It will allow passing the following parameters - resourcetype, batchsize, startdate, enddate - Config for service offering. Service offerings can be created with purgeresources parameter which would allow purging resources immediately on expunge. Following new global settings have been added: - `expunged.resources.purge.enabled`: Default: false. Whether to run a background task to purge the DB records of the expunged resources. - `expunged.resources.purge.resources`: Default: (empty). A comma-separated list of resource types that will be considered by the background task to purge the DB records of the expunged resources. Currently only VirtualMachine is supported. An empty value will result in considering all resource types for purging. - `expunged.resources.purge.interval`: Default: 86400. Interval (in seconds) for the background task to purge the DB records of the expunged resources. - `expunged.resources.purge.delay`: Default: 300. Initial delay (in seconds) to start the background task to purge the DB records of the expunged resources task. - `expunged.resources.purge.batch.size`: Default: 50. Batch size to be used during purging of the DB records of the expunged resources. - `expunged.resources.purge.start.time`: Default: (empty). Start time to be used by the background task to purge the DB records of the expunged resources. Use format `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss`. - `expunged.resources.purge.keep.past.days`: Default: 30. 
The number of days in the past from the execution time of the background task to purge the DB records of the expunged resources for which the expunged resources must not be purged. To enable purging DB records of the expunged resource till the execution of the background task, set the value to zero. - `expunged.resource.purge.job.delay`: Default: 180. Delay (in seconds) to execute the purging of the DB records of an expunged resource initiated by the configuration in the offering. Minimum value should be 180 seconds and if a lower value is set then the minimum value will be used. Upstream PRs: https://github.com/apache/cloudstack/pull/8999 https://github.com/apache/cloudstack-documentation/pull/397 Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com> Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>	2024-06-19 12:59:50 +05:30
Suresh Kumar Anaparti	04091abc0d	User data content size validation, register managed user data using POST call from UI, and related code improvements (#361 ) * Validate user data with actual length, and some code improvements * Ignore if user data is not set (don't fail) * Validate user data after finalizing it * Updated registerUserData API using POST call from UI, to support user data up to 1048576 bytes * Apply suggestions from code review * Added logs for user data * Addressed review comments * Check user data length with base64 encoded data, and some code improvements	2024-06-19 12:54:32 +05:30
Rohit Yadav	5603bf9c1a	engine: optimise CPU and DB hotspot to return enabled hypervisors in the zone This refactors a ResourceManager::listAvailHypervisorInZone method that should return unique hypervisors for which existing hosts are Up and processed. We can approximate this by assuming that those hosts would have setup their hypervisor-specific systemvmtemplates. In a given environment there wouldn't be thousands of systemvmtemplates, but can have thousands of hosts. So, instead of scanning the entire cloud.host table, we can make a calculated guess by returning unique hypervisors of systemvm templates which are ready. This method was used in ::processConnect() when an agent joins, to speed up its handling. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2024-05-22 20:22:39 +05:30
Vishesh	2f4cea6dca	Fix message publish in transaction (#438 ) * Fix message publish in transaction * Resolve comments	2024-05-07 13:27:19 +05:30
Marcus Sorensen	f896586925	Update version to 4.18.1.1 (#417 ) * Update version to 4.18.1.1 * Update changelog * Update changelog * Update changelog --------- Co-authored-by: Marcus Sorensen <mls@apple.com>	2024-04-08 09:27:57 -06:00
Abhishek Kumar	996ae9a959	engine-storage: control download redirection Add a global setting to control whether redirection is allowed while downloading templates and volumes Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2024-04-01 09:23:17 +05:30
Wei Zhou	2b93886934	server: fix security issues caused by extraconfig on KVM - Move allow.additional.vm.configuration.list.kvm from Global to Account setting - Disallow VM details start with "extraconfig" when deploy VMs - Skip changes on VM details start with "extraconfig" when update VM settings - Allow only extraconfig for DPDK in service offering details - Check if extraconfig values in vm details are supported when start VMs - Check if extraconfig values in service offering details are supported when start VMs - Disallow add/edit/update VM setting for extraconfig on UI (cherry picked from commit e6e4fe16fb1ee428c3664b6b57384514e5a9252e) Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2024-03-31 22:02:26 +05:30
Harikrishna	747d1101c1	New API "checkVolume" to check and repair any leaks or repair all issues (#362 ) * Introduced a new API "checkVolumeAndRepair" that allows users or admins to check and repair if any leaks observed. Currently this is supported only for KVM * some fixes * Added unit tests * addressed review comments * add repair volume while granting access * Changed repair parameter to accept both leaks/all values * Introduced new global setting volume.check.and.repair.before.use to do volume check and repair before VM start or volume attach operations * Added volume check and repair changes only during VM start and volume attach operations * Refactored the names to look similar across the code * Some code fixes * remove unused code * Renamed repair values * Addressed review comments * code refactored * used volume name in logs * Changed the API to Async and the setting scope to storage pool * Fixed exit value handling with check volume command * Fixed storage scope to the setting * Fixed volume format issues * Refactored the log messages * Fix formatting	2024-02-29 14:40:40 +05:30
Vishesh	f30e07b312	Fix host stuck in connecting state (#375 ) * Fix host stuck in connecting state (#8502) There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs. #8438 (comment) #8433 (comment) #7344 (comment) While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout). On the agent side, it gets stuck waiting for a response from the Script execution. To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query. SELECT * FROM performance_schema.metadata_locks INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID WHERE PROCESSLIST_ID <> CONNECTION_ID() \G; This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released. * Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547) This PR fixes bug introduced in #8502. Timeout for script execution was set to 60 ms instead of 60s which resulted in host not getting UEFI enabled. This is a blocker for 4.19 release. We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status. We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds). For ModifyStoragePoolCommand, we don't externalize the timeout to avoid confusion for the user. 
Since the required timeout can vary depending on the provider in use and we are only setting the wait for default host listener for now. Instead, we reuse the global `wait` setting by dividing it by `5`, making the default value 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand. Note: the actual time the MS waits is twice the wait set for a Command. Check reference code below. `19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)` * fixup	2024-02-21 13:44:53 +05:30
Abhishek Kumar	6a9cdedda4	api,server,ui: tagged resource limits (#327 ) Introduces the concept of tagged resource limits. Limits can be enforced on accounts and domains for the deployment of entities for a tagged resource. Current tagged resource limits can be used for the following resource types, Host limits: user_vm, cpu, memory. Storage limits: volume, primary_storage. Following global settings can be used to specify tags for which limit needs to be enforced, Host: resource.limit.host.tags Storage: resource.limit.storage.tags Option for specifying tagged resource limits and viewing tagged resource usage are made available in the UI. Enhances use of templatetag for VM deployment and template creation Adds option to list disk offering with suitability flag for a virtualmachine. A new parameter named virtualmachineid has been added to the listDiskOfferings API which when passed returns suitableforvirtualmachine param in the response.	2024-02-07 17:35:15 +05:30
Suresh Kumar Anaparti	7fef155621	Remove sensitive params (VmPassword, etc) from VMWork log (#369 ) * Remove sensitive params (VmPassword, etc) from VMWork log * Added unit tests * review comments	2024-01-24 17:49:20 +05:30
Abhishek Kumar	160a13a029	userdata: fix append scenarios (#7741 ) Fixes case of appending userdata when both template and vm data are either shellscript or cloudconfig Fixes error when appending gzip userdata Fixes case when userdata manual text from VM is not getting decoded-encoded correctly. Fixes case of appending multipart data when both template and vm data contain same format types. Refactor - moved validateUserData method to UserDataManager class Refactor userdata test to check resultant multipart userdata thoroughly Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com> (cherry picked from commit 729e6d144655bd26e6453dcc01a7e6f0d5c8f50e) Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2023-09-28 12:48:46 +05:30
Nicolas Vazquez	20952b4842	Auto Enable/Disable KVM hosts (#7170 ) * Auto Enable Disable KVM hosts * Improve health check result * Fix corner cases * Script path refactor * Fix sonar cloud reports * Fix last code smells * Add marvin tests * Fix new line on agent.properties to prevent host add failures * Send alert on auto-enable-disable and add annotations when the setting is enabled * Address reviews * Add a reason for enabling or disabling a host when the automatic feature is enabled * Fix comment on the marvin test description * Fix for disabling the feature if the admin has manually updated the host resource state before any health check result (cherry picked from commit be66eb2a35bd6a5aae74f97a9140a9ec01f2a838) Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2023-09-15 13:15:59 +05:30
Wei Zhou	4bdff06acd	Updating pom.xml version numbers for release 4.18.1.0 Signed-off-by: Wei Zhou <weizhou@apache.org>	2023-09-07 08:50:50 +02:00
Wei Zhou	3c38ed7a65	server: allow user to list available IPs on shared networks (#7898 ) This fixes #7817	2023-08-24 21:42:01 +05:30
Vishesh	594c70dde0	Sync precommit config from main (#7732 ) Co-authored-by: John Bampton <jbampton@users.noreply.github.com> Co-authored-by: dahn <daan@onecht.net>	2023-07-07 11:18:16 +02:00
Harikrishna	b774ee5d11	vmware: Datastore cluster synchronization should check if the child datastores are in UP state or not (#7385 ) This fix ensures when datastore cluster in VMware is added as a primary storage pool in CloudStack then all the child datastores (which already exists in CS) should be in Up state. For example: 1. Datastore Cluster DS has two child datastores A and B in vCenter. (B is already added as a storage pool in CloudStack) 2. Now try to add datastore cluster DS into CloudStack as a primary storage pool 3. CloudStack tries to add child datastores A and B in CloudStack, since B is already there in CloudStack, it will reuse the existing storagepool entry and will keep under parent Storage pool DS. During Step 3 we are now checking if B is Up state or not.	2023-04-11 22:23:12 +05:30
Daan Hoogland	05cda2729f	Updating pom.xml version numbers for release 4.18.1.0-SNAPSHOT Signed-off-by: Daan Hoogland <daan@onecht.net>	2023-03-15 19:38:14 +01:00
Daan Hoogland	0574087284	Updating pom.xml version numbers for release 4.18.0.0 Signed-off-by: Daan Hoogland <daan@onecht.net>	2023-03-11 09:35:41 +01:00
David Jumani	c774b865c9	Tungsten integration (#7065 ) Co-authored-by: rtodirica <rtodirica@ena.com> Co-authored-by: Huy Le <huylm@unitech.vn> Co-authored-by: radu-todirica <Radu.Todirica@ness.com> Co-authored-by: Huy Le <minh.le@ext.ewerk.com> Co-authored-by: Simon Weller <siweller77@gmail.com> Co-authored-by: dahn <daan@onecht.net>	2023-02-01 09:19:53 +01:00
Suresh Kumar Anaparti	d8c7e34b38	Improve global settings UI to be more intuitive/logical (#5797 ) Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com> Co-authored-by: nvazquez <nicovazquez90@gmail.com> Co-authored-by: davidjumani <dj.davidjumani1994@gmail.com> Co-authored-by: dahn <daan.hoogland@gmail.com> Co-authored-by: dahn <daan@onecht.net>	2023-01-31 11:23:43 +01:00
Abhishek Kumar	3b6ce97097	infra: edge zones (#6840 ) Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com> Co-authored-by: dahn <daan@onecht.net>	2023-01-31 09:36:45 +01:00
slavkap	d288bb0c78	KVM support of iothreads and IO driver policy (#6909 )	2023-01-25 12:34:05 +01:00
John Bampton	52c321a0c6	Fix spelling (#7087 )	2023-01-16 10:56:07 +01:00
Pearl Dsilva	3044d63a8b	Configurable MTU for VR (#6426 ) Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2023-01-04 09:42:24 +01:00
Wei Zhou	889045fba5	new plugins: Add non-strict affinity groups (#6845 )	2022-12-20 15:09:52 +01:00
Wei Zhou	a63b2aba7a	VM Autoscaling with virtual router (#6571 )	2022-12-05 15:23:03 +01:00
João Jandre	14937e1adb	Fixed NPE on volume creation from snapshot (#6839 ) Co-authored-by: João Jandre <joao@scclouds.com.br>	2022-10-26 08:44:01 +02:00
Stephan Krug	b8d834e759	quota: Improves email configurations descriptions (#6806 ) The alert.email.addresses description is ambiguous and can cause doubts to operators. This description has been altered to avoid confusion. In addition, typos in alert.smtp.useStartTLS and project.smtp.useStartTLS have been fixed. Co-authored-by: Stephan Krug <stephan.krug@scclouds.com.br>	2022-10-08 11:59:55 +05:30
GaOrtiga	3889e46eb6	fix description of configuration `max.data.migration.wait.time` (#6749 ) Co-authored-by: Gabriel Ortiga Fernandes <gabriel.fernandes@scclouds.com.br>	2022-09-24 20:27:34 +02:00
Nicolas Vazquez	b2fbe7bb12	console: Console access enhancements (#6577 ) This PR creates a new API createConsoleAccess to create VM console URL allowing it to connect using other UI implementations. To avoid replay attacks, the console access is enhanced to use a one-time token per session. New configuration added: consoleproxy.extra.security.validation.enabled: Enable/disable extra security validation for console proxy using a token Documentation PR: apache/cloudstack-documentation#284	2022-09-14 12:39:59 +05:30
Abhishek Kumar	78b68fd7e6	api,server: custom dns for guest network (#6425 ) Adds option to provide custom DNS servers for isolated network, shared network and VPC tier. New API parameters added in createNetwork API along with the corresponding response parameters. Doc PR: apache/cloudstack-documentation#276	2022-09-10 13:05:40 +05:30
João Jandre	9c63c39371	Add new parameter to createLoadBalancerRule API (#6460 ) * Add new parameter to createLoadBalancerRule API * address review Co-authored-by: João Paraquetti <joao@scclouds.com.br>	2022-08-08 10:48:21 +02:00
dahn	731a83babf	add global setting to allow parallel execution on vmware (#6413 ) * add global setting to allow parallel execution on vmware * cleanup setting distribution for vmware.create.full.clone * query setting in vmware guru * don't touch other hypervisor's commands * guru hierarchy cleanup	2022-07-15 10:01:35 +02:00
Wei Zhou	ff7831d751	Merge remote-tracking branch 'apache/4.17'	2022-06-28 08:27:36 +02:00
Suresh Kumar Anaparti	c70bc9d69c	kvm: Updated PowerFlex/ScaleIO storage plugin to support separate (storage) network for Hosts(KVM)/Storage connection. (#6367 ) This PR enhances the existing PowerFlex/ScaleIO storage plugin to support separate (storage) network for Hosts(KVM)/Storage connection, mainly the SDC (ScaleIo Data Client) connection.	2022-06-27 14:42:51 +05:30
nvazquez	84eed6db72	Merge branch '4.17'	2022-06-10 08:28:41 -03:00
dahn	90a0ee0b6c	fix pseudo random behaviour in pool selection (#6307 ) * refactor and log trace * tracelogs * shuffle pools with real randomiser * single retrieval of async job context * some review comments addressed * Apply suggestions from code review Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com> * log formatting * integration test for distribution of volumes over storages * move test to smoke tests * imports * sonarcloud issue # AYCOmVntKzsfKlhz0HDh * spellos * review comments * review comments * sonarcloud issues * unittest * import * Update AbstractStoragePoolAllocatorTest.java Co-authored-by: Daan Hoogland <dahn@onecht.net> Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>	2022-06-10 08:06:23 -03:00
nvazquez	0bcc609f05	Updating pom.xml version numbers for release 4.18.0.0-SNAPSHOT Signed-off-by: nvazquez <nicovazquez90@gmail.com>	2022-06-06 12:25:35 -03:00
nvazquez	038a669d6b	Updating pom.xml version numbers for release 4.17.1.0-SNAPSHOT Signed-off-by: nvazquez <nicovazquez90@gmail.com>	2022-06-06 12:19:44 -03:00
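
Note on commit c2de75744e (kvm: Add support for cgroupv2): the two equations in that commit message can be worked through with a small sketch. This is only an illustration of the arithmetic as the message states it, not the CloudStack implementation. The function names, the use of MHz for speed, the integer (floor) division, and the 32-core host are assumptions made for the example.

```python
# Illustrative sketch of the share scaling described in commit c2de75744e.
# Names, units (MHz) and rounding are assumptions, not CloudStack code.

CGROUP_V2_UPPER_LIMIT = 10000  # cgroup v2 upper limit quoted in the commit message


def requested_shares(cores: int, speed_mhz: int) -> int:
    """Equation 1: shares = CPU * speed."""
    return cores * speed_mhz


def scaled_shares(vm_shares: int, host_max_shares: int,
                  upper_limit: int = CGROUP_V2_UPPER_LIMIT) -> int:
    """Equation 2: (VM requested shares * cgroup upper limit) / host max shares.

    Floor division is an assumption; the commit message does not state
    how the result is rounded.
    """
    if host_max_shares <= 0:
        raise ValueError("host_max_shares must be positive")
    return (vm_shares * upper_limit) // host_max_shares


# Offering from the commit message: 6 cores at 2 GHz (2000 MHz).
vm = requested_shares(6, 2000)

# Hypothetical host with 32 cores at a nominal 2000 MHz.
host = requested_shares(32, 2000)

print(vm, host, scaled_shares(vm, host))
```

With these inputs the unscaled value of 12000 exceeds the cgroup v2 limit, while the scaled value stays inside it as long as the VM does not request more than the host provides.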


327 Commits