1440 Commits

Author SHA1 Message Date
Abhishek Kumar
f6c16d9cfc refactor capacity calculation
Added capacity.calculate.workers config to control the number threads
that can be used to calculate capacities.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-21 17:00:56 +05:30
Abhishek Kumar
62ce43c2d5 agent: refactoring fixes for connection
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-21 13:10:09 +05:30
Abhishek Kumar
ad4c56a11f minor refactor for executor min-max
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-16 11:37:31 +05:30
Abhishek Kumar
053a19f551 refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-15 18:53:46 +05:30
Abhishek Kumar
79c257c454 minor change for ssl handshake pool factory
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-15 11:28:54 +05:30
Abhishek Kumar
d7bfa276b9 remove unused
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-11 16:06:37 +05:30
Abhishek Kumar
52c483d91d revert to constant backoff
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-11 12:19:50 +05:30
Abhishek Kumar
7ecc173187 changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-11 12:12:57 +05:30
Abhishek Kumar
3ebc6c3d1e agent ssl handshake timeout
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-09 17:19:18 +05:30
Abhishek Kumar
3e830252ab fix invalid range
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-08 17:36:55 +05:30
Abhishek Kumar
864afda935 backoff changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-08 16:03:31 +05:30
Abhishek Kumar
56d8ae371c minor test revert
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-07 16:36:03 +05:30
Abhishek Kumar
9afb25a969 fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-03 15:54:21 +05:30
Abhishek Kumar
f7f9249be5 fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-30 18:09:34 +05:30
Abhishek Kumar
d9269842d0 changes
- add timeout for CheckNetworkCommand
- use SychnonousQueue for ssl/tls handshake worker
- revert to TLSv1.2

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-30 17:40:42 +05:30
Abhishek Kumar
adae7c88b8 changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-27 13:00:38 +05:30
Abhishek Kumar
fa50740514 wip
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-26 12:10:38 +05:30
Abhishek Kumar
1652086ff9 wip changes for agent reconnection
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-25 18:33:55 +05:30
Abhishek Kumar
0ca8722c38 Merge remote-tracking branch 'apple/scalability-improvements' into scalability-improvements-fixes 2024-09-23 14:47:25 +05:30
Abhishek Kumar
df137fc387 refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-10 10:22:59 +05:30
Abhishek Kumar
61764aba1f cache and executors refactoring
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-09 19:39:50 +05:30
Abhishek Kumar
af53644a0b utils: add wrapper for the loading cache
Follow up for #9628
Creates a utility class LazyCache which currently wraps Caffeine library Cache class.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-05 13:29:22 +05:30
mprokopchuk
e0d6066935 Bumped pom version to 4.18.1.2 (to add migration SQL script) 2024-08-15 17:55:00 -07:00
Rohit Yadav
b46e4d4bbf
framework/cluster: improve cluster service and integration API service (#465)
- mTLS implementation for cluster service communication
- Listen only on the specified cluster node IP address instead of all interfaces
- Validate incoming cluster service requests are from peer management servers based on the server's certificate dns name which can be through global config - ca.framework.cert.management.custom.san
- Hardening of KVM command wrapper script execution
- Improve API server integration port check
- cloudstack-management.default: don't have JMX configuration if not needed. JMX is used for instrumentation; users who need to use it should enable it explicitly

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit 4f5561937cbfbdc722caf65b633d2dc57b44d31f)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-07-09 09:03:40 +05:30
Marcus Sorensen
f896586925
Update version to 4.18.1.1 (#417)
* Update version to 4.18.1.1

* Update changelog

* Update changelog

* Update changelog

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-08 09:27:57 -06:00
Abhishek Kumar
996ae9a959 engine-storage: control download redirection
Add a global setting to control whether redirection is allowed while
downloading templates and volumes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-04-01 09:23:17 +05:30
dahn
cfaac2a67e api: client verification in servlet
This introduces new global settings to handle how client address checks
are handled by the API layer:

proxy.header.verify: enables/disables checking of ipaddresses from a
                     proxy set header
proxy.header.names: a list of names to check for allowed ipaddresses
                    from a proxy set header.
proxy.cidr: a list of cidrs for which \"proxy.header.names\" are
            honoured if the \"Remote_Addr\" is in this list.

(cherry picked from commit b65546636d84a5790e0297b1b0ca8e5a67a48dbc)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-03-31 22:03:04 +05:30
Harikrishna
747d1101c1
New API "checkVolume" to check and repair any leaks or repair all issues (#362)
* Introduced a new API "checkVolumeAndRepair" that allows users or admins to check and repair if any leaks observed.
Currently this is supported only for KVM

* some fixes

* Added unit tests

* addressed review comments

* add repair volume while granting access

* Changed repair parameter to accept both leaks/all values

* Introduced new global setting volume.check.and.repair.before.use to do volume check and repair before VM start or volume attach operations

* Added volume check and repair changes only during VM start and volume attach operations

* Refactored the names to look similar  across the code

* Some code fixes

* remove unused code

* Renamed repair values

* Addressed review comments

* code refactored

* used volume name in logs

* Changed the API to Async and the setting scope to storage pool

* Fixed exit value handling with check volume command

* Fixed storage scope to the setting

* Fixed volume format issues

* Refactored the log messages

* Fix formatting
2024-02-29 14:40:40 +05:30
Vishesh
f30e07b312
Fix host stuck in connecting state (#375)
* Fix host stuck in connecting state (#8502)

There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)

While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout).

On the agent side, it gets stuck waiting for a response from the Script execution.

To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query.

SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;

This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.

* Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)

This PR fixes bug introduced in #8502. Timeout for script execution was set to 60 ms instead of 60s which resulted in host not getting UEFI enabled. This is a blocker for 4.19 release.

We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status.

We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds).

For ModifyStoragePoolCommand, we don't externalize the timeout to avoid confusion for the user. Since, the required timeout can vary depending on the provider in use and we are only setting the wait for default host listener for now. Instead, we reuse the global `wait` setting by dividing it by `5` making the default value of 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.

Note: the actual time, the MS waits is twice the wait set for a Command. Check reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)

* fixup
2024-02-21 13:44:53 +05:30
Abhishek Kumar
6f925f0022 kvm: fix direct download template size (#8093)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit ba24a18f27da7e727af384d914e758c887209153)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:11 +05:30
Wei Zhou
4bdff06acd Updating pom.xml version numbers for release 4.18.1.0
Signed-off-by: Wei Zhou <weizhou@apache.org>
2023-09-07 08:50:50 +02:00
Nicolas Vazquez
57c61fb33c
Fix direct download https compressed qcow2 template checker (#7932)
This PR fixes an issue on direct download while registering HTTPS compressed files
Fixes: #7929
2023-09-01 08:16:03 +02:00
Wei Zhou
c8d6e50539
VMware: add support for 8.0b (8.0.0.2), 8.0c (8.0.0.3) (#7380)
* VMware: add support for 8.0b (8.0.0.2)

* VMware 8: add new guest os mappings in VirtualMachineGuestOsIdentifier

The full list can be found at https://developer.vmware.com/apis/1355/vsphere

* VMware: get guest os mappings of parent version

* VMware8: remove guest os mappings for 8.0.0.2

* VMware8: fix code smells

* vmware: remove annotations in VmwareVmImplementerTest which caused 0.0% code coverage

* VMware8: add a unit test case

* VMware: add support for 8.0c (8.0.0.3)

* VMware8: move to CloudStackVersion.getVMwareParentVersion

* VMware: add support for 8.0u1 (8.0.1.0)

* Copy engine/schema/src/main/java/com/cloud/upgrade/GuestOsMapper.java from PR 6979

* Copy engine/schema/src/main/java/com/cloud/storage/dao/GuestOSHypervisorDao.java from PR 6979

* VMware: ignore the last number in VMware versions

* VMware: copy guest os mapping from 8.0 to 8.0.1

* VMware: add unit tests in VmwareVmImplementerTest.java

* Copy engine/schema/src/test/java/com/cloud/upgrade/GuestOsMapperTest.java from PR 6979

* VMware8: retry vm poweron if fails due to exception "File system specific implementation of Ioctl[file] failed"

This fixes a weird issue on vmware8. When power on a vm, sometimes it fails due to error

2023-04-27 07:04:43,207 ERROR [c.c.h.v.r.VmwareResource] (DirectAgent-442:ctx-cdd42b03 10.0.32.133, job-105/job-106, cmd: StartCommand) (logid:8a24a607) StartCommand failed due to [Exception: java.lang.RuntimeException
Message: File system specific implementation of Ioctl[file] failed
].
java.lang.RuntimeException: File system specific implementation of Ioctl[file] failed
        at com.cloud.hypervisor.vmware.util.VmwareClient.waitForTask(VmwareClient.java:426)
        at com.cloud.hypervisor.vmware.mo.VirtualMachineMO.powerOn(VirtualMachineMO.java:288)

in vmware.log on ESXi host, it shows

2023-04-27T09:20:41.713Z In(05)+ vmx - Power on failure messages: File system specific implementation of Ioctl[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of Ioctl[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of Ioctl[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of Ioctl[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - Failed to lock the file
2023-04-27T09:20:41.713Z In(05)+ vmx - Cannot open the disk '/vmfs/volumes/7b29c876-ac102328/i-2-167-VM/ROOT-167.vmdk' or one of the snapshot disks it depends on.
2023-04-27T09:20:41.713Z In(05)+ vmx - Module 'Disk' power on failed.
2023-04-27T09:20:41.713Z In(05)+ vmx - Failed to start the virtual machine.

There is a KB article for it, but I still do not know why and how to fix it.
https://kb.vmware.com/s/article/1004232

* VMware: extract to method powerOnVM

* vmware: fix mistake in logs

* vmware8: use curl instead of wget to fix test failures

Traceback (most recent call last):
  File "/root/test_internal_lb.py", line 555, in test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80
    self.execute_internallb_roundrobin_tests(vpc_offering)
  File "/root/test_internal_lb.py", line 641, in execute_internallb_roundrobin_tests
    client_vm, applb.sourceipaddress, max_http_requests)
  File "/root/test_internal_lb.py", line 497, in run_ssh_test_accross_hosts
    (e, clienthost.public_ip))
AssertionError: list index out of range: SSH failed for VM with IP Address: 10.0.52.187

and

sshClient: DEBUG: {Cmd: /usr/bin/wget -T3 -qO- --user=admin --password=password http://10.1.2.253:8081/admin?stats via Host: 10.0.52.188} {returns: ["/usr/bin/wget: '/usr/lib/libpcre.so.1' is not an ELF file", "/usr/bin/wget: can't load library 'libpcre.so.1'"]}

* VMware: correct guest OS names in hypervisor mappings for VMware 8.0

el9 and variants were introduced by https://github.com/apache/cloudstack/pull/7059
they are supported with guest os identifiers since VMware 8.0

see https://vdc-repo.vmware.com/vmwb-repository/dcr-public/c476b64b-c93c-4b21-9d76-be14da0148f9/04ca12ad-59b9-4e1c-8232-fd3d4276e52c/SDK/vsphere-ws/docs/ReferenceGuide/vim.vm.GuestOsDescriptor.GuestOsIdentifier.html

* VMware: add Ubuntu 20.04 and 22.04 support for vmware 7.0+

* PR7380: only add guest os mappings for Ubuntu 20.04

* PR7380: Correct RHEL9 guest os names and others for VMware 8.0

* PR7380: correct guest os names on 8.0.0.1 as well

* PR7380: remove Windows 12 and Windows Server 2025 which are not released yet
2023-08-17 10:42:42 +02:00
Wei Zhou
90baae3dcd
utils: fix RBD URI if credentials contains slash (#7708) 2023-07-24 08:29:01 +02:00
Vishesh
594c70dde0
Sync precommit config from main (#7732)
Co-authored-by: John Bampton <jbampton@users.noreply.github.com>
Co-authored-by: dahn <daan@onecht.net>
2023-07-07 11:18:16 +02:00
Nicolas Vazquez
c733a23c90
Fix direct download URL checks (#7693)
This PR fixes the URL check for direct downloads, in the case of HTTPS URLs the certificates were not loaded into the SSL context
2023-07-06 13:47:13 +05:30
Wei Zhou
3e04779f60
console proxy: use AeadBase64Encryptor instead of AES/CBC/PKCS5Padding (#7237) 2023-07-05 11:01:32 +02:00
Abhishek Kumar
658daef715
utils: fix check for mrtalink url (#7636)
Fixes incorrect condition while checking metalink URLs for accesibility
2023-06-19 12:23:22 +05:30
Abhishek Kumar
30998d0ab7
server: fix userdatadetails parsing (#7328)
Fixes the case when userdata variable value contains '=' sign. This PR considers everything before occurrence of first '=' sign as key and remaining string as value.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2023-04-04 17:01:04 +05:30
Daan Hoogland
05cda2729f Updating pom.xml version numbers for release 4.18.1.0-SNAPSHOT
Signed-off-by: Daan Hoogland <daan@onecht.net>
2023-03-15 19:38:14 +01:00
Daan Hoogland
0574087284 Updating pom.xml version numbers for release 4.18.0.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2023-03-11 09:35:41 +01:00
Harikrishna
a3feccf70c
User two factor authentication (#6924)
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-02-13 09:14:17 +01:00
Wei Zhou
62e342c1bc
utils,framework/db: Introduce new database encryption cipher based on AesGcmJce (#7003) 2023-02-02 16:25:49 +01:00
David Jumani
c774b865c9
Tungsten integration (#7065)
Co-authored-by: rtodirica <rtodirica@ena.com>
Co-authored-by: Huy Le <huylm@unitech.vn>
Co-authored-by: radu-todirica <Radu.Todirica@ness.com>
Co-authored-by: Huy Le <minh.le@ext.ewerk.com>
Co-authored-by: Simon Weller <siweller77@gmail.com>
Co-authored-by: dahn <daan@onecht.net>
2023-02-01 09:19:53 +01:00
João Jandre
61a722548f
Create API to reassign volume (#6938) 2023-01-27 11:10:56 +01:00
John Bampton
d74f64a2e1
Use lowercase HTTP header field names so we are compatible with HTTP/2 (#7006) 2023-01-23 11:17:54 +01:00
SadiJr
d04d60b079
[VMWare] Limit IOPS in Compute/Disk Offerings (#6386) 2023-01-17 14:41:56 +01:00
Daan Hoogland
16ec8105e4 Merge release branch 4.17 to main
* 4.17:
  utils: fix human-readable parsing failures (#7008)
2023-01-05 10:15:50 +01:00
Abhishek Kumar
89d4c7537f
utils: fix human-readable parsing failures (#7008)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2023-01-04 11:34:34 +01:00
Abhishek Kumar
194b0b4610 Merge remote-tracking branch 'apache/4.17' into main 2022-12-30 16:27:43 +05:30