662 Commits

Author SHA1 Message Date
Abhishek Kumar
bb5a2966b1 restore missing log
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-22 18:37:28 +05:30
Abhishek Kumar
16a541cd71 changes for timertask to runnable for agent self tasks
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-21 15:01:23 +05:30
Abhishek Kumar
62ce43c2d5 agent: refactoring fixes for connection
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-21 13:10:09 +05:30
Abhishek Kumar
c33aa96025 refactor, improve startuptask
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-17 19:30:26 +05:30
Abhishek Kumar
290f7a944f agent: allow not to exit on failure if serverresource has flag
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-17 19:29:25 +05:30
Abhishek Kumar
053a19f551 refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-15 18:53:46 +05:30
Abhishek Kumar
a932a9c2bb refactor agent hostname retrieval
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-15 11:28:35 +05:30
Abhishek Kumar
42111d8e83 fix EOF
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-11 12:30:57 +05:30
Abhishek Kumar
52c483d91d revert to constant backoff
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-11 12:19:50 +05:30
Abhishek Kumar
3ebc6c3d1e agent ssl handshake timeout
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-09 17:19:18 +05:30
Abhishek Kumar
a74403bd97 revert logging changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-04 16:49:39 +05:30
Abhishek Kumar
adae7c88b8 changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-27 13:00:38 +05:30
Abhishek Kumar
fa50740514 wip
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-26 12:10:38 +05:30
Abhishek Kumar
1652086ff9 wip changes for agent reconnection
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-25 18:33:55 +05:30
mprokopchuk
e0d6066935 Bumped pom version to 4.18.1.2 (to add migration SQL script) 2024-08-15 17:55:00 -07:00
Vishesh
c6d35b31ca
Log stdout to a file (#399)
* Log stdout to a file

* Add logrotation

* Fixup permissions in log file

* Remove info logs in stdout

* Change output file names

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* Fix logrotate config

* Disable logging to stdout

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-07-03 20:51:20 +05:30
Wei Zhou
e065c93c3f
Apple FR76: Implicit host tags (#427)
* Merge two HostTagVO and HostTagDaoImpl

* Apple FR76: dynamic host tags

* Revert "Apple FR76: dynamic host tags"

This reverts commit 01b93a873f167018c4fafd0744c0de07ae4de4ed.

* Apple FR76: Implicit host tags

* Apple FR76: address Abhishek's comments

* Apple FR76: move updateImplicitTags

* Apple FR76: add since to other two responses

* Update 8929: add unit test in LibvirtComputingResourceTest

* Update variable names

* Update FR76: add explicithosttags in response

* Update FR76 UI: Update explicit host tags

* Update 8929: remove host tags and change labels on UI

* Update: ui polish for host tags

* fix since in responses

* Update 8929: fix UI error if no host tags
2024-05-30 17:20:37 +05:30
Marcus Sorensen
f896586925
Update version to 4.18.1.1 (#417)
* Update version to 4.18.1.1

* Update changelog

* Update changelog

* Update changelog

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-08 09:27:57 -06:00
Marcus Sorensen
ac4b030759
Mark libvirt events experimental, add properties flag (#404)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-03 08:03:26 -06:00
Vishesh
f30e07b312
Fix host stuck in connecting state (#375)
* Fix host stuck in connecting state (#8502)

There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)

While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout).

On the agent side, it gets stuck waiting for a response from the Script execution.

To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query.

SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;

This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.

* Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)

This PR fixes bug introduced in #8502. Timeout for script execution was set to 60 ms instead of 60s which resulted in host not getting UEFI enabled. This is a blocker for 4.19 release.

We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status.

We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds).

For ModifyStoragePoolCommand, we don't externalize the timeout to avoid confusion for the user. Since, the required timeout can vary depending on the provider in use and we are only setting the wait for default host listener for now. Instead, we reuse the global `wait` setting by dividing it by `5` making the default value of 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.

Note: the actual time, the MS waits is twice the wait set for a Command. Check reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)

* fixup
2024-02-21 13:44:53 +05:30
Marcus Sorensen
f49265c14c
Fix missing code from backport of 4.16 version of dom0 CPU reserve (#374)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-02-05 11:53:45 +05:30
Vishesh
a7c7a33131
Apple base418 agent lock during reconnect (#340)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-11-03 16:56:15 +01:00
Marcus Sorensen
c01ed90569 Trigger out of band VM state update via libvirt event when VM stops (#7963)
* Trigger out of band VM state update via libvirt event when VM stops

* Add License headers, refactor nested try

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
(cherry picked from commit 3694667f5074018d6df58fac5e6a2803204f295b)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-09-28 12:22:15 +05:30
Marcus Sorensen
33a5c159a3 KVM Agent config to reserve dom0 CPUs (#326)
Cherry-picked from 1a722edfce6234319c5eaecf3cd0076fa9690267

Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-09-27 17:15:40 +05:30
Nicolas Vazquez
20952b4842 Auto Enable/Disable KVM hosts (#7170)
* Auto Enable Disable KVM hosts

* Improve health check result

* Fix corner cases

* Script path refactor

* Fix sonar cloud reports

* Fix last code smells

* Add marvin tests

* Fix new line on agent.properties to prevent host add failures

* Send alert on auto-enable-disable and add annotations when the setting is enabled

* Address reviews

* Add a reason for enabling or disabling a host when the automatic feature is enabled

* Fix comment on the marvin test description

* Fix for disabling the feature if the admin has manually updated the host resource state before any health check result

(cherry picked from commit be66eb2a35bd6a5aae74f97a9140a9ec01f2a838)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-09-15 13:15:59 +05:30
Wei Zhou
4bdff06acd Updating pom.xml version numbers for release 4.18.1.0
Signed-off-by: Wei Zhou <weizhou@apache.org>
2023-09-07 08:50:50 +02:00
dahn
2752c49fa7
agent: get the right controll cidr (#7580)
Fixes: #7574
2023-07-07 22:57:58 +05:30
Nicolas Vazquez
c733a23c90
Fix direct download URL checks (#7693)
This PR fixes the URL check for direct downloads, in the case of HTTPS URLs the certificates were not loaded into the SSL context
2023-07-06 13:47:13 +05:30
Marcus Sorensen
bdd5363314
Qemu migration hook: check for source length before using element 0 (#7482)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-05-08 11:59:42 +05:30
Daan Hoogland
05cda2729f Updating pom.xml version numbers for release 4.18.1.0-SNAPSHOT
Signed-off-by: Daan Hoogland <daan@onecht.net>
2023-03-15 19:38:14 +01:00
Daan Hoogland
0574087284 Updating pom.xml version numbers for release 4.18.0.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2023-03-11 09:35:41 +01:00
Nicolas Vazquez
eac357cb77
kvm: Secure KVM VNC Console Access Using the CA Framework (#7015)
This PR allows securing the console access through CloudStack to the virtual machines running on KVM. The secure access is achieved through the generated certificates for the CA Framework in CloudStack, that provides mutual TLS connections between agents. These certificates are used to also secure the connection between the console proxies and the VNC ports for VM console access.

This feature is only supported on the KVM hypervisor

Design Document: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Secure+KVM+VNC+connection+using+the+CA+framework
2023-01-27 17:22:06 +05:30
slavkap
d288bb0c78
KVM support of iothreads and IO driver policy (#6909) 2023-01-25 12:34:05 +01:00
John Bampton
52c321a0c6
Fix spelling (#7087) 2023-01-16 10:56:07 +01:00
Paula Oliveira
0fe2e6950e
Improving code related to the Agent properties (#6348)
Co-authored-by: Paula Zomignani Oliveira <paula@scclouds.com.br>
Co-authored-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
2022-12-22 12:00:49 +01:00
José Flauzino
1843632c24
Fix memory stats for KVM (#6358)
Co-authored-by: joseflauzino <jose@scclouds.com.br>
2022-11-09 18:00:12 +01:00
Nicolas Vazquez
b2fbe7bb12
console: Console access enhancements (#6577)
This PR creates a new API createConsoleAccess to create VM console URL allowing it to connect using other UI implementations. To avoid reply attacks, the console access is enhanced to use a one time token per session

New configuration added:
consoleproxy.extra.security.validation.enabled: Enable/disable extra security validation for console proxy using a token

Documentation PR: apache/cloudstack-documentation#284
2022-09-14 12:39:59 +05:30
Daan Hoogland
a470f3353a Merge branch '4.17' 2022-07-05 09:11:45 +02:00
John Bampton
7d23a0a759
Fix spelling (#6272) 2022-07-05 09:08:53 +02:00
nvazquez
0bcc609f05
Updating pom.xml version numbers for release 4.18.0.0-SNAPSHOT
Signed-off-by: nvazquez <nicovazquez90@gmail.com>
2022-06-06 12:25:35 -03:00
nvazquez
038a669d6b
Updating pom.xml version numbers for release 4.17.1.0-SNAPSHOT
Signed-off-by: nvazquez <nicovazquez90@gmail.com>
2022-06-06 12:19:44 -03:00
nvazquez
c56220fcf2
Updating pom.xml version numbers for release 4.17.0.0
Signed-off-by: nvazquez <nicovazquez90@gmail.com>
2022-05-31 14:33:47 -03:00
nvazquez
96594aec28
Merge branch '4.16' 2022-05-23 08:16:52 -03:00
Nicolas Vazquez
dc975dff95
[KVM] Enable IOURING only when it is available on the host (#6399)
* [KVM] Disable IOURING by default on agents

* Refactor

* Remove agent property for iouring

* Restore property

* Refactor suse check and enable on ubuntu by default

* Refactor irrespective of guest OS

* Improvement

* Logs and new path

* Refactor condition to enable iouring

* Improve condition

* Refactor property check

* Improvement

* Doc comment

* Extend comment

* Move method

* Add log
2022-05-23 08:11:14 -03:00
Wei Zhou
8f39a049bb
agent: enable ssl only for kvm agent (not in system vms) (#6371)
* agent: enable ssl only for kvm agent (not in system vms)

* Revert "agent: enable ssl only for kvm agent (not in system vms)"

This reverts commit b2d76bad2e9455384c4ac34cee6763014e255eb6.

* Revert "KVM: Enable SSL if keystore exists (#6200)"

This reverts commit 4525f8c8e7ffecf50eff586ccfbc3d498f1b8021.

* KVM: Enable SSL if keystore exists in LibvirtComputingResource.java
2022-05-12 07:01:55 -03:00
Wei Zhou
4525f8c8e7
KVM: Enable SSL if keystore exists (#6200)
* KVM: Enable SSL if keystore exists

* Update #6200: add logs if no passphrase or no keystore
2022-04-22 11:51:21 -03:00
Pearl Dsilva
830f3061bc
SystemVM optimizations (#5831)
* Support for live patching systemVMs and deprecating systemVM.iso. Includes:
- fix systemVM template version
- Include agent.zip, cloud-scripts.tgz to the commons package
- Support for live-patching systemVMs - CPVM, SSVM, Routers
- Fix Unit test
- Remove systemvm.iso dependency

* The following commit:
- refactors logic added to support SystemVM deployment on KVM
- Adds support to copy specific files (required for patching) to the hosts on Xenserver
- Modifies vmops method - createFileInDomr to take cleanup param
- Adds configuratble sleep param to CitrixResourceBase::connect() used to verify if telnet to specifc port is possible (if sleep is 0, then default to _sleep = 10000ms)
- Adds Command/Answer for patch systemVMs on XenServer/Xcp

* - Support to patch SystemVMs - VMWare
- Remove attaching systemvm.iso to systemVMs
- Modify / Refactor VMware start command to copy patch related files to the systemvms
- cleanup

* Commit comprises of:
- remove docker from systemvm template - use containerd as container runtime
- update create-k8s-binaries script to use ctr for all docker operations
- Update userdata sent to the k8s nodes
- update cksnode script, run during patching of the cks/k8s nodes

* Add ssh to k8s nodes details in the Access tab on the UI

* test

* Refactor ca/cert patching logic

* Commit comprises of the following changes:
- Use restart network/VPC API to patch routers
- use livePatch API support patching of only cpvm/ssvm
- add timeout to the keystore setup/import script

* remove all references of systemvm.iso

* Fix keystore-cert-import invocation + refactor cert timeout in CP/SS VMs

* fix script timeout

* Refactor cert patching for systemVMs + update keystore-cert-import script + patch-sysvms script + remove patchSysvmCommand from networkelementcommand

* remove commented code + change core user to cloud for cks nodes

* Update ownership of ssh directory

* NEED TO DISCUSS - add on the fly template conversion as an ExecStartPre action (systemd)

* Add UI changes + move changes from patch file to runcmd

* test: validate performance for template modification during seeding

* create vms folder in cloudstack-commons directory - debian rules

* remove logic for on the fly template convert + update k8s test

* fix syntax issue - causing issue with shared network tests

* Code cleanup

* refactor patching logic - certs

* move logic of fixing rootdiskcontroller from upgrade to kubernetes service

* add livepatch option to restart network & vpc

* smooth upgrade of cks clusters

* Support for live patching systemVMs and deprecating systemVM.iso. Includes:
- fix systemVM template version
- Include agent.zip, cloud-scripts.tgz to the commons package
- Support for live-patching systemVMs - CPVM, SSVM, Routers
- Fix Unit test
- Remove systemvm.iso dependency

* The following commit:
- refactors logic added to support SystemVM deployment on KVM
- Adds support to copy specific files (required for patching) to the hosts on Xenserver
- Modifies vmops method - createFileInDomr to take cleanup param
- Adds configuratble sleep param to CitrixResourceBase::connect() used to verify if telnet to specifc port is possible (if sleep is 0, then default to _sleep = 10000ms)
- Adds Command/Answer for patch systemVMs on XenServer/Xcp

* - Support to patch SystemVMs - VMWare
- Remove attaching systemvm.iso to systemVMs
- Modify / Refactor VMware start command to copy patch related files to the systemvms
- cleanup

* Commit comprises of:
- remove docker from systemvm template - use containerd as container runtime
- update create-k8s-binaries script to use ctr for all docker operations
- Update userdata sent to the k8s nodes
- update cksnode script, run during patching of the cks/k8s nodes

* Add ssh to k8s nodes details in the Access tab on the UI

* test

* Refactor ca/cert patching logic

* Commit comprises of the following changes:
- Use restart network/VPC API to patch routers
- use livePatch API support patching of only cpvm/ssvm
- add timeout to the keystore setup/import script

* remove all references of systemvm.iso

* Fix keystore-cert-import invocation + refactor cert timeout in CP/SS VMs

* fix script timeout

* Refactor cert patching for systemVMs + update keystore-cert-import script + patch-sysvms script + remove patchSysvmCommand from networkelementcommand

* remove commented code + change core user to cloud for cks nodes

* Update ownership of ssh directory

* NEED TO DISCUSS - add on the fly template conversion as an ExecStartPre action (systemd)

* Add UI changes + move changes from patch file to runcmd

* test: validate performance for template modification during seeding

* create vms folder in cloudstack-commons directory - debian rules

* remove logic for on the fly template convert + update k8s test

* fix syntax issue - causing issue with shared network tests

* Code cleanup

* add cgroup config for containerd

* add systemd config for kubelet

* add additional info during image registry config

* address comments

* add temp links of download.cloudstack.org

* address part of the comments

* address comments

* update containerd config - as version has upgraded to 1.5 from 1.4.12 in 4.17.0

* address comments - simplify

* fix vue3 related icon changes

* allow network commands when router template version is lower but is patched

* add internal LB to the list of routers to be patched on network restart with live patch

* add unit tests for API param validations and new helper utilities - file scp & checksum validations

* perform patching only for non-user i.e., system VMs

* add test to validate params

* remove unused import

* add column to domain_router to display software version and support networkrestart with livePatch from router view

* Requires upgrade column to consider package (cloud-scripts) checksum to identify if true/false

* use router software version instead of checksum

* show N/A if no software version reported i.e., in upgraded envs

* fix deb failure

* update pom to official links of systemVM template
2022-04-21 13:40:19 -03:00
nvazquez
65dc2df896
Merge branch '4.16' 2022-04-13 08:45:48 -03:00
slavkap
42a92dcdd3
Extract the IO_URING configuration into the agent.properties (#6253)
When using advanced virtualization the IO Driver is not supported. The
admin will decide if want to enable/disable this configuration from
agent.properties file. The default value is true
2022-04-13 08:43:35 -03:00
Nicolas Vazquez
2f2d6cbe38
KVM: Enhance CPU speed detection on hosts (#6175)
* KVM: Enhance CPU speed detection on hosts

* Update agent/conf/agent.properties

Co-authored-by: dahn <daan.hoogland@gmail.com>

Co-authored-by: dahn <daan.hoogland@gmail.com>
2022-04-01 12:12:34 -03:00