cloudstack

mirror of https://github.com/apache/cloudstack.git synced 2025-10-26 08:42:29 +01:00

Author	SHA1	Message	Date
nvazquez	b73f634ea6	Merge branch '4.19'	2024-08-06 12:39:13 -03:00
João Jandre	9033ab709e	Fix snapshot chain being deleted on XenServer (#9447 ) Using XenServer as the hypervisor, when deleting a snapshot that has a parent, that parent will also get erased on storage, causing data loss. This behavior was introduced with #7873, where the list of snapshot states that can be deleted was changed to add BackedUp snapshots. This PR changes the states list back to the original list, and swaps the while loop for a do while loop to account for the changes in #7873. Fixes #9446	2024-08-01 17:33:04 +05:30
Suresh Kumar Anaparti	3faf7cd2f1	Updating pom.xml version numbers for release 4.19.2.0-SNAPSHOT Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>	2024-07-19 10:29:26 +05:30
Suresh Kumar Anaparti	9f4c895974	Updating pom.xml version numbers for release 4.19.1.0 Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>	2024-07-15 17:19:29 +05:30
John Bampton	c923e673cf	pre-commit: add `XML` files to the `trailing-whitespace` check (#9131 )	2024-07-12 09:42:54 +02:00
Rohit Yadav	cea4801be1	Merge remote-tracking branch 'origin/4.19' Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2024-07-10 15:57:06 +05:30
Rene Glover	32cc1d46a5	Copy on pool host when storage pool has ScopeType.HOST (#9356 )	2024-07-10 12:30:47 +05:30
Vishesh	7c32bd2506	Fixup main build errors (#9330 ) * Fixup main build errors * Fixup flaky test * Address comments	2024-07-04 13:00:37 +05:30
Vishesh	0ec7c72875	Merge branch '4.19'	2024-07-01 12:41:45 +05:30
Abhisar Sinha	063dc60114	Change storage pool scope from Cluster to Zone and vice versa (#8875 ) * New feature: Change storage pool scope * Added checks for Ceph/RBD * Update op_host_capacity table on primary storage scope change * Storage pool scope change integration test * pull 8875 : Addressed review comments * Pull 8875: remove storage checks, AbstractPrimayStorageLifeCycleImpl class * Pull 8875: Fixed integration test failure * Pull 8875: Review comments * Pull 8875: review comments + broke changeStoragePoolScope into smaller functions * Added UT for changeStoragePoolScope * Rename AbstractPrimaryDataStoreLifeCycleImpl to BasePrimaryDataStoreLifeCycleImpl * Pull 8875: Dao review comments * Pull 8875: Rename changeStoragePoolScope.vue to ChangeStoragePoolScope.vue * Pull 8875: Created a new smokes test file + A single warning msg in ui * Pull 8875: Added cleanup in test_primary_storage_scope.py * Pull 8875: Type in en.json * Pull 8875: cleanup array in test_primary_storage_scope.py * Pull:8875 Removing extra whitespace at eof of StorageManagerImplTest * Pull 8875: Added UT for PrimaryDataStoreHelper and BasePrimaryDataStoreLifeCycleImpl * Pull 8875: Added license header * Pull 8875: Fixed sql query for vmstates * Pull 8875: Changed icon plus info on disabled mode in apidoc * Pull 8875: Change scope should not work for local storage * Pull 8875: Change scope completion event * Pull 8875: Added api findAffectedVmsForStorageScopeChange * Pull 8875: Added UT for findAffectedVmsForStorageScopeChange and removed listByPoolIdVMStatesNotInCluster * Pull 8875: Review comments + Vm name in response * Pull 8875: listByVmsNotInClusterUsingPool was returning duplicate VM entries because of multiple volumes in the VM satisfying the criteria * Pull 8875: fixed listAffectedVmsForStorageScopeChange UT * listAffectedVmsForStorageScopeChange should work if the pool is not disabled * Fix listAffectedVmsForStorageScopeChangeTest UT * Pull 8875: add volume.removed not null check in VmsNotInClusterUsingPool query * Pull 8875: minor refactoring in changeStoragePoolScopeToCluster * Update server/src/main/java/com/cloud/storage/StorageManagerImpl.java * fix eof * changeStoragePoolScopeToZone should connect pool to all Up hosts Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>	2024-06-29 10:03:34 +05:30
Vishesh	90fe1d5fdc	Merge branch '4.19'	2024-06-29 03:35:24 +05:30
Vishesh	bcbf152a05	Merge branch '4.18' into 4.19	2024-06-28 20:14:21 +05:30
dahn	6b25ed7a02	prevent an NPE on an uninitialised TemplateObject (#8898 ) * prevent an NPE on an uninitialised TemplateObject * move npe handler up-stack * Update engine/storage/image/src/main/java/org/apache/cloudstack/storage/image/store/TemplateObject.java * catch yet one level up * Update engine/orchestration/src/main/java/org/apache/cloudstack/engine/orchestration/VolumeOrchestrator.java * Update engine/storage/image/src/main/java/org/apache/cloudstack/storage/image/store/TemplateObject.java * extra guard * Revert "prevent an NPE on an uninitialised TemplateObject" This reverts commit e602a65ea62e4707828483a4ddea288d81ff06f5.	2024-06-26 21:02:08 +05:30
slavkap	6c06e85c80	Temporarily backup StorPool volume before expunge (#8843 ) * Temporarily backup StorPool volume before expunge Sometimes users delete volumes by mistake. This enhancement provides a way to back up the volume before it is deleted. The user will be able to see the snapshot in the CloudStack UI/CLI and create only a volume from it. A task checks (by default every 5 minutes) whether the snapshots have been deleted from StorPool. Global settings to enable the delayed delete option: `storpool.delete.after.interval` - The interval (in seconds) after which the StorPool snapshot will be deleted `storpool.list.snapshots.delete.after.interval` - The interval (in seconds) to fetch the StorPool snapshots with deleteAfter flag Minor fix when deleting snapshots * added Apache licence * addressed comments	2024-06-26 13:58:04 +05:30
Abhisar Sinha	4eb43651e2	Ability to specify NFS mount options while adding a primary storage and modify them on a pre-existing primary storage (#8947 ) * Ability to specify NFS mount options while adding a primary storage and modify it later * Pull 8947: Rename all occurrences of nfsopt to nfsMountOpt and added nfsMountOpts to ApiConstants * Pull 8947: Refactor code - move into separate methods * Pull 8947: CollectionsUtils.isNotEmpty and switch statement in LibvirtStoragePoolDef.java * Pull 8947: UI - cancel maintenance will remount the storage pool and apply the options * Pull 8947: UI - moved edit NFS mount options to edit Primary Storage form * Pull 8947: UI - moved 'NFS Mount Options' to below 'Type' in dataview * Pull 8947: Fixed message in AddPrimaryStorage.vue * Pull 8947: Convert _nfsmountOpts to Set in libvirtStoragePoolDef * Pull 8947: Throw exception and log error if mount fails due to incorrect mount option * Pull 8947: Added UT and moved integration test to component/maint * Pull 8947: Review comments * Pull 8947: Removed password from integration test * Pull 8947: move details allocation to inside the if loop in getStoragePoolNFSMountOpts * Pull 8947: Fixed a bug in AddPrimaryStorage.vue * Pull 8947: Pool should remain in maintenance mode if mount fails * Pull 8947: Removed password from integration test * Pull 8947: Added UT * Pull 8875: Fixed a bug in CloudStackPrimaryDataStoreLifeCycleImplTest * Pull 8875: Fixed a bug in LibvirtStoragePoolDefTest * Pull 8947: minor code restructuring * Pull 8947 : added some ut for coverage * Fix LibvirtStorageAdapterTest UT	2024-06-25 23:45:35 +05:30
Vishesh	3923f80c22	Merge branch '4.19'	2024-06-25 18:53:57 +05:30
Suresh Kumar Anaparti	620ed164d8	VMware: Improve error messaging / logs when starting non-user VMs, and secondary storage not available or doesn't have enough capacity (#9207 )	2024-06-25 12:25:42 +05:30
Rene Glover	6ee6603359	Updates to HPE-Primera and Pure FlashArray Drivers to use Host-based VLUN Assignments (#8889 ) * Updates to change Pure and Primera to host-centric vlun assignments; various small bug fixes * update to add timestamp when deleting pure volumes to avoid future conflicts * update to migrate to properly check disk offering is valid for the target storage pool * Updates to change Pure and Primera to host-centric vlun assignments; various small bug fixes * update to add timestamp when deleting pure volumes to avoid future conflicts * update to migrate to properly check disk offering is valid for the target storage pool * improve error handling when copying volumes to add precision to which step failed * rename pure volume before delete to avoid conflicts if the same name is used before it's expunged on the array * remove dead code in AdaptiveDataStoreLifeCycleImpl.java * Fix issues found in PR checks * fix session refresh TTL logic * updates from PR comments * logic to delete by path ONLY on supported OUI * fix to StorageSystemDataMotionStrategy compile error * change noisy debug message to trace message * fix double callback call in handleVolumeMigrationFromNonManagedStorageToManagedStorage * fix for flash array delete error * fix typo in StorageSystemDataMotionStrategy * change copyVolume to use writeback to speed up copy ops * remove returning PrimaryStorageDownloadAnswer when connectPhysicalDisk returns false during KVMStorageProcessor template copy * remove change to only set UUID on snapshot if it is a vmSnapshot * reverting change to UserVmManagerImpl.configureCustomRootDiskSize * add error checking/simplification per comments from @slavkap * Update engine/storage/datamotion/src/main/java/org/apache/cloudstack/storage/motion/StorageSystemDataMotionStrategy.java Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com> * address PR comments from @sureshanaparti --------- Co-authored-by: GLOVER RENE <rg9975@cs419-mgmtserver.rg9975nprd.app.ecp.att.com> Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>	2024-06-25 10:35:39 +05:30
João Jandre	3e30283500	Fix migration from local storage to NFS in KVM (#8909 ) * fix migration from local to nfs * remove unused imports * remove dead code	2024-06-24 13:02:48 +05:30
Pearl Dsilva	f792684b9c	Support migration of VM imported from a remote host (#9259 )	2024-06-24 12:46:21 +05:30
Abhishek Kumar	3e6900ac1a	api,server: purge expunged resources (#8999 ) This PR introduces the functionality of purging removed DB entries for CloudStack entities (currently only for VirtualMachine). There are three mechanisms for purging removed resources: Background task - CloudStack will run a background task which runs at a defined interval. Other parameters for this task can be controlled with new global settings. API - New admin-only API purgeExpungedResources. It allows passing the following parameters - resourcetype, batchsize, startdate, enddate. Currently, the API is not supported in the UI. Config for service offering - Service offerings can be created with the purgeresources parameter, which allows purging resources immediately on expunge. The following new global settings have been added: expunged.resources.purge.enabled: Default: false. Whether to run a background task to purge the expunged resources. expunged.resources.purge.resources: Default: (empty). A comma-separated list of resource types that will be considered by the background task to purge the expunged resources. Currently only VirtualMachine is supported. An empty value will result in considering all resource types for purging. expunged.resources.purge.interval: Default: 86400. Interval (in seconds) for the background task to purge the expunged resources. expunged.resources.purge.delay: Default: 300. Initial delay (in seconds) to start the background task to purge the expunged resources. expunged.resources.purge.batch.size: Default: 50. Batch size to be used during expunged resources purging. expunged.resources.purge.start.time: Default: (empty). Start time to be used by the background task to purge the expunged resources. Use format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss. expunged.resources.purge.keep.past.days: Default: 30. The number of days in the past from the execution time of the background task to purge the expunged resources for which the expunged resources must not be purged. To enable purging expunged resources up to the execution time of the background task, set the value to zero. expunged.resource.purge.job.delay: Default: 180. Delay (in seconds) to execute the purging of an expunged resource initiated by the configuration in the offering. Minimum value should be 180 seconds and if a lower value is set then the minimum value will be used. Documentation PR: apache/cloudstack-documentation#397 Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com> Co-authored-by: Wei Zhou <weizhou@apache.org> Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>	2024-06-20 11:34:44 +05:30
Daan Hoogland	373f017002	Merge branch '4.19'	2024-06-18 19:58:43 +02:00
Harikrishna	bb0c1f93af	Add volume encryption checks during the disk offering change (#9209 )	2024-06-17 10:36:47 +02:00
Daan Hoogland	0d8f7d4003	Merge release branch 4.19 to main * 4.19: linstor: disconnect-disk also search for resource name in Linstor (#9035) ui: add support to change Account role for admins (#9012) Use parameter dcId as wrapper to prevent NPE (#8986)	2024-05-06 10:36:06 +02:00
dahn	e520525fe7	Use parameter dcId as wrapper to prevent NPE (#8986 )	2024-05-01 09:12:36 +02:00
Daan Hoogland	e61f3bae4d	Merge branch '4.19'	2024-04-29 11:37:40 +02:00
Vishesh	80a8b80a9d	Update volume's passphrase to null if diskOffering doesn't support encryption (#8904 )	2024-04-29 12:18:09 +05:30
João Jandre	8a101fbbc1	Updating pom.xml version numbers for release 4.18.3.0-SNAPSHOT Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>	2024-04-17 11:11:57 -03:00
João Jandre	154566f914	Updating pom.xml version numbers for release 4.18.2.0 Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>	2024-04-12 08:25:04 -03:00
Abhishek Kumar	02305fbc5f	Merge remote-tracking branch 'apache/4.19'	2024-04-04 17:36:05 +05:30
Abhishek Kumar	ff3e9bd821	engine-storage: control download redirection Add a global setting to control whether redirection is allowed while downloading templates and volumes Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2024-04-04 14:11:05 +05:30
Wei Zhou	939d0b9011	engine-storage: control download redirection Add a global setting to control whether redirection is allowed while downloading templates and volumes core: some changes on SimpleHttpMultiFileDownloader similar as HttpTemplateDownloader Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com> (cherry picked from commit b1642bc3bf58ccde9f56f632b5a9fe46a3eb5356) Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2024-04-04 11:19:20 +05:30
Abhishek Kumar	f36273888b	build: fix logger post forward-merge Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2024-03-01 18:14:54 +05:30
Abhishek Kumar	b29ec2bf12	Merge remote-tracking branch 'apache/4.19'	2024-03-01 17:40:58 +05:30
Harikrishna	c462be1412	New API "checkVolume" to check and repair any leaks or issues reported by qemu-img check (#8577 ) * Introduced a new API checkVolumeAndRepair that allows users or admins to check and repair if any leaks observed. Currently this is supported only for KVM * some fixes * Added unit tests * addressed review comments * add repair volume while granting access * Changed repair parameter to accept both leaks/all * Introduced new global setting volume.check.and.repair.before.use to do volume check and repair before VM start or volume attach operations * Added volume check and repair changes only during VM start and volume attach operations * Refactored the names to look similar across the code * Some code fixes * remove unused code * Renamed repair values * Fixed unit tests * changed version * Address review comments * Code refactored * used volume name in logs * Changed the API to Async and the setting scope to storage pool * Fixed exit value handling with check volume command * Fixed storage scope to the setting * Fix volume format issues * Refactored the log messages * Fix formatting	2024-02-29 14:41:49 +05:30
Daan Hoogland	3baa45bc2a	forward Merge branch '4.19' into main	2024-02-26 16:00:53 +01:00
Daan Hoogland	f4987bf8ee	Merge release branch 4.18 to 4.19 * 4.18: Storage plugin support to check if volume on datastore requires access for migration (#8655) CKS: fix /opt/bin/deploy-cloudstack-secret in CKS control nodes (#8697)	2024-02-26 15:53:11 +01:00
Suresh Kumar Anaparti	f731fe882c	Storage plugin support to check if volume on datastore requires access for migration (#8655 ) * Check if volume on datastore requires access for migration, and grant/revoke volume access if requires * Updated default implementation for requiresAccessForMigration method in PrimaryDataStoreDriver	2024-02-26 20:16:31 +05:30
Abhishek Kumar	592038a304	api,server,ui: granular resource limit management (#8362 ) Feature spec: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Granular+Resource+Limit+Management Introduces the concept of tagged resource limits for granular resource limit management. Limits can be enforced on accounts and domains for the deployment of entities for a tagged resource. Currently, tagged resource limits can be used for the following resource types: Host limits - user_vm - cpu - memory Storage limits - volume - primary_storage The following global settings can be used to specify tags for which limits need to be enforced: Host: `resource.limit.host.tags` Storage: `resource.limit.storage.tags` Options for specifying tagged resource limits and viewing tagged resource usage are made available in the UI. Enhances the use of templatetag for VM deployment and template creation. Adds an option to list service/compute offerings that can be used with a given template. A new parameter named templateid has been added. Adds an option to list disk offerings with a suitability flag for a virtual machine. A new parameter named virtualmachineid has been added to the listDiskOfferings API which, when passed, returns the suitableforvirtualmachine param in the response.	2024-02-19 14:17:34 +05:30
Wei Zhou	6af1c25f52	Merge remote-tracking branch 'apache/4.19'	2024-02-17 12:30:40 +01:00
GaOrtiga	6f3e4e6302	fix_filter_and_pagination (#8306 ) Co-authored-by: Gabriel <gabriel.fernandes@scclouds.com.br>	2024-02-16 11:15:55 +01:00
João Jandre	49cecaed06	Normalize loggers and upgrade log4j 1.2 to log4j 2.19 (#7131 ) * Normalize logs All classes that could have their loggers inherited from their fathers had their own loggers deleted; Most loggers didn't have to be static, so most of them were normalized so that they wouldn't be; All loggers are protected now; Static logger's name are now 'LOGGER'; Non-static logger's name are now 'logger'; New class DbUpgradeAbstractImpl created so that all Upgraders extend it and inherit its logger * Upgrade log4j * fix errors caused by the merge * Refactor cglibThrowableRenderer functionality to log4j2 and upgrade the last configuration files * fix sonarcloud bug * Fix errors caused by merge, remove some unused loggers, and rename a variable that was mistakenly renamed on the normalization commit * Readd snmpTrapAppender, remove TestAppender * Regenerate changes * regenerate changes * refactor last custom appender * fix systemvm configuration xml * Regenerate changes * Regenerate changes * regenerate changes * Regenerate changes * regenerate changes * regenerate changes * regenerate changes * Fix utils pom * fix some tests * regenerate changes * Fix jar being printed on exception * fix logging in system VMs, fix commands not having log4j2 classpath. 
* regenerate changes * Fix some unwanted renomeations * fix end of file * regenerate changes * regenerate changes * fix merge error * regenerate changes * fix tests * regenerate changes * regenerate changes * regenerate changes * regenerate changes * regenerate changes * regenerate changes * regenerate changes * readd reload4j to tungsten as juniper depends on it * Regenerate changes * regenerate changes * regenerate changes * regenerate changes * regenerate changes * re-add reload4j dependency to network-contrail, as juniper depends on it * regenerate changes * regenerate changes * regenerate changes * fix typo * regenerate changes * regenerate changes * Fix end of files * regenerate changes * add logj42 to cloud-utils-SHADED.jar * regenerate changes * regenerate changes * regenerate changes * regenerate changes * regenerate changes * regenerate changes * regenerate changes * regenerate changes * Regenerate changes * Regenerate changes * Regenerate changes * regenerate changes * Regenerate changes * regenerate changes * Regenerate changes * Regenerate changes * Regenerate changes * regenerate changes * Regenerate changes * Regenerate changes * fix some tests * Regenerate changes * Regenerate changes * fix test * Regenerate changes * Regenerate changes	2024-02-08 09:55:41 -03:00
Vishesh	399bd0a067	Upgrade to mockito 4 and handle Mockito deprecations (#8427 )	2024-02-06 14:20:37 +01:00
Suresh Kumar Anaparti	8ea9fc911d	StoragePoolType as class (#8544 ) * StoragePoolType as a class * Fix agent side StoragePoolType enum to class * Handle StoragePoolType for StoragePoolJoinVO * Since StoragePoolType is a class, it cannot be converted by @Enumerated annotation. Implemented converter class and logic to utilize @Convert annotation. * Fix UserVMJoinVO for StoragePoolType * fixed missing imports * Since StoragePoolType is a class, it cannot be converted by @Enumerated annotation. Implemented converter class and logic to utilize @Convert annotation. * Fixed equals for the enum. * removed not needed try/catch for prepareAttribute * Added license to the file. * Implemented "supportsPhysicalDiskCopy" for storage adaptor. Co-authored-by: mprokopchuk <mprokopchuk@apple.com> * Add javadoc to StoragePoolType class * Add unit test for StoragePoolType comparisons * StoragePoolType "==" and ".equals()" fix. * Fix StoragePoolType for FiberChannelAdapter * Fix for abstract storage adaptor set up issue * review comments * Pass StoragePoolType object for poolType dao attribute --------- Co-authored-by: Marcus Sorensen <mls@apple.com> Co-authored-by: mprokopchuk <mprokopchuk@apple.com> Co-authored-by: mprokopchuk <mprokopchuk@gmail.com>	2024-02-05 13:27:15 +05:30
Abhishek Kumar	7dffbc6e47	Updating pom.xml version numbers for release 4.20.0.0-SNAPSHOT Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2024-02-02 18:16:37 +05:30
Abhishek Kumar	a7b97ff3b0	Updating pom.xml version numbers for release 4.19.1.0-SNAPSHOT Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2024-02-02 18:06:04 +05:30
Abhishek Kumar	2746225b99	Updating pom.xml version numbers for release 4.19.0.0 Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2024-01-29 10:21:52 +05:30
Vishesh	fedcf66de0	Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547 ) This PR fixes a bug introduced in #8502. The timeout for script execution was set to 60 ms instead of 60 s, which resulted in the host not getting UEFI enabled. This is a blocker for the 4.19 release. We fix this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as the timeout for the script that checks the host's UEFI status. We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds). For ModifyStoragePoolCommand, we don't externalize the timeout, to avoid confusing the user, since the required timeout can vary depending on the provider in use and we are only setting the wait for the default host listener for now. Instead, we reuse the global `wait` setting by dividing it by `5`, making the default value 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand. Note: the actual time the MS waits is twice the wait set for a Command. Check the reference code below. `19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)`	2024-01-27 23:36:13 +05:30
kishankavala	80bbb29abf	CleanUp Async Jobs after mgmt server maintenance (#8394 ) This PR moves resources stuck in a transition state during async job cleanup. Problem: During maintenance of the management server, other servers in the cluster, or the same server after a restart, initiate async job cleanup. However, this process leaves resources in a transitional state. The only recovery option currently available is to make direct database changes. Solution: This PR introduces a resolution by moving Volume, Virtual Machine, and Network resources out of their transitional states. This adjustment enables failed operations to be reattempted without the need for manual database modifications.	2024-01-19 13:26:25 +05:30
Vishesh	c3b77cb7b8	Fix host stuck in connecting state (#8502 ) There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs. #8438 (comment) #8433 (comment) #7344 (comment) While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout). On the agent side, it gets stuck waiting for a response from the Script execution. To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query. SELECT * FROM performance_schema.metadata_locks INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID WHERE PROCESSLIST_ID <> CONNECTION_ID() \G; This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.	2024-01-15 13:56:34 +05:30
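
The timeout arithmetic described in commit fedcf66de0 above can be checked with a short sketch. This is only an illustration of the numbers quoted in that commit message (the 1800-second default of the global `wait` setting, the division by 5, the doubling of a Command's wait on the management server, and the 60 ms versus 60 s mix-up). All function and constant names here are hypothetical and are not CloudStack identifiers.

```python
# Illustrative arithmetic only; these names are assumptions, not CloudStack code.

GLOBAL_WAIT_SECONDS = 1800        # default of the global `wait` setting, per the commit message
MODIFY_POOL_DIVISOR = 5           # `wait` is divided by 5 for ModifyStoragePoolCommand
READY_COMMAND_WAIT_SECONDS = 60   # default of `ready.command.wait`


def modify_storage_pool_wait(global_wait: int = GLOBAL_WAIT_SECONDS) -> int:
    """Wait (in seconds) set on ModifyStoragePoolCommand: global wait / 5."""
    return global_wait // MODIFY_POOL_DIVISOR


def effective_ms_wait(command_wait: int) -> int:
    """Time the management server actually waits: twice the Command's wait."""
    return 2 * command_wait


def milliseconds_to_seconds(milliseconds: int) -> float:
    """Convert milliseconds to seconds, to show the size of the 60 ms bug."""
    return milliseconds / 1000


if __name__ == "__main__":
    pool_wait = modify_storage_pool_wait()
    print("ModifyStoragePoolCommand wait (s):", pool_wait)
    print("Effective MS wait for it (s):", effective_ms_wait(pool_wait))
    print("Effective MS wait for ReadyCommand (s):",
          effective_ms_wait(READY_COMMAND_WAIT_SECONDS))
    print("Buggy script timeout (s):", milliseconds_to_seconds(60))
```

Under these assumptions the buggy script timeout was a thousand times shorter than the intended 60 seconds, which is consistent with the UEFI check never completing.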


1163 Commits