On actual testing, I could see that the kvmheartbeat.sh script fails on NFS
server failure and only stops the agent. HA-enabled VMs could then be launched
on other hosts, and recovery of the NFS server could lead to a state
where an HA-enabled VM runs on two hosts at once, potentially causing
disk corruption. In most cases, VM disk corruption is worse than
VM downtime. I've kept the sleep interval between check rounds but
reduced it to 10s. The change in behaviour was introduced in #2722.
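A hedged sketch of the intended behaviour, not the actual kvmheartbeat.sh: the pool path, retry count, sleep and the write_heartbeat helper below are illustrative assumptions. The point is to retry the heartbeat write with a 10s sleep between rounds and, when all retries fail, reset the host instead of only stopping the agent:

```bash
#!/bin/bash
# Hypothetical sketch only; paths, retry count and helper names are assumptions.
HB_FILE=/mnt/primary/KVMHA/hb-$(hostname)   # assumed heartbeat file location
RETRIES=5
SLEEP=10                                    # reduced sleep between check rounds

write_heartbeat() {
    # Write a timestamp to the heartbeat file on the NFS pool, bounded by a
    # timeout so a hung NFS mount cannot block the check forever.
    timeout 5 bash -c "date +%s > '$HB_FILE'"
}

for i in $(seq 1 "$RETRIES"); do
    if write_heartbeat; then
        exit 0                              # storage reachable, nothing to do
    fi
    logger -t heartbeat "heartbeat write failed, retry $i/$RETRIES"
    sleep "$SLEEP"
done

# All retries failed: reset the host instead of only stopping the agent, so an
# HA-enabled VM cannot end up running on two hosts once the NFS server recovers.
echo b > /proc/sysrq-trigger
```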
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
The recently discussed improvement carries the risk that if 'sync' hangs, the reboot may be delayed in the same way the 'reboot' command would be. To work around this, we're adding a 5 second timeout: if it cannot sync within 5 seconds, it will not succeed anyway and we should proceed with the reset.
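A minimal sketch of that workaround, assuming GNU coreutils' timeout is available on the host; the sysrq reset line mirrors the immediate-reset change described further down:

```bash
# Try to flush dirty buffers, but give up after 5 seconds: if sync hangs
# (e.g. against a dead NFS server) it will not succeed anyway.
timeout -s KILL 5 sync

# Then reset the box immediately rather than going through a normal reboot.
echo b > /proc/sysrq-trigger
```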
@snuf: Could we use your OVM3 heartbeat script for other hypervisors as well? Having one way to do it seems like a nice idea :-)
As discussed with @wido, @pyr and @nuxro, I added an extra log line.
Tested it and it logs fine (tested to local disk) when syncing first:
Apr 3 15:31:23 mcctest2 heartbeat: kvmheartbeat.sh rebooted system because it was unable to write the heartbeat to the storage
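A sketch of how such a line can be emitted from the script; the exact wording and tag in the real kvmheartbeat.sh may differ slightly:

```bash
# Log to syslog just before resetting, so the reason for the reboot ends up in
# the system log even when the host's agent.log is harder to get at afterwards.
logger -t heartbeat "kvmheartbeat.sh rebooted system because it was unable to write the heartbeat to the storage"
```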
By the way, it also logged to the agent.log, but this extra line has the benefit of ending up in the system log, so you'll probably find it more easily there. Existing agent.log entries:
2015-04-03 15:27:23,943 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 0
2015-04-03 15:28:23,944 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 1
2015-04-03 15:29:23,946 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 2
2015-04-03 15:30:23,948 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 3
2015-04-03 15:31:23,950 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 4
2015-04-03 15:31:23,950 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout; reboot the host
This closes #145
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When storage cannot be reached, it does not make sense to do a normal reboot, as it will try to flush buffers, unmount NFS mounts, etc. This will not work and thus causes a long delay. With this change, the box reboots immediately (like pressing the reset button).
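A minimal sketch of such an immediate reset via the kernel's SysRq interface, assuming SysRq can be enabled on the host:

```bash
# Make sure the SysRq trigger accepts commands.
echo 1 > /proc/sys/kernel/sysrq

# Reset the machine immediately: no buffer flush, no unmounting of the
# unreachable NFS mounts, no service shutdown -- like pressing the reset button.
echo b > /proc/sysrq-trigger
```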
It takes a long time to fence off a host if there are multiple storage pools in a cluster.
The issue is as follows:
1. When CloudStack detects that a host is not responding to ping
requests, it sends a fence command for this host to another host in the
cluster.
2. The agent takes a long time to respond to this fence request. This is
because the agent checks whether the first host is still writing to its
heartbeat file on all pools in the cluster, and it does this sequentially
for every storage pool (a sketch of this check follows the list).
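To make the per-pool check concrete, here is a hedged sketch of what such a heartbeat-freshness check could look like; the pool paths, heartbeat file name, staleness threshold and host name are illustrative assumptions, not the actual agent code:

```bash
# Hypothetical illustration: for the host being fenced, check whether its
# heartbeat file was updated recently on each primary storage pool.
SUSPECT_HOST=host1                       # assumed name of the host to fence
STALE_AFTER=120                          # seconds before a heartbeat counts as stale
NOW=$(date +%s)

for pool in /mnt/pool-*; do              # one mount point per primary storage pool
    hb="$pool/KVMHA/hb-$SUSPECT_HOST"    # assumed heartbeat file location
    mtime=$(timeout 5 stat -c %Y "$hb" 2>/dev/null || echo 0)
    if [ $((NOW - mtime)) -lt "$STALE_AFTER" ]; then
        echo "alive"                     # suspect host still writes its heartbeat
        exit 0
    fi
done
echo "dead"                              # no fresh heartbeat found on any pool
```

With sleeps and waits between retries on every pool, the total response time grows with the number of pools in the cluster, which is what delays the fencing decision.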
The fix gets rid of the sleep/wait during HA. The behavior is now
similar to XenServer.
RB: https://reviews.apache.org/r/6133/
Send-by: devdeep.singh@citrix.com