mirror of
https://github.com/apache/cloudstack.git
synced 2025-11-03 04:12:31 +01:00
Merge pull request #863 from borisroman/CLOUDSTACK-8883
[4.6][BLOCKER] CLOUDSTACK-8883: Resolved connect/reconnect issue.

Hi! @wilderrodrigues, by implementing Callable you switched a couple of methods and fields. I switched them some more!

The Agent wouldn't reconnect for two reasons.

Problem 1: Selector was blocking.
In the while loop at [1], _selector.select(); blocked when the connection was lost. This means that _isStartup = false; at [2] was never executed. Therefore the call to isStartup() at [3] always returned true, resulting in an infinite loop.

Resolution 1: Move the call to cleanUp() [4] before checking whether isStartup() has turned false. cleanUp() will close() the _selector, which results in _isStartup being set to false.

Problem 2: _isStartup and _isRunning were set to true even when init() threw a ConnectException.
The exception was caught, but only logged. No further action was taken, so _isStartup and _isRunning were still set to true and the Agent thought it had connected successfully when it hadn't.

Resolution 2: Add a return to the catch statement [5]. This way _isStartup and _isRunning aren't set to true.

Steps to test:
1. Deploy ACS.
2. Try all combinations of stopping/starting the management server/agent.

[1] b34f86c8d5/utils/src/main/java/com/cloud/utils/nio/NioConnection.java (L128)
[2] b34f86c8d5/utils/src/main/java/com/cloud/utils/nio/NioConnection.java (L176)
[3] b34f86c8d5/agent/src/com/cloud/agent/Agent.java (L404)
[4] b34f86c8d5/agent/src/com/cloud/agent/Agent.java (L399)
[5] b34f86c8d5/utils/src/main/java/com/cloud/utils/nio/NioConnection.java (L91)

* pr/863:
  Added return statement to stop start() if there has been a ConnectException.
  Call cleanUp() before looping isStartup().

Signed-off-by: Rajani Karuturi <rajani.karuturi@citrix.com>
commit 1a474374b9
@@ -394,15 +394,17 @@ public class Agent implements HandlerFactory, IAgentControl {
         } while (inProgress > 0);
 
         _connection.stop();
-        while (_connection.isStartup()) {
-            _shell.getBackoffAlgorithm().waitBeforeRetry();
-        }
 
         try {
             _connection.cleanUp();
         } catch (final IOException e) {
             s_logger.warn("Fail to clean up old connection. " + e);
         }
+
+        while (_connection.isStartup()) {
+            _shell.getBackoffAlgorithm().waitBeforeRetry();
+        }
+
         _connection = new NioClient("Agent", _shell.getHost(), _shell.getPort(), _shell.getWorkers(), this);
         do {
             s_logger.info("Reconnecting...");
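The reordering in the Agent hunk relies on a documented property of java.nio.channels.Selector: closing a selector wakes any thread blocked in select(). The following is a minimal, self-contained sketch of that behaviour only. The class SelectorCloseDemo, the method startupFlagAfterClose() and the isStartup flag are hypothetical stand-ins for illustration; they are not the NioConnection code.

```java
import java.io.IOException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.Selector;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; it only mimics the shape of the selector loop.
public class SelectorCloseDemo {

    /** Returns the "startup" flag as seen after the selector has been closed. */
    public static boolean startupFlagAfterClose() throws Exception {
        final Selector selector = Selector.open();
        final AtomicBoolean isStartup = new AtomicBoolean(true);
        final CountDownLatch entered = new CountDownLatch(1);

        Thread loop = new Thread(() -> {
            try {
                entered.countDown();
                while (selector.isOpen()) {
                    // No channels are registered, so this blocks until
                    // the selector is woken up or closed.
                    selector.select();
                }
            } catch (ClosedSelectorException | IOException e) {
                // Expected if select() is called on an already closed selector.
            } finally {
                // Stand-in for the "_isStartup = false" step after the loop.
                isStartup.set(false);
            }
        });
        loop.start();

        entered.await();
        Thread.sleep(200); // give the loop thread time to block in select()

        // Stand-in for cleanUp(): closing the selector releases the blocked thread.
        selector.close();
        loop.join(5000);
        return isStartup.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("isStartup after close: " + startupFlagAfterClose());
    }
}
```

If the close() call were placed after a loop that waits for the flag to clear, the waiting thread would never get past that loop, which is the ordering problem the hunk corrects.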
@@ -88,6 +88,7 @@ public abstract class NioConnection implements Callable<Boolean> {
             init();
         } catch (final ConnectException e) {
             s_logger.warn("Unable to connect to remote: is there a server running on port " + _port);
+            return;
         } catch (final IOException e) {
             s_logger.error("Unable to initialize the threads.", e);
             throw new NioConnectionException(e.getMessage(), e);
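The effect of the added return in the NioConnection hunk can be illustrated in isolation. This is a hedged sketch, not the real class: EarlyReturnDemo, its nested Connection class and the serverUp switch are hypothetical, and the state flags are plain fields so the outcome is easy to inspect.

```java
import java.net.ConnectException;

// Hypothetical demo class showing why the early return matters.
public class EarlyReturnDemo {

    static class Connection {
        private final boolean serverUp; // assumption: simulates whether the remote is reachable
        boolean isStartup;
        boolean isRunning;

        Connection(boolean serverUp) {
            this.serverUp = serverUp;
        }

        // Stand-in for init(): fails the way a refused TCP connection would.
        void init() throws ConnectException {
            if (!serverUp) {
                throw new ConnectException("Connection refused");
            }
        }

        void start() {
            try {
                init();
            } catch (ConnectException e) {
                System.out.println("Unable to connect to remote: " + e.getMessage());
                // Without this return, execution would fall through and
                // mark the connection as started and running.
                return;
            }
            isStartup = true;
            isRunning = true;
        }
    }

    public static void main(String[] args) {
        Connection down = new Connection(false);
        down.start();
        System.out.println("server down: isStartup=" + down.isStartup + ", isRunning=" + down.isRunning);

        Connection up = new Connection(true);
        up.start();
        System.out.println("server up: isStartup=" + up.isStartup + ", isRunning=" + up.isRunning);
    }
}
```

With the return in place, a failed init() leaves both flags false, so a caller polling them does not mistake a refused connection for a live one.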