agent: Fixes #2633 don't wait for pending tasks on reconnection (#2638)

When the agent loses its connection to the management server, the
reconnection logic waits for any pending tasks to finish. However, when
such tasks do finish, they fail to send an `Answer` back to the
management server. From the management server's perspective, such
pending operations are therefore stuck in an FSM state and need manual
removal or fixing. This is by design: the management server's side of
the cmd-answer request pattern is code/execution dependent, so even if
the answer were sent after the management server came back up
(reconnected), the management server would fail to acknowledge and
process it, either because the listeners are missing or because it is
not in the exact state needed to handle answers.

Historically, the Agent would not reconnect until its internal tasks
had completed, but I found no reason why reconnection should wait for
them at all.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Rohit Yadav 2018-05-16 15:35:00 +05:30 committed by GitHub
parent f663b926c7
commit d893fb5b00

@@ -495,19 +495,7 @@ public class Agent implements HandlerFactory, IAgentControl {
         _resource.disconnected();
-        final String lastConnectedHost = _shell.getConnectedHost();
-        int inProgress = 0;
-        do {
-            _shell.getBackoffAlgorithm().waitBeforeRetry();
-            s_logger.info("Lost connection to host: " + lastConnectedHost + ". Dealing with the remaining commands...");
-            inProgress = _inProgress.get();
-            if (inProgress > 0) {
-                s_logger.info("Cannot connect because we still have " + inProgress + " commands in progress.");
-            }
-        } while (inProgress > 0);
+        s_logger.info("Lost connection to host: " + _shell.getConnectedHost() + ". Attempting reconnection while we still have " + _inProgress.get() + " commands in progress.");
         _connection.stop();
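
The behavioural difference in this hunk can be illustrated with a small standalone sketch. It is not the Agent code: the class `ReconnectSketch`, its method names, and the `maxPolls` cap are hypothetical and exist only so the example terminates. The removed loop had no such cap and slept via the backoff algorithm between polls.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in the Agent class.
public class ReconnectSketch {

    // Mirrors the removed loop: keep polling while the in-progress counter
    // is above zero. Returns how many polls were made. The maxPolls cap is
    // an assumption added for this sketch so that it always terminates.
    static int pollsBeforeReconnect(AtomicInteger inProgress, int maxPolls) {
        int polls = 0;
        int pending;
        do {
            polls++;
            pending = inProgress.get();
        } while (pending > 0 && polls < maxPolls);
        return polls;
    }

    // Mirrors the added line: build the log message and proceed straight
    // away, whatever the number of commands still in progress.
    static String reconnectMessage(String host, AtomicInteger inProgress) {
        return "Lost connection to host: " + host
                + ". Attempting reconnection while we still have "
                + inProgress.get() + " commands in progress.";
    }

    public static void main(String[] args) {
        AtomicInteger stuck = new AtomicInteger(2);
        // With tasks that never drain, the old approach uses up every poll.
        System.out.println(pollsBeforeReconnect(stuck, 5));
        // The new approach only reports the count and moves on.
        System.out.println(reconnectMessage("mgmt-host", stuck));
    }
}
```

With a counter that never reaches zero, the polling variant exhausts its cap, which in the uncapped original meant the agent never reconnected. The second method only reports the pending count and lets the caller continue.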