Connection aborted issues

Problem

The following error occurs while running a playbook (deploying services or infrastructure):

TASK [consul-env : register instance type] *************************************
fatal: [52.49.58.215]: FAILED! => {"changed": false, "failed": true, "msg": "Could not connect to consul agent at localhost:11580, error was ('Connection aborted.', BadStatusLine(\"''\",))"}

Solution

This normally means that consul is experiencing a cluster rift caused by a docker conntrack issue. To fix it, upgrade to the latest docker-agent and consul, then redeploy these services.
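
Before redeploying, it can help to confirm that the cluster really is split. The sketch below counts members that the local agent does not report as alive. It assumes the container is named consul (as in the commands on this page), that the consul binary is available inside it, and that the Status value is the third column of the `consul members` output. The helper name count_unhealthy_members is hypothetical.

```shell
# Hypothetical helper: counts members whose Status column (third field)
# is anything other than "alive". Reads `consul members` output on stdin
# and skips the header row.
count_unhealthy_members() {
  awk 'NR > 1 && NF >= 3 && $3 != "alive" { n++ } END { print n + 0 }'
}

# Usage on an instance (assumption: container named "consul"):
#   sudo docker exec consul consul members | count_unhealthy_members
```

A non-zero count, or member lists that differ between machines, is consistent with a rift. Depending on the consul version and the agent's port settings, `consul members` may need an explicit address flag to reach the agent.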

  • Install the latest docker-agent (Athena shell)
athena-services docker-agent
  • Install the latest consul (Athena shell)
athena-services consul
  • Stop consul on all machines except the AccessGateway instance, in particular on the Backoffice, Public, Internal, and Exchange machines (Instance shell)
sudo docker stop consul
sudo docker stop consul && sudo docker run -it --volumes-from consul-data busybox sh -c 'rm -rf /var/consul/*' && sudo docker start consul
  • Initialize AccessGateway consul (Athena shell)
athena-infrastructure vpc,consul
  • Restart consul on all nodes (Athena shell)
athena-services consul
  • To make sure that all services are properly re-registered, run the following (assuming that all services in use are listed in services-.roles) (Athena shell)
athena-services
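
After the steps above, one way to check that the cluster has recovered is to ask the local agent whether a leader is elected. This is a minimal sketch, not part of the Athena tooling: it assumes the agent's HTTP API is reachable on localhost:11580 (the port shown in the error message) and uses Consul's /v1/status/leader endpoint, which returns a JSON string with the leader address, or an empty string when there is no leader. The helper name has_leader is hypothetical.

```shell
# Hypothetical helper: succeeds when the /v1/status/leader response on
# stdin names a leader, e.g. "10.1.10.12:8300"; fails on "" (no leader).
has_leader() {
  leader_body=$(tr -d '"[:space:]')
  [ -n "$leader_body" ]
}

# Usage on an instance (assumption: agent HTTP API on localhost:11580):
#   curl -s http://localhost:11580/v1/status/leader | has_leader && echo "leader elected"
```

If no leader is reported, or curl still fails with a connection error, the rift has probably not been resolved and the steps may need to be repeated.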