Ansible: Maintain Consistent Production Deployments Using Bamboo and Ansible Tower

Before I start, here is the problem we are trying to solve.

Problem Statement:

  • By default, when Ansible deploys to a group of servers, no failure is reported even if some of the hosts are unreachable.
  • By default, when a task fails on one server in the group, Ansible does not fail the deployment; it proceeds with the remaining servers.

Current Toolset:

  • Bamboo: our CI server, used to trigger deployments on Ansible Tower. (We created a custom Bamboo plugin to support Ansible Tower deployments.)
  • Ansible Tower: a web-based application that acts as a hub for all automation tasks.

In my scenario, we use Ansible to deploy to groups of Windows/Linux servers. The production servers are organized in groups, with a load balancer in front of them to manage the traffic. Since Bamboo is our CI server, we initiate the deployment from Bamboo, which calls an API on Ansible Tower to start it.

If Ansible finds a server unreachable, it skips all the tasks on that server and still completes the deployment. At the end it shows a deployment summary listing the servers it deployed to, the servers where the deployment completed, and the tasks that failed. In Bamboo we get a green status, which represents a successful deployment. This leaves the production servers running inconsistent application versions, which can cause a lot of business issues.

How can we solve this problem?

  • The deployment should fail if there are any unreachable servers while doing deployments.
  • The deployment should fail if any task fails on any of the servers specified in the group.
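
The two rules above can be written down as a small check over per-host results. This is only a sketch to make the requirement precise: the `HostResult` type and both function names are hypothetical and are not part of Ansible, Tower, or Bamboo.

```python
from dataclasses import dataclass


@dataclass
class HostResult:
    """Hypothetical per-host outcome of one deployment run."""
    name: str
    unreachable: bool = False
    failed_tasks: int = 0


def deployment_problems(results):
    """Return the reasons why this deployment must be marked as failed."""
    problems = []
    for host in results:
        if host.unreachable:
            # Rule 1: any unreachable server fails the deployment.
            problems.append(f"{host.name}: unreachable")
        elif host.failed_tasks > 0:
            # Rule 2: any failed task on any server fails the deployment.
            problems.append(f"{host.name}: {host.failed_tasks} failed task(s)")
    return problems


def deployment_succeeded(results):
    """A deployment is green only when no host reported a problem."""
    return not deployment_problems(results)
```

With this definition, a run is green only when every server in the group was reached and ran every task without failure, which is the behaviour we want from the pipeline.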

How can we achieve this with Ansible?

Ansible has a few configuration options for this. Let’s see how they help us achieve it.

MAX_FAIL_PERCENTAGE: By default, Ansible will continue executing actions as long as there are hosts in the group that have not yet failed. In some situations, such as rolling updates, it may be desirable to abort the play when a certain threshold of failures has been reached.

- hosts: webservers
  max_fail_percentage: 30
  serial: 10

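In the above example, if more than 3 of the 10 servers in the batch were to fail, the rest of the play would be aborted. The percentage must be exceeded, not just reached.

  • max_fail_percentage does not count unreachable servers, so it cannot detect them.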
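
To make the threshold arithmetic concrete, here is a minimal model of the rule as described above. It is a sketch, not Ansible's own code: the function name `play_aborts` is made up for illustration, and it counts only failed hosts, in line with the note that unreachable servers are not detected.

```python
def play_aborts(failed_hosts: int, batch_size: int, max_fail_percentage: int) -> bool:
    """Return True when the share of failed hosts exceeds the threshold.

    Integer arithmetic is used so that a boundary case such as exactly 30%
    is not affected by floating-point rounding.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # failed / batch * 100 > threshold, rearranged to avoid division.
    return failed_hosts * 100 > max_fail_percentage * batch_size


if __name__ == "__main__":
    print(play_aborts(3, batch_size=10, max_fail_percentage=30))  # False: 30% is reached, not exceeded
    print(play_aborts(4, batch_size=10, max_fail_percentage=30))  # True: 40% exceeds 30%
    print(play_aborts(1, batch_size=10, max_fail_percentage=0))   # True: any failure exceeds 0%
```

The last case is why setting the percentage to 0 turns a single task failure into an aborted play.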

ANY_ERRORS_FATAL: This aborts the playbook on any failure. It also detects unreachable servers and marks the playbook as failed.

Let’s see how we can use these in our playbook. This is the default playbook, and below is its output.

---
- hosts: webservers
  tasks:
    - name: Copy File from one location to another
      copy:
        src: /tmp/file1
        dest: /tmp/file2
        remote_src: True
    - name: debug
      debug:
        msg: " Hey i still ran "
[Screenshot: default output of the playbook without any additional configuration]

I have created a simple playbook to show how we can solve the problems mentioned above. Let’s add the following content to it. First we will solve one of the issues listed above: the playbook should abort if any task fails on any server in the group.

---
- hosts: webservers
  max_fail_percentage: 0
  tasks:
    - name: Copy File from one location to another
      copy:
        src: /tmp/file1
        dest: /tmp/file2
        remote_src: True
    - name: debug
      debug:
        msg: " Hey i still ran "

Here we have added “max_fail_percentage” and set it to 0, which means any task failure will abort the play.

ansible-playbook -i hosts site.yml 
[Screenshot: output of the playbook. The part marked in green shows that the play continues despite an unreachable server, and aborts when a task fails.]

Notice that there is 1 unreachable server and Ansible still continues to run the playbook. However, it fails the play when it identifies a task failure on one server, because “max_fail_percentage” is set to 0. This solves one of our problems, but we also wanted to abort the deployment if there are any unreachable servers. To accomplish this, let’s modify our playbook and set one more option, “any_errors_fatal”, to True.

---
- hosts: webservers
  any_errors_fatal: True
  max_fail_percentage: 0
  tasks:
    - name: Copy File from one location to another
      copy:
        src: /tmp/file1
        dest: /tmp/file2
        remote_src: True
    - name: debug
      debug:
        msg: " Hey i still ran "

So here we set “any_errors_fatal” to True. Let’s run our play.

ansible-playbook -i hosts site.yml
[Screenshot: as soon as Ansible identifies unreachable servers, it aborts the play and marks it as failed.]

Here, Ansible aborts as soon as it identifies unreachable servers and marks the playbook as failed. This way we have resolved all the problems stated above.

Now whenever we run a deployment from Bamboo, it triggers the Ansible Tower deployment, which fails if it hits any of these conditions. This way we maintain consistent deployments for all application teams.
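
A CI step that reacts to the Tower result could look roughly like the sketch below. It assumes the job record returned by Tower's REST API carries a `status` string (with `successful` as the only good outcome) and a `failed` boolean. The function name `build_exit_code` is hypothetical, and this is not the custom Bamboo plugin mentioned earlier.

```python
def build_exit_code(job: dict) -> int:
    """Map a Tower job record to a process exit code for the CI server.

    Assumption: `job` is the decoded JSON of a Tower job, with a `status`
    string and a `failed` boolean. Anything other than a clean success,
    including a missing field, is treated as a failed build.
    """
    finished_ok = job.get("status") == "successful"
    flagged_failed = job.get("failed", True)
    return 0 if finished_ok and not flagged_failed else 1
```

A build script would end by passing this value to `sys.exit`, so that Bamboo turns red whenever the play was aborted by “any_errors_fatal” or “max_fail_percentage”.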

I hope this helps. If you have a better solution, please comment.

DevOps Consultant, ALM Consultant, CloudOps Consultant

Abhijeet Kamble
