JIRA is a great tool for Scrum/Kanban, and it can be used as a tracking tool for any set of activities that have a defined workflow. Over the last 2-3 years Atlassian has introduced a lot of new features, the latest being Rapid Board, Workflow Designer and many more, which are worth exploring for anyone looking for a ticketing, bug-tracking or project-management tool. In this post I will only talk about one specific problem related to deployment and high availability of JIRA as a service, and how I solved it.

Problem

For one of my clients, I migrated their old JIRA 3.13 installation to JIRA 4.3.2, and during that period I investigated options for clustering JIRA to provide high availability and scalability. One really annoying thing about JIRA is that it doesn't provide any option for clustering, at either the application level or the operating-system level. So you can't have a setup in which JIRA is installed on two servers with a load balancer forwarding requests based on workload.

So the only option for scalability is to scale vertically. For high availability, however, I tried some tricks using GlusterFS and Heartbeat, which now give me a passive node on which JIRA is started automatically when the active JIRA node goes down.

Solution

The trick was to use GlusterFS as the filesystem to replicate data between two servers, so that all the attachments and other application data are always replicated in real time on two physical servers. This approach is better than taking daily or hourly backups and, when the server goes down, starting a new server from the latest backup, because that would mean a downtime of at least a few hours and might also involve some data loss.


The remaining issue was that only one instance of JIRA can be up at any moment, and we used Heartbeat to handle that part. Heartbeat is a daemon which runs on both servers and watches for the presence (or disappearance) of the other peer. We configure one peer as the master, which initially runs the JIRA service. If that server goes down, Heartbeat on the other node notices, immediately starts JIRA on the slave/passive node and makes it the new master.
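To make the peer-watching idea concrete, here is a minimal sketch in Python of how a passive node might decide that its peer has disappeared. This is a toy model, not Heartbeat's actual implementation: the `PeerMonitor` class, the `deadtime` value of 10 seconds and the timestamps are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Toy model of a passive node watching its active peer (hypothetical)."""
    deadtime: float          # seconds of silence before the peer is declared dead
    last_seen: float = 0.0   # timestamp of the most recent heartbeat received
    active: bool = False     # whether this node currently runs the service

    def on_heartbeat(self, now: float) -> None:
        # Record that the peer was alive at time `now`.
        self.last_seen = now

    def tick(self, now: float) -> bool:
        """Check the peer; return True only if this call triggered a takeover."""
        if not self.active and now - self.last_seen > self.deadtime:
            self.active = True   # this is where JIRA would be started
            return True
        return False


# The peer sends heartbeats at t=0, 2, 4 and then goes silent.
monitor = PeerMonitor(deadtime=10.0)
for t in (0.0, 2.0, 4.0):
    monitor.on_heartbeat(t)

# The passive node checks at t=6, 12, 15, 20; only t=15 exceeds the deadtime.
events = [(t, monitor.tick(t)) for t in (6.0, 12.0, 15.0, 20.0)]
print(events)
```

The takeover fires exactly once, at the first check after the silence exceeds `deadtime`. Choosing that threshold is a trade-off: too short and a brief network hiccup triggers a needless failover, too long and the downtime grows.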

So the GlusterFS + Heartbeat setup makes sure the data is synchronized between both machines, and as soon as one machine goes down, the JIRA service is started on the other machine within seconds. Along with that, we run JIRA on a virtual IP address (VIP), so Heartbeat also makes sure that when the JIRA service is started on the slave node, that node is assigned the same VIP which was initially held by the master server.
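The way the VIP and the JIRA service move together can be sketched as a small state model. Everything here is hypothetical (the `Node` class, the resource names and the helper functions), and the ordering, address first and service second, is an assumption of this sketch rather than something stated above.

```python
class Node:
    """Toy cluster node that holds a list of resources (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.resources = []   # resources currently held, in acquisition order


# Assumed order: take the virtual IP first, then start the service on it.
RESOURCE_GROUP = ["vip", "jira"]


def acquire(node):
    # Bring up each resource in order on the given node.
    for resource in RESOURCE_GROUP:
        node.resources.append(resource)


def release(node):
    # Give resources up in reverse order: stop the service, then drop the VIP.
    while node.resources:
        node.resources.pop()


def fail_over(failed, standby):
    # When a node dies its resources disappear with it; the standby takes over.
    failed.alive = False
    release(failed)
    acquire(standby)
    return standby


master, slave = Node("jira1"), Node("jira2")
acquire(master)
new_master = fail_over(master, slave)

holders = [n.name for n in (master, slave) if "vip" in n.resources]
print(new_master.name, new_master.resources, holders)
```

The point of the model is the invariant: at any moment at most one node holds the VIP, and the node holding the VIP is also the one running JIRA, so clients keep using the same address before and after the failover.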

For the database we have a MySQL replication setup using Multi-Master Replication Manager, which provides a virtual IP address to which JIRA can connect. This setup makes sure that exactly one MySQL server (the master) is available for reads and writes from JIRA, while the other server just replicates data from the master.
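The "exactly one writable server" rule can be illustrated with a small decision function. This is a simplified sketch and not how Multi-Master Replication Manager is implemented: the function name, the server names and the "keep a healthy writer where it is" policy are assumptions for illustration.

```python
def assign_writer(servers, current_writer):
    """Pick the server that should hold the writer address.

    servers: dict mapping server name -> True if healthy (hypothetical input).
    current_writer: name of the server holding the address now, or None.
    """
    # Assumed policy: never move the address away from a healthy writer.
    if current_writer is not None and servers.get(current_writer, False):
        return current_writer
    # Otherwise promote the first healthy server, in name order.
    for name in sorted(servers):
        if servers[name]:
            return name
    return None   # no healthy server: nobody gets the writer address


writer = assign_writer({"db1": True, "db2": True}, None)       # initial choice
writer = assign_writer({"db1": False, "db2": True}, writer)    # db1 fails
after_failure = writer
writer = assign_writer({"db1": True, "db2": True}, writer)     # db1 comes back
after_recovery = writer
print(after_failure, after_recovery)
```

Because the function returns a single name (or `None`), two servers can never be writable at once in this model. Keeping the address on `db2` after `db1` recovers avoids a second switch, at the cost of the roles no longer matching the original layout.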

This approach makes sure we always stay consistent, even in case of a hardware failure, with a downtime of only a few minutes. We ran this solution for a few months and the performance was pretty good, even with real-time synchronization of data.

In the next post, I will explain the configuration we used for GlusterFS and Heartbeat to make everything work, and also other solutions which can be tried.
