JIRA tickets heuristic estimates

Alex Shpurov
Aug 10, 2022

One of the most important tasks of a Software Development Manager is to make realistic estimates of JIRA ticket completion times. Many managers find this challenging, especially when a new project has just started and there are few or no tickets to use for reference. In this article, we will look at how a manager could use probability theory to help leadership teams make these estimates more accurate and insightful.

Some agile methodologies recommend sizing tickets so that each takes one day or less, which keeps the sprint board clean at the end of each day. Here we consider the case where a ticket can stay open longer than a day, because that is the most common situation in practice.

Before jumping into our JIRA problem, let's understand conditional probabilities and how to invert them. Let's say we want to estimate the probability of a person being a man if the person has long hair, P(M|L). Note that it is much easier to estimate the reverse: the probability of having long hair if the person is a man, P(L|M). The relation between these probabilities is known as Bayes' rule, the basis of Bayesian inference, and can be written as

P(M|L) = P(L|M) * P(M)/P(L).

P(L) is the probability of a person having long hair, regardless of gender. It is called the marginal probability (or evidence), and we can say 60% of all people have long hair.
P(M) is the probability of a person being a man. It is called the prior, and we can say it is roughly 50% across the population.
P(L|M) is the likelihood: we can say the probability of a person having long hair is 20% if the person is a man.

Then our final answer is P(M|L) = 0.20 * 0.50 / 0.60 ≈ 0.17, or 17%. This is the probability of a person being a man if the person has long hair.
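The arithmetic above can be sketched in a few lines of Python. The function name `bayes_posterior` and the variable names are illustrative choices, not part of any library; the three input values are the ones assumed in the hair example.

```python
def bayes_posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("evidence P(B) must be positive")
    return likelihood * prior / evidence


# Values assumed in the hair example above.
p_long_given_man = 0.20  # P(L|M)
p_man = 0.50             # P(M)
p_long = 0.60            # P(L)

p_man_given_long = bayes_posterior(p_long_given_man, p_man, p_long)
print(f"P(M|L) = {p_man_given_long:.2f}")  # prints P(M|L) = 0.17
```

The same helper works for the JIRA case by passing P(C|D), P(D), and P(C) in that order.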

Any JIRA ticket has at least two important parameters: the duration from the moment the ticket was opened until it was completed, and the ticket complexity. The complexity is measured in points, e.g. 1–5 (easy to difficult, representing the implementation effort), and is determined by the feature team of developers during planning sessions.

Now, assume that we are looking to estimate the probability of a JIRA ticket taking 2 days if the complexity level is 3 points.

Our answer is the probability of a JIRA item being completed in 2 days (D) if the complexity is 3 points (C): P(D|C). Personally, when I make estimates for the developers I manage, I find it much easier to work with the reverse: the probability of the complexity given the number of days, P(C|D). Applying Bayes' rule, we can compute our original result: P(D|C) = P(C|D) * P(D) / P(C)

For example, for a certain project, we have the following 7 observations:

[Figure: JIRA observations, a table of 7 tickets with their complexity and completed (days) columns]

Then, to find P(C|D), we need to filter the table to the rows where the “completed” column is equal to 2:

[Figure: the observations filtered to completed = 2, used for P(C|D)]

The complexity is a number representing the implementation effort, and it is categorical. So, to find the probability of the complexity being 3, we count the occurrences of “3” in the complexity column of the filtered table and divide by the number of filtered records. The filtered complexity values are [1,3], which contain one occurrence of 3, so P(C|D) = 1/2, where 2 is the array size.
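As a small sketch, the counting step for a categorical value looks like this in Python. The list holds the two complexity values the article reports for tickets completed in 2 days; the variable names are illustrative.

```python
# Complexity values of the tickets that were completed in 2 days.
complexities_when_two_days = [1, 3]

# P(C=3 | D=2): occurrences of 3 divided by the number of filtered records.
p_c3_given_d2 = complexities_when_two_days.count(3) / len(complexities_when_two_days)
print(p_c3_given_d2)  # prints 0.5
```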

Next, we can calculate the marginal probability, P(C), which is the share of “3” values in the complexity column across all observations. There are 3 such occurrences (they are highlighted in the JIRA observations picture) out of a total of 7 observations, which gives us P(C) = 3/7.

Finding the prior, P(D), is similar: take the number of occurrences of “2” in the completed (days) column and divide by the total number of observations, which gives P(D) = 2/7.

Then our final result, P[D=2|C=3] = (1/2) * (2/7) / (3/7) = 0.33

The final answer: the probability of the ticket being completed in 2 days if the complexity is 3 points is 33%. Our data is very thin, so the result may not be very accurate initially, but the predictions will improve as more observations accumulate.
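The whole calculation can be sketched end to end in Python. The observation table itself is not reproduced here, so the list below is an assumption: the first four rows follow the counts quoted in the text (two tickets done in 2 days with complexities 1 and 3, and three 3-point tickets taking 2, 4, and 5 days), while the last three rows are placeholders invented only to reach 7 observations. The function name is hypothetical as well.

```python
# (complexity, days) pairs.
observations = [
    (1, 2), (3, 2),          # the two tickets completed in 2 days
    (3, 4), (3, 5),          # the other two 3-point tickets
    (1, 1), (2, 3), (4, 6),  # PLACEHOLDER rows, not taken from the article
]


def posterior_days_given_complexity(obs, days, complexity):
    """P(D=days | C=complexity) via Bayes' rule on observed counts."""
    total = len(obs)
    complexities_for_days = [c for c, d in obs if d == days]
    n_complexity = sum(1 for c, _ in obs if c == complexity)
    if not complexities_for_days or n_complexity == 0:
        raise ValueError("no observations for this days/complexity value")
    likelihood = complexities_for_days.count(complexity) / len(complexities_for_days)  # P(C|D)
    prior = len(complexities_for_days) / total                                         # P(D)
    evidence = n_complexity / total                                                    # P(C)
    return likelihood * prior / evidence


p_bayes = posterior_days_given_complexity(observations, days=2, complexity=3)

# Sanity check: count directly among the 3-point tickets.
days_for_c3 = [d for c, d in observations if c == 3]
p_direct = days_for_c3.count(2) / len(days_for_c3)

print(f"{p_bayes:.2f} {p_direct:.2f}")  # prints 0.33 0.33
```

With a complete table, the Bayes' rule result and the direct count always agree. The decomposition into P(C|D), P(D), and P(C) becomes useful when some of those terms have to be guessed rather than counted.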

What if we have no observations?

If the project has not started yet and we have no observations, we can still estimate a ticket, because the probabilities can be estimated from judgment even without collected data. We need to look at the following:

  • Prior, P(D=2). Across all the tickets, what share do we expect to take two days to complete? Given that sprints are usually two weeks (10 working days), we can say that around 10% of all the tickets could be done in 2 days.
  • Marginalization, P(C=3). Similarly, across all the tickets, what share do we expect to have a complexity of three points? Given that the complexity ranges, for example, from 1–5, the majority of tickets will be simple and fewer will be hard, so we can guess the distribution [50%,20%,15%,10%,5%]. So for P(C=3), we have 15%.
  • Likelihood, P(C=3|D=2). This is the probability that a ticket completed in 2 days has a complexity of 3 points. We can draw on our past experience with other projects and say it may be around 65%.

Now we have all the components and we can calculate our final probability:

P[D=2|C=3] = 0.65 * 0.10 / 0.15 ≈ 0.43, or 43%
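A minimal Python sketch of this no-data estimate follows. All numbers are the guesses stated in the list above; the variable names are illustrative. The two checks are an addition of this sketch: guessed inputs can easily be inconsistent with each other, and a posterior above 1 is the quickest sign of that.

```python
# Guessed share of tickets at each complexity level (1 to 5 points).
complexity_guess = {1: 0.50, 2: 0.20, 3: 0.15, 4: 0.10, 5: 0.05}
assert abs(sum(complexity_guess.values()) - 1.0) < 1e-9  # shares must sum to 100%

prior_d2 = 0.10                # guessed P(D=2)
likelihood_c3_given_d2 = 0.65  # guessed P(C=3|D=2)
evidence_c3 = complexity_guess[3]  # guessed P(C=3)

posterior = likelihood_c3_given_d2 * prior_d2 / evidence_c3

# If the guesses contradict each other, the result can exceed 1.
assert 0.0 <= posterior <= 1.0, "guesses are mutually inconsistent"

print(f"P(D=2|C=3) = {posterior:.0%}")  # prints P(D=2|C=3) = 43%
```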

Poisson distribution

The Poisson distribution could also be used to estimate the completion time for a ticket, because it is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time. To keep it simple, we can say this is a distribution that estimates how likely it is that something will happen “X” number of times; in our case X is D, which is equal to 2.

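In the JIRA observations picture, we can see that for C=3 we have D=[2,4,5], which gives us a mean of about 3.7. Now we can use one of the online Poisson distribution calculators with 3.7 for the mean (M) and 2 for the random variable (X). The result is 0.29, which is the cumulative probability P(X ≤ 2), i.e. the chance of the ticket being completed within 2 days. The probability of exactly 2 days, P(X = 2), is about 0.17.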
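The calculator step can be reproduced with the Python standard library alone. This sketch uses the standard Poisson formula P(X = k) = e^(-mean) * mean^k / k!; the function names are illustrative. It prints both the probability of exactly X and the cumulative probability of at most X, since online calculators typically report both.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P(X = k) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k): sum of the probabilities for 0..k."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))


days_for_c3 = [2, 4, 5]
mean_days = sum(days_for_c3) / len(days_for_c3)
print(round(mean_days, 1))              # prints 3.7

print(round(poisson_pmf(2, 3.7), 2))    # prints 0.17 (exactly 2 days)
print(round(poisson_cdf(2, 3.7), 2))    # prints 0.29 (2 days or fewer)
```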

As we can see, the three methods give results in a similar range (0.33, 0.43, and 0.29), and they require very little data, or even no observations at all, to get started.

Over time, we should receive more and more observations, which makes our estimates more accurate. Another advantage of using probabilities is that they let us turn our initial intuition into educated and explainable guesses.

In the end, it is worth saying that machine learning and AI methods could be used here too. However, they give the best results with very large datasets, which limits their usefulness when we have little data.
