Report average velocity and fail 50% of the time

The question of “expected velocity” and long-term planning has come up at more than one client. A recent client conversation got me thinking, however, questioning how to interpret velocity when estimating and plotting a roadmap based on a current backlog of features. Assume, for a moment, a backlog of story-pointed features, and 10 good iterations (consistent team, no odd occurrences that would affect velocity). Mathematically average velocity (well, a mean really) is a 50/50 proposition for any subsequent iteration. Some organizations don’t find this level of confidence acceptable. What velocity should be reported as expected for iteration/sprint planning and roadmap forecasting, and how should it be used?

Context

Interpreting velocity, before anything else, requires some context. An agile organization that sees estimates as hypothetical might find this article is of less use. In fact, a good question is whether estimation is even a value-added activity. For this post assume an organization that sees strong value in estimation and planning.

Culture

The biggest piece of context is to know the organizational culture. This is important in two respects, and both of these cultural factors are important because they impact how Velocity is understood within the organization.

What is Failure?

First is the meaning of failure in the organization. Is failure to deliver what was committed to by the planned date considered a failure of the team, or is it simply a fact to be understood and accounted for in future planning? Even in Agile organizations, the former is often true and a hard habit to break. If not delivering to expectations is considered failure and has negative consequences, then that means that estimation is being treated not as estimation, but as prediction and contract. Velocity is therefore a commitment, and should therefore be used conservatively.

Consistency or Speed?

The second item to know is whether consistency and predictability of delivery is of a higher strategic value than the actual rate of delivery. This is often un-stated. Usually people want fast and consistent delivery. The truth is that you can get consistent, or fast software development, or a balance between the two. Lack of trust is usually a strong motivation to encourage consistency over speed, or a history of quality problems, etc. In this case, as well, Velocity is more of a boundary than an indicator.

Emotional Loading in Estimation (or why not Low-ball?)

If estimation is seen as binding, contractual, or limiting, then additional emotions get overloaded. Trust, promise, and betrayal are words used in such organizational cultures. Distrust is usually a strong factor, especially between silos (business vs. technology, company vs. project management vs. customer, etc.). So when people are asked to give estimates, even using agile-friendly mechanisms such as story points, there is usually a process of cementing that estimate into a part of an accountability model, so estimates start to get conservative. People are then accused of low-balling, others are accused of irrational expectations… we’ve all seen this. The language clearly becomes one of contention and blame. Even the term low-balling is often an outright pejorative term for estimating too conservatively.

This doesn’t happen only in agile environments, and project managers in traditional PMBOK frameworks have long factored risk into “contingency budgets”. Interestingly, however, if a Project Manager were to factor risk into the task estimates, they’d be “low-balling capacity,” yet if they were to factor it out and layer it on top of the project work, it’s “contingency budgeting” (At least in a few experiences I’ve had). Either way, someone’s adding a factor for uncertainty, based on the need to predict conservatively or liberally or somewhere in between.

That’s the point of the article: how can Agile projects use velocity to estimate as conservatively (or liberally) as is appropriate?

An average is a 50% chance to succeed (or fail)

Velocity is not a constant. It’s a set of instantaneous values on a curve, with instances being iterations. That means that it varies, and is therefore only meaningful statistically. So how do you reasonably use velocity statistically, and improve confidence? One way is to stop delivering against “average” velocity.

A lot of coaches use average velocity over the previous N iterations. This is not helpful for all sorts of reasons, if estimation is a commitment. By definition, average (well, actually a mean, but they’re close) is a 50/50 proposition. If you report the average team velocity (assuming it’s accurate), then about half the time the team will be under and about half the time the team will be over, statistically. So basically an average is a crap shoot, when taken in any given instance. It’s can only be good in the long run. For this to work, the long-haul has to include permission to fail and a lot of trust. Teams need to be able to go miss dates but will sometimes exceed dates and it should all wash out in the end. In organizations such as I’m describing, that trust isn’t there, so. Additionally, if the language of commitment is around meeting instantaneous iteration commitments (as opposed to delivering high-quality customer value as quickly as is sustain-ably possible) then you aren’t playing the long-game, you’re playing a very short-game.

Simulate Velocity, not work

In a PMI training course I took when I was at Sun Microsystems, we were nicely informed that two point estimates of tasks are a perfect way to fail half the time, per the above logic. One point estimates are just idiotic. Three point estimates were better. We simulated with a monte-carlo algorithm and found a curve and a distribution, and then determined a confidence level yadda yadda. Well, we’re trying to avoid wasting a lot of time estimating up-front, but one way to start representing velocity properly is to do the same kind of statistical modelling done in traditional product management, only simulate velocity, not work items.

In this approach, you take the last N iterations (say 10). Determine the maximum velocity (optimistic) and the minimum velocity (pessimistic), and then the mode (the velocity value that seems to occur most frequently). Then you do monte-carlo simulation so you get a statistical pattern. Now, you actually can determine an answer based on confidence. If you want to be right with an 80% confidence, you pick a velocity where 80% of the simulated runs were successful. (Note – there are a paucity of excel templates to do this math automatically, and often they are for sale. It would be nice to have a few functions with arbitrary distributions based on min-max-mode to help this along.)

It’s not perfect, and it’s a potentially huge amount of administrative overhead. Elsewhere I’ve referenced blogs that entirely oppose any estimation at all, but if you are gong to, then working statistically with simulation is the only way to take small sample numbers meaningful.

Commitment Velocity: Low-Ball as a policy.

Another approach, one perhaps controversial, but taught by some Scrum trainers is to pick the lowest historical delivered velocity. This is a commitment-based approach, on the assumption that building trust around consistent delivery is critical to building sound relationships where product owners and teams can safely state their needs and get things done with a minimum of contractual behaviour. By taking the minimum, you force a low-ball capacity, which means you can have high-confidence of success after a few iterations. You have, likely, after a while, some spare time on your hands. Teams can then choose to pull more work in (without adjusting their commitment velocity), work on “technical debt”, improve their skills, etc. A team could raise their commitment velocity in certain inflection points in the project. A new team member is added that provides a necessary skill not previously available, and after a few iterations the team is consistently hitting a higher number, but this is a careful process to ensure that they are committing, and if they don’t make their new number, it goes down to what they got accomplished.

Indemnify teams’ learning

An arguably healthier option, if you have built enough trust, is to simply indemnify a team from failing to meet the estimate. Since you’re doing mathematics on actuals to generate an expected future number, everyone can acknowledge that past behaviour is no guarantee of future behaviour, and simply use it for capacity planning. In this case, estimation is actually estimation, not commitment or contract. The team is expected to be ahead sometimes, and behind sometimes. The upside of this is that a lot of extra time isn’t spent playing with fictional numbers. Teams are spending their efforts on delivery as quickly-yet-sustain-ably as they can, and the organization treats them as trusted professionals in this. The temptation to assume you can predict the future is seen as folly, and the estimates are used to guide overall direction, not to make outward customer commitments.

Don’t be mindless

There may be other approaches, I’m sure. The agile community is certainly not short of people who love this topic and can talk for hours on “proper” estimation. The point of this post is merely to point out some options, and ask you to look at your organizational culture, team culture, customer culture, the meaning of terms like commitment, failure, success, consistency, speed, etc. As you understand the culture, balance consistency vs. speed, trust, and other factors to choose a method of estimation that meets your goals. Don’t do estimation based on your own, internal cultural assumptions, as you may have developed or been taught techniques that are useful when and where they were taught, but may no longer be so. Or maybe they weren’t so useful then either. Regardless, this because estimation cuts at the heart of the dialogue between producer and consumer, and establishes parameters for that discussion, it’s critical that you think your choice through.

[Christian also blogs at http://www.geekinasuit.com/]


Affiliated Promotions:

Try our automated online Scrum coach: Scrum Insight - free scores and basic advice, upgrade to get in-depth insight for your team. It takes between 8 and 11 minutes for each team member to fill in the survey, and your results are available immediately. Try it in your next retrospective.

Please share!
Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail

3 thoughts on “Report average velocity and fail 50% of the time”

  1. This is a great blog entry and certainly makes you think about the intricacies of velocity and how you use it.

    I however tend to rather focus on the practical aspects of velocity and what it’s there for.

    In my opinion, velocity (whether you’re using average velocity, last sprints velocity (whatever), it’s really to be used as a guideline for determining how much work you can get done in an iteration (and therefore extended to the release).

    It’s generally accepted in Scrum circles that some sprints you’re going to get more done and some less. Bottom line is, the fact that you’re iterating means you’re getting some stuff done more frequently and most likely more on point than doing it any other way and this builds trust in the organization.

    I prefer not to get caught up in the theoretical aspects on this topic.

    Scrum more so than most is a practical approach to software development. So i prefer to be practical in how we use things like velocity

    My 2 cents
    Jack
    http://www.agilebuddy.com

  2. Fair enough, Jack. And personally, teams I’m working on in trusting environments usually don’t need to consider such things very much (which is why I left option 3: indemnify teams’ learning). But a lot of my clients are implementing Agile within a very non agile company, and “we’re getting some stuff done each iteration” is no comfort, as they haven’t bought into the benefits. They see fast-moving, (seemingly) uncontrolled teams who had a history of not delivering, so they’re nervous. In these environments (which are all too common in my own consulting experience) you need to carefully craft metrics to purpose, in order to engineer cultural changes.

    So please don’t worry about it if you can afford not to! And kudo’s to your organization if they can have this attitude. πŸ™‚

  3. Excellent post, Christian, and one that brought back so many memories for me! I had an especially strong reaction to:

    “In this case, estimation is actually estimation, not commitment or contract. The team is expected to be ahead sometimes, and behind sometimes. The upside of this is that a lot of extra time isn’t spent playing with fictional numbers. Teams are spending their efforts on delivery as quickly-yet-sustain-ably as they can, and the organization treats them as trusted professionals in this. The temptation to assume you can predict the future is seen as folly, and the estimates are used to guide overall direction, not to make outward customer commitments.”

    because you’ve hit the nail squarely on its head right there. Failure to understand those few sentences, as we’ve both experienced in our travels, can actually make the difference between succeeding and failing at your attempt to “go Agile.”

    Posts like this are invaluable…. in fact, you should write a book! πŸ˜‰

Leave a Reply

Your email address will not be published. Required fields are marked *