The question of “expected velocity” and long-term planning has come up at more than one client. A recent client conversation got me thinking, however, questioning how to interpret velocity when estimating and plotting a roadmap based on a current backlog of features. Assume, for a moment, a backlog of story-pointed features, and 10 good iterations (consistent team, no odd occurrences that would affect velocity). Mathematically average velocity (well, a mean really) is a 50/50 proposition for any subsequent iteration. Some organizations don’t find this level of confidence acceptable. What velocity should be reported as expected for iteration/sprint planning and roadmap forecasting, and how should it be used?
Interpreting velocity, before anything else, requires some context. An agile organization that sees estimates as hypothetical might find this article is of less use. In fact, a good question is whether estimation is even a value-added activity. For this post assume an organization that sees strong value in estimation and planning.
The biggest piece of context is to know the organizational culture. This is important in two respects, and both of these cultural factors are important because they impact how Velocity is understood within the organization.
What is Failure?
First is the meaning of failure in the organization. Is failure to deliver what was committed to by the planned date considered a failure of the team, or is it simply a fact to be understood and accounted for in future planning? Even in Agile organizations, the former is often true and a hard habit to break. If not delivering to expectations is considered failure and has negative consequences, then that means that estimation is being treated not as estimation, but as prediction and contract. Velocity is therefore a commitment, and should therefore be used conservatively.
Consistency or Speed?
The second item to know is whether consistency and predictability of delivery is of a higher strategic value than the actual rate of delivery. This is often un-stated. Usually people want fast and consistent delivery. The truth is that you can get consistent, or fast software development, or a balance between the two. Lack of trust is usually a strong motivation to encourage consistency over speed, or a history of quality problems, etc. In this case, as well, Velocity is more of a boundary than an indicator.
Emotional Loading in Estimation (or why not Low-ball?)
If estimation is seen as binding, contractual, or limiting, then additional emotions get overloaded. Trust, promise, and betrayal are words used in such organizational cultures. Distrust is usually a strong factor, especially between silos (business vs. technology, company vs. project management vs. customer, etc.). So when people are asked to give estimates, even using agile-friendly mechanisms such as story points, there is usually a process of cementing that estimate into a part of an accountability model, so estimates start to get conservative. People are then accused of low-balling, others are accused of irrational expectations… we’ve all seen this. The language clearly becomes one of contention and blame. Even the term low-balling is often an outright pejorative term for estimating too conservatively.
This doesn’t happen only in agile environments, and project managers in traditional PMBOK frameworks have long factored risk into “contingency budgets”. Interestingly, however, if a Project Manager were to factor risk into the task estimates, they’d be “low-balling capacity,” yet if they were to factor it out and layer it on top of the project work, it’s “contingency budgeting” (At least in a few experiences I’ve had). Either way, someone’s adding a factor for uncertainty, based on the need to predict conservatively or liberally or somewhere in between.
That’s the point of the article: how can Agile projects use velocity to estimate as conservatively (or liberally) as is appropriate?
An average is a 50% chance to succeed (or fail)
Velocity is not a constant. It’s a set of instantaneous values on a curve, with instances being iterations. That means that it varies, and is therefore only meaningful statistically. So how do you reasonably use velocity statistically, and improve confidence? One way is to stop delivering against “average” velocity.
A lot of coaches use average velocity over the previous N iterations. This is not helpful for all sorts of reasons, if estimation is a commitment. By definition, average (well, actually a mean, but they’re close) is a 50/50 proposition. If you report the average team velocity (assuming it’s accurate), then about half the time the team will be under and about half the time the team will be over, statistically. So basically an average is a crap shoot, when taken in any given instance. It’s can only be good in the long run. For this to work, the long-haul has to include permission to fail and a lot of trust. Teams need to be able to go miss dates but will sometimes exceed dates and it should all wash out in the end. In organizations such as I’m describing, that trust isn’t there, so. Additionally, if the language of commitment is around meeting instantaneous iteration commitments (as opposed to delivering high-quality customer value as quickly as is sustain-ably possible) then you aren’t playing the long-game, you’re playing a very short-game.
Simulate Velocity, not work
In a PMI training course I took when I was at Sun Microsystems, we were nicely informed that two point estimates of tasks are a perfect way to fail half the time, per the above logic. One point estimates are just idiotic. Three point estimates were better. We simulated with a monte-carlo algorithm and found a curve and a distribution, and then determined a confidence level yadda yadda. Well, we’re trying to avoid wasting a lot of time estimating up-front, but one way to start representing velocity properly is to do the same kind of statistical modelling done in traditional product management, only simulate velocity, not work items.
In this approach, you take the last N iterations (say 10). Determine the maximum velocity (optimistic) and the minimum velocity (pessimistic), and then the mode (the velocity value that seems to occur most frequently). Then you do monte-carlo simulation so you get a statistical pattern. Now, you actually can determine an answer based on confidence. If you want to be right with an 80% confidence, you pick a velocity where 80% of the simulated runs were successful. (Note – there are a paucity of excel templates to do this math automatically, and often they are for sale. It would be nice to have a few functions with arbitrary distributions based on min-max-mode to help this along.)
It’s not perfect, and it’s a potentially huge amount of administrative overhead. Elsewhere I’ve referenced blogs that entirely oppose any estimation at all, but if you are gong to, then working statistically with simulation is the only way to take small sample numbers meaningful.
Commitment Velocity: Low-Ball as a policy.
Another approach, one perhaps controversial, but taught by some Scrum trainers is to pick the lowest historical delivered velocity. This is a commitment-based approach, on the assumption that building trust around consistent delivery is critical to building sound relationships where product owners and teams can safely state their needs and get things done with a minimum of contractual behaviour. By taking the minimum, you force a low-ball capacity, which means you can have high-confidence of success after a few iterations. You have, likely, after a while, some spare time on your hands. Teams can then choose to pull more work in (without adjusting their commitment velocity), work on “technical debt”, improve their skills, etc. A team could raise their commitment velocity in certain inflection points in the project. A new team member is added that provides a necessary skill not previously available, and after a few iterations the team is consistently hitting a higher number, but this is a careful process to ensure that they are committing, and if they don’t make their new number, it goes down to what they got accomplished.
Indemnify teams’ learning
An arguably healthier option, if you have built enough trust, is to simply indemnify a team from failing to meet the estimate. Since you’re doing mathematics on actuals to generate an expected future number, everyone can acknowledge that past behaviour is no guarantee of future behaviour, and simply use it for capacity planning. In this case, estimation is actually estimation, not commitment or contract. The team is expected to be ahead sometimes, and behind sometimes. The upside of this is that a lot of extra time isn’t spent playing with fictional numbers. Teams are spending their efforts on delivery as quickly-yet-sustain-ably as they can, and the organization treats them as trusted professionals in this. The temptation to assume you can predict the future is seen as folly, and the estimates are used to guide overall direction, not to make outward customer commitments.
Don’t be mindless
There may be other approaches, I’m sure. The agile community is certainly not short of people who love this topic and can talk for hours on “proper” estimation. The point of this post is merely to point out some options, and ask you to look at your organizational culture, team culture, customer culture, the meaning of terms like commitment, failure, success, consistency, speed, etc. As you understand the culture, balance consistency vs. speed, trust, and other factors to choose a method of estimation that meets your goals. Don’t do estimation based on your own, internal cultural assumptions, as you may have developed or been taught techniques that are useful when and where they were taught, but may no longer be so. Or maybe they weren’t so useful then either. Regardless, this because estimation cuts at the heart of the dialogue between producer and consumer, and establishes parameters for that discussion, it’s critical that you think your choice through.
[Christian also blogs at http://www.geekinasuit.com/]