Tag Archives: data warehousing

Scrum Data Warehouse Project

May people have concerns about the possibility of using Scrum or other Agile methods on large projects that don’t directly involve software development.  Data warehousing projects are commonly brought up as examples where, just maybe, Scrum wouldn’t work.
I have worked as a coach on a couple of such projects.  Here is a brief description of how it worked (both the good and the bad) on one such project:
The project was a data warehouse migration from Oracle to Teradata.  The organization had about 30 people allocated to the project.  Before adopting Scrum, they had done a bunch of up-front analysis work.  This analysis work resulted in a dependency map among approximately 25,000 tables, views and ETL scripts.  The dependency map was stored in an MS Access DB (!).  When I arrived as the coach, there was an expectation that the work would be done according to dependencies and that the “team” would just follow that sequence.
I learned about this all in the first week as I was doing boot-camp style training on Scrum and Agile with the team and helping them to prepare for their first Sprint.
I decided to challenge the assumption about working based on dependencies.  I spoke with the Product Owner about the possible ways to order the work based on value.  We spoke about a few factors including:
  • retiring Oracle data warehouse licenses / servers,
  • retiring disk space / hardware,
  • and saving CPU time with new hardware
The Product Owner started to work on getting metrics for these three factors.  He was able to find that the data was available through some instrumentation that could be implemented quickly so we did this.  It took about a week to get initial data from the instrumentation.
In the meantime, the Scrum teams (4 of them) started their Sprints working on the basis of the dependency analysis.  I “fought” with them to address the technical challenges of allowing the Product Owner to work on the migration in order based more on value – to break the dependencies with a technical solution.  We discussed the underlying technologies for the ETL which included bash scripts, AbInitio and a few other technologies.  We also worked on problems related to deploying every Sprint including getting approval from the organization’s architectural review board on a Sprint-by-Sprint basis.  We also had the teams moved a few times until an ideal team workspace was found.
After the Product Owner found the data, we sorted (ordered) the MS Access DB by business value.  This involved a fairly simple calculation based primarily on disk space and CPU time associated with each item in the DB.  This database of 25000 items became the Product Backlog.  I started to insist to the teams that they work based on this order, but there was extreme resistance from the technical leads.  This led to a few weeks of arguing around whiteboards about the underlying data warehouse ETL technology.  Fundamentally, I wanted to the teams to treat the data warehouse tables as the PBIs and have both Oracle and Teradata running simultaneously (in production) with updates every Sprint for migrating data between the two platforms.  The Technical team kept insisting this was impossible.  I didn’t believe them.  Frankly, I rarely believe a technical team when they claim “technical dependencies” as a reason for doing things in a particular order.
Finally, after a total of 4 Sprints of 3 weeks each, we finally had a breakthrough.  In a one-on-one meeting, the most senior tech lead admitted to me that what I was arguing was actually possible, but that the technical people didn’t want to do it that way because it would require them to touch many of the ETL scripts multiple times – they wanted to avoid re-work.  I was (internally) furious due to the wasted time, but I controlled my feelings and asked if it would be okay if I brought the Product Owner into the discussion.  The tech lead allowed it and we had the conversation again with the PO present.  The tech lead admitted that breaking the dependencies was possible and explained how it could lead to the teams touching ETL scripts more than once.  The PO basically said: “awesome!  Next Sprint we’re doing tables ordered by business value.”
A couple Sprints later, the first of 5 Oracle licenses was retired, and the 2-year $20M project was a success, with nearly every Sprint going into production and with Oracle and Teradata running simultaneously until the last Oracle license was retired.  Although I don’t remember the financial details anymore, the savings were huge due to the early delivery of value.  The apprentice coach there went on to become a well-known coach at this organization and still is a huge Agile advocate 10 years later!

Affiliated Promotions:

Register for a Scrum, Kanban and Agile training sessions for your, your team or your organization -- All Virtual! Satisfaction Guaranteed!

Please share!
Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail
Berteig
Upcoming Courses
View Full Course Schedule
Scrum Master Bootcamp with CSM® (Certified Scrum Master®) [Virtual Learning] (SMBC)
Online
C$1895.00
Jun 7
2023
Details
Real Agility™ Team Performance Coaching with BERTEIG (COACHING-TPC)
Online
C$750.00
Jun 9
2023
Details
Real Agility™ Real Agility™ Ask Me Anything / Coaching
Online
C$750.00
Jun 9
2023
Details
Win as a Manager with Your New AI Coach: Max Good (WEBINAR)
Online
C$0.00
Jun 9
2023
Details
Kanban System Design® (KMPI)
Online
C$1895.00
Jun 13
2023
Details
Product Owner Bootcamp with CSPO® (Certified Scrum Product Owner®) [Virtual Learning] (POBC)
Online
C$1270.75
Jun 14
2023
Details
Real Agility™ Team Performance Coaching with BERTEIG (COACHING-TPC)
Online
C$750.00
Jun 16
2023
Details
Real Agility™ Real Agility™ Ask Me Anything / Coaching
Online
C$750.00
Jun 16
2023
Details
Kanban for Scrum Masters (ML-KSM)
Online
C$495.00
Jun 20
2023
Details
Kanban for Product Owners (ML-KPO)
Online
C$495.00
Jun 21
2023
Details
Team Kanban Practitioner® (TKP)
Online
C$1015.75
Jun 21
2023
Details
Real Agility™ Team Performance Coaching with BERTEIG (COACHING-TPC)
Online
C$750.00
Jun 23
2023
Details
Real Agility™ Real Agility™ Ask Me Anything / Coaching
Online
C$750.00
Jun 23
2023
Details
Real Agility™ Team Performance Coaching with BERTEIG (COACHING-TPC)
Online
C$750.00
Jun 30
2023
Details
Real Agility™ Real Agility™ Ask Me Anything / Coaching
Online
C$750.00
Jun 30
2023
Details
Scrum Master Bootcamp with CSM® (Certified Scrum Master®) [Virtual Learning] (SMBC)
Online
C$1270.75
Jul 5
2023
Details
Kanban Systems Improvement® (KMPII)
Online
C$1610.75
Jul 11
2023
Details
Product Owner Bootcamp with CSPO® (Certified Scrum Product Owner®) [Virtual Learning] (POBC)
Online
C$1270.75
Jul 12
2023
Details
Team Kanban Practitioner® (TKP)
Online
C$1015.75
Jul 19
2023
Details
Advanced Certified ScrumMaster® (ACSM)
Online
C$845.75
Jul 19
2023
Details
Scrum Master Bootcamp with CSM® (Certified Scrum Master®) [Virtual] (SMBC)
Online
C$1270.75
Aug 2
2023
Details
Team Kanban Practitioner® (TKP)
Online
C$1015.75
Aug 15
2023
Details
Product Owner Bootcamp with CSPO® (Certified Scrum Product Owner®) [Virtual Learning] (POBC)
Online
C$1270.75
Aug 16
2023
Details
Scrum Master Bootcamp with CSM® (Certified Scrum Master®) [Virtual] (SMBC)
Online
C$1270.75
Sep 19
2023
Details
Product Owner Bootcamp with CSPO® (Certified Scrum Product Owner®) [Virtual Learning] (POBC)
Online
C$1270.75
Sep 26
2023
Details