From the CEO of Tasktop Technologies

Mik Kersten

Blog Post

Sparking True Software-Based Business Innovation

The software industry is starting to go through a transformation in our understanding of the use of metrics

I've long taken inspiration from Peter Drucker's caution that "if you can't measure it, you can't manage it." It's no wonder that leaders of business disciplines such as HR, finance, marketing and sales consistently use metrics, reports and dashboards to assess their business and aid in decision-making.

The software industry is starting to go through a similarly profound transformation in how it uses metrics to understand the business implications of how software is built. But before this transformation can succeed, we have to work through some pretty big shortcomings in how we determine the types of insights we wish to receive, how we select the analytics necessary to receive those insights and how we capture and manipulate the data to present those analytics.

Exactly 40 years have passed since Frederick Brooks cautioned in The Mythical Man-Month that measuring software in terms of man-months was a bad idea. Building software is a complex undertaking with intricate interactions and nuances. This complexity makes the notion of measuring its progress simply in terms of hours of human effort plainly ridiculous. Almost everyone in the software industry who has read that book agrees with its premise. Yet so many in the software industry are still measuring software delivery in terms of man-months, FTEs and equivalent cost metrics - and they are as misleading as Brooks predicted.

But if we in the software industry know these metrics are a poor substitute for the "correct" set of management data, why do we continue to use them? There are several factors at play, many of which are underscored by an all-too-common analytical approach: Often, when starting any kind of data reporting project, we start by looking at the available data and figuring out what to do with it. We look at this data to see if we can manipulate it in some way (looking at averages, plotting out trends, etc.) to derive interesting information.

Yet this is precisely the wrong way to go about setting up a metrics program. The more effective way to find actual insights is to first understand what is important to the business and only then determine how to create the analytics required to illuminate those insights. The final step is to figure out how to get your hands on the data.

For example, if all we have in front of us is data about our defects, it should be fairly straightforward to determine things like: How quickly are they being fixed when categorized by severity? Has the velocity of our bug fixing changed over time?
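As a rough illustration of how little it takes to answer those two questions, here is a minimal Python sketch. The record layout, the sample defects and the function names are all assumptions made for this example; they do not come from any particular defect tracker.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical defect records: (severity, date opened, date fixed).
# Illustrative data only, not taken from a real project.
DEFECTS = [
    ("critical", date(2015, 1, 5), date(2015, 1, 7)),
    ("critical", date(2015, 2, 2), date(2015, 2, 6)),
    ("minor", date(2015, 1, 10), date(2015, 1, 30)),
    ("minor", date(2015, 2, 1), date(2015, 2, 11)),
]


def mean_days_to_fix(defects):
    """Return the mean number of days from open to fix, keyed by severity."""
    days = defaultdict(list)
    for severity, opened, fixed in defects:
        days[severity].append((fixed - opened).days)
    return {severity: mean(values) for severity, values in days.items()}


def fixes_per_month(defects):
    """Count fixed defects per (year, month) as a rough view of fix velocity."""
    counts = defaultdict(int)
    for _severity, _opened, fixed in defects:
        counts[(fixed.year, fixed.month)] += 1
    return dict(counts)


print(mean_days_to_fix(DEFECTS))
print(fixes_per_month(DEFECTS))
```

Both functions need nothing beyond the defect data itself, which is exactly why reports like these are easy to produce and why they rarely say much about the business.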

This information is certainly worthwhile, but is it "insightful" information?

Some of the most interesting insights transcend the individual disciplines within software development. For example: What types of features result in the greatest number of customer complaints? What parts of the codebase result in the greatest number of defects? What enhancements provided the greatest increase in revenue?

To create these analytics, an organization has to create a business intelligence (BI) infrastructure for software development. This infrastructure pulls data from various tools used by the extended software development and delivery team, manipulates that data, and renders it in a report or data visualization that allows managers to easily understand what they are seeing.
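To make the cross-tool idea concrete, the sketch below answers one of the questions above ("What parts of the codebase result in the greatest number of defects?") by joining data from two sources. It assumes a hypothetical issue tracker export of defect IDs and a hypothetical source control export that links each commit to an issue ID and a component; the names and sample data are invented for illustration.

```python
from collections import Counter

# Hypothetical export from an issue tracker: the IDs that are defects.
DEFECT_IDS = {"BUG-1", "BUG-2", "BUG-3"}

# Hypothetical export from source control:
# (issue ID referenced by the commit, component the commit touched).
COMMITS = [
    ("BUG-1", "billing"),
    ("BUG-2", "billing"),
    ("FEAT-9", "search"),
    ("BUG-3", "search"),
    ("BUG-2", "auth"),
]


def defects_by_component(defect_ids, commits):
    """Rank components by the number of distinct defects fixed in them."""
    seen = set()
    counts = Counter()
    for issue_id, component in commits:
        # Skip non-defect work and avoid counting a defect twice per component.
        if issue_id in defect_ids and (issue_id, component) not in seen:
            seen.add((issue_id, component))
            counts[component] += 1
    return counts.most_common()


for component, total in defects_by_component(DEFECT_IDS, COMMITS):
    print(component, total)
```

Neither data source can answer the question on its own; the answer only appears once records from both tools share a common key.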

Over the past year I've had the benefit of meeting face-to-face with IT leaders in over 100 different large organizations and having them take me through how they're approaching the problem of getting their hands on the relevant information. The consistent theme is that to establish meaningful measurement for software delivery, we need the following layers of tools, where each layer is supported by the layer below it:

1) Presentation layer

  • Report generation
  • Interactive visualization
  • Dashboards and wallboards
  • Predictive analytics

2) Metrics layer

  • Business value of software delivery
  • Efficiency of software delivery

3) Data storage layer

  • Historical cross-tool data store
  • Data warehouse or data mart

4) Integration infrastructure layer

  • Normalized stream of data and schema updates from each tool

The presentation layer (1) is not the problem. This is a mature space filled with innovative new offerings, as well as the myriad of hardened enterprise BI tools. What these generic tools lack is any domain understanding of software delivery. This is where the need for innovation on the metrics layer (2) comes in.

Efforts in establishing software delivery metrics have been around as long as software itself, but given the vendor activity around them and the advances being made on lifecycle automation and DevOps, I predict that we are about to go through an important round of innovation on this front. Combining software delivery metrics with business value metrics is an even bigger opportunity, and one where the industry has barely scratched the surface.

Some forward-thinking IT leaders are already correlating business metrics, such as sales and marketing data, with some basic software delivery measures. While a lot of innovation is left on this front, it's clear that the way that the data is manifested in the storage layer (3) must support both the business and the software delivery metrics.

Thanks to the huge investment that vendors and VCs are making in big data technology, the data storage layer (3) has a breadth of great commercial and open source options to choose from. Of course, the one that's most appropriate depends on the existing data warehouse/mart investment that's in place, as well as the kind of metrics that the organization is after. For example, efficiency trend metrics can lend themselves best to a time-series-based MongoDB store, while a traditional relational database can suffice for compliance reports.

Many of the leaders I've spoken with have already attempted to create end-to-end software lifecycle analytics, and they found that one of the biggest impediments to creating meaningful lifecycle metrics is the ability to "simply" get the data from the various tools being used across the entire software development and delivery lifecycle, and put it in the data storage layer (3) for manipulation by the layers above.

This is the work of the integration infrastructure layer (4). In the past, this may have been achieved by ETL processes, with data being extracted directly from the database schemas of the individual tools. That approach has not only proven brittle, but also provides data only at batch processing intervals, not in real time. A more robust approach is to access the data through each tool's API. Unfortunately, each vendor has its own massive API set and highly customizable schemas and process models; standards efforts, while important, are years away from sufficiently broad adoption.

This integration infrastructure layer has to provide the ability to connect to the various tools' data repositories, detect when changes are being made (to provide real-time data) and update the data storage layer. The data storage layer must reflect a unified view of all the artifacts across the SDLC, independent of the tools that originally created the data.
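The responsibilities just described (connect, detect changes, update a unified store) can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the two source tools, their field names, the unified schema and the in-memory `ArtifactStore` are all hypothetical, and a real integration would read from each tool's API rather than from hand-written dictionaries.

```python
import hashlib
import json


def normalize_tracker(raw):
    """Map a record from a hypothetical issue tracker to the unified schema."""
    return {"id": f"tracker:{raw['key']}", "type": "defect",
            "title": raw["summary"], "status": raw["state"].lower()}


def normalize_alm(raw):
    """Map a record from a hypothetical ALM tool to the same unified schema."""
    return {"id": f"alm:{raw['ID']}", "type": "requirement",
            "title": raw["Name"], "status": raw["Phase"].lower()}


def fingerprint(artifact):
    """Stable hash of an artifact, used to detect changes between polls."""
    payload = json.dumps(artifact, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ArtifactStore:
    """In-memory stand-in for the data storage layer; keeps every version."""

    def __init__(self):
        self.history = {}   # artifact id -> list of versions, oldest first
        self._hashes = {}   # artifact id -> fingerprint of the latest version

    def upsert(self, artifact):
        """Append a new version only when the artifact actually changed."""
        digest = fingerprint(artifact)
        if self._hashes.get(artifact["id"]) == digest:
            return False
        self._hashes[artifact["id"]] = digest
        self.history.setdefault(artifact["id"], []).append(artifact)
        return True


store = ArtifactStore()
store.upsert(normalize_tracker({"key": "BUG-1", "summary": "Crash", "state": "Open"}))
store.upsert(normalize_alm({"ID": 42, "Name": "Export to PDF", "Phase": "Planned"}))
```

Keeping every version, rather than overwriting, is what makes the store historical: trend metrics need to know what an artifact looked like last month, not just what it looks like now.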

With this infrastructure in place, IT leaders will have access to a new set of data. They will have moved from a very narrow set of data that is captured, manipulated and reported on within the confines of individual tools, to data that is not only unified across the entire SDLC, but that can also span business units outside of software development.

When supplied with this kind of rich data set, the industry can bring together business leaders, software delivery thought-leaders and data scientists to finally go beyond the tactical software development reports of yesterday and move toward the cross-discipline business insights that are necessary to spark true software-based business innovation.

Dr. Kersten is the CEO of Tasktop Technologies, creator and leader of the Eclipse Mylyn open source project, and inventor of the task-focused interface. His goal is to create the collaborative infrastructure to connect knowledge workers in the new world of software delivery. At Tasktop, Mik drives Tasktop’s strategic direction, key partnerships, and culture of customer-focused innovation. Prior to Tasktop, Mik launched a series of open source tools that changed the way software developers collaborate. As a research scientist at Xerox PARC, he created the first aspect-oriented development tools for AspectJ. He then created the task-focused interface during his PhD thesis and validated it with the release of Mylyn, now downloaded 2 million times per month. Building on the success of Mylyn, he created the Tasktop Dev and Sync product lines.

Mik's ideas on Application Lifecycle Management (ALM) and focus on individual knowledge worker needs make him a popular keynote speaker; he has been recognized with awards such as the JavaOne Rock Star and the IBM developerWorks Java top 10 writers of the decade. Mik's entrepreneurial contributions have been acknowledged by the 2012 Business in Vancouver 40 under 40, and as a World Technology Awards finalist in the IT Software category. Building on his contributions as one of the most prolific committers to Eclipse, he serves on the Eclipse Foundation's Board of Directors and web service standards bodies.
