(This piece — written in collaboration with Bhushan Nigale — is the fifth in a series that explores the evolution of the software technology stack and software development methodologies in the last two decades. In this instalment Bhushan and I examine the interplay between the old and the new worlds, and also look at how the stack and methodologies play together in this evolution.)
The first four articles in this series examined the transition from the traditional stack and Waterfall methodology (common two decades ago) to the new stack we see today in cloud-native products and the Agile/Lean methodologies that now dominate software development. Those articles looked at the drivers that led to these changes and at their impact, and also discussed some of the key challenges the new stack and methodologies brought with them.
Given all this, it’s fair to ask: where does this leave the traditional stack or the Waterfall methodology? Where are they used (or relevant) even today? Do they have a role to play in the future? How do they co-exist with the new stack and methodologies?
This article explores the interplay between old and new, and also how the stack and methodologies play together.
The traditional stack today
The traditional stack, dominant two decades ago, is still widely in use today. It figures mostly in the enterprise software products built in the 1990s and deployed ‘on-premise’. Some of these products have been rewritten for the cloud, others have followed the ‘lift and shift’ path to the cloud, but a majority — close to 60 percent [1] — remain where they were originally deployed: in the on-premise data centres maintained by the IT departments of enterprises.
These legacy enterprise products — and thus the traditional stack they are based on — can be expected to stay operational for decades. The reasons for this are many.
Firstly, the large amount of investment (both in hardware and software) that has gone into these systems results in a lot of inertia. Having invested so much, enterprises are naturally inclined to keep these systems running for a long time.
Next there’s the tricky matter of switching costs — costs that include not just building or buying new software, but also migration costs, end-user training costs, and so on — that need to be justified: unless there’s a compelling business reason, such transformation projects do not get the budget.
Then there’s the question of skill. Enterprise IT departments are experienced in maintaining and operating the traditional stack, but they lack the skills the new stack demands. Unless there’s a demographic change — which can take decades — this factor will continue to play a role in decisions involving a move to a new architecture.
Ultimately, it is a matter of business priority. These enterprise products are also typically ‘systems of record’, which do not face the same kind of demands — to change fast or scale flexibly — as the ‘systems of engagement’ (or, in the B2C world, consumer-facing apps) do. And while they may be mission critical, these transactional systems are often not seen as strategic: so why touch them if most of the innovation is happening elsewhere anyway? As long as the data from these systems of record can be accessed quickly and used (for AI-related capabilities, for instance), there’s little business need to rebuild these solutions on the new stack.
So these legacy products built on the traditional stack will continue to be in use in the foreseeable future. One important consequence of this is the rise of Robotic Process Automation (RPA) tools in the software industry [2]. These tools make up for the deficiencies in legacy software (like missing APIs, or fragmented toolsets) and add a layer that further removes the need to modernize legacy solution landscapes among enterprise customers.
(This piece — written by Bhushan Nigale — is the fourth in a series that explores the evolution of the software technology stack and software development methodologies in the last two decades. In this instalment Bhushan examines the consequences of the widespread adoption of Agile and Lean.)
In article two of the series I presented the various forces that have led to the evolution of software development practices from the Waterfall model to Lean and Agile. We saw a variety of proximate causes behind this evolution: the increasing role software plays in all spheres of our life, the massive changes in software architecture and the mainstreaming of Open Source software, the increasing consumerization of IT and the changing demographics of the software industry.
This article examines the consequences of these changes. Mainly, it answers the question: did Agile and Lean deliver on their promise? When a species evolves to adapt to its new environment, manifest changes appear. Can we discern such changes in the industry, for instance in workplaces and the roles played by practitioners? If we live in a post-Waterfall world, what are the obvious signposts that the changes have ushered in?
In what follows, I provide an overview of how other industries have begun to adopt Agile, of the extent to which hierarchies still matter, of the role of teams over individuals, and of the rising importance of roles such as the Product Manager.
Agile delivers
The agile movement that arose from the Agile Manifesto is now so widespread that software development organizations consider it the de facto style for delivering innovation at scale. Software development and implementation projects are risky, failure-plagued endeavors: while statistics differ widely, reliable studies (such as the Standish Group’s annual CHAOS Report) find that as many as two-thirds of technology projects end in partial or total failure.
With its emphasis on involving end-users as early as possible and then collaborating with them, on shorter release cycles, and on a clear articulation of user requirements, Agile addresses the most crucial reasons for these failures: users become stakeholders, vested in the success of the project, rather than passive recipients of software ‘thrown over the fence’ to them (e.g. by IT departments). Transparency about progress improves trust and the health of interdepartmental relationships – truth is the best disinfectant.
Hierarchies matter less
A counterintuitive, but welcome, change has been the gradual flattening of organizational hierarchies. While Lean originated in manufacturing companies, traditionally hierarchical with a ‘command-and-control’ operational model, its fundamental principle of putting customer value first meant that employees had to be more empowered for this principle to live in practice. Thus a product owner several levels below the unit head takes significant decisions and is accountable for the success of the product: announcements of brand-new products and cloud services are increasingly made by Product Managers and not development department heads.
(This piece is the third in a series that explores the evolution of the software technology stack and software development methodologies in the last two decades. It examines the first and second order effects of the new stack and explores the challenges this stack has given rise to.)
The first article in this series began with an outline of the “traditional” technology stack that was common in the early 2000s. It then examined how the internet, mobile, and cloud revolutions exposed the limitations of this stack, deficiencies that led to the new stack we see today. The article outlined the key characteristics of the new stack, and we also saw how these traits solved problems the traditional stack could not.
The stack today looks very different from the one we saw two decades ago. It consists of small, loosely-coupled (and mostly open-source) pieces that are distributed over a network and communicate using APIs. These aspects — the breakdown of the stack into smaller components, the ubiquity of APIs, the widespread adoption of open-source, and the distributed architecture — have had a huge impact in the last decade or so. This article will look at these consequences, both positive and negative.
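To make this concrete, here is a minimal sketch of what “small pieces communicating using APIs” can look like in practice. Everything in it is invented for illustration (the “pricing” service, its endpoint, and the port are not from the articles), and it uses only the Python standard library so it runs as-is.

```python
# Illustrative only: a hypothetical "pricing" service exposing one small HTTP API.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class PricingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # One narrow responsibility: answer price lookups, nothing else.
        if self.path == "/api/v1/price":
            payload = json.dumps({"sku": "demo-item", "price": 42.0}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(payload)
        else:
            self.send_error(404)

if __name__ == "__main__":
    # Any other component, written in any language and deployed anywhere on the
    # network, can now consume http://localhost:8080/api/v1/price over plain HTTP.
    HTTPServer(("localhost", 8080), PricingHandler).serve_forever()
```

The value lies in the contract rather than the code: as long as the endpoint and payload stay stable, the piece behind them can be scaled, rewritten, or replaced independently of its consumers, which is exactly the loose coupling described above.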
First-order effects
Perhaps the most important consequences (of this shift from the traditional stack to the new one) have been the creation of a software supply chain and an API economy.
With the traditional stack, it was common for vendors to build most parts of the stack themselves. Vertical integration was seen as a competitive advantage, and software companies like Oracle even acquired hardware vendors (like Sun Microsystems) to offer the full stack, from infrastructure to user interface. And it was common for enterprise consumers to go to a small set of vendors to meet their software needs.
What we see today — thanks to the new stack that leans towards single-purpose solutions — is a best-of-breed approach to constructing the stack. Vendors (or open-source projects) offer specialised solutions or frameworks across the stack and across different stages of the software lifecycle [1]. The entire supply chain of software — from planning and development to delivery and operations — can now be composed of tools from niche vendors or open-source offerings [2]. This trend highlights the growing maturity of the software industry: we’ve gone from a model where most parts of the solution come from one vendor (or a few vendors) to a model where a rich ecosystem of vendors powers the entire software supply chain.
(This piece — written by Bhushan Nigale — is the second in a series that explores the evolution of the software technology stack and software development methodologies in the last two decades. It examines the journey from the Waterfall model to Agile and Lean, outlining the main factors that drove this change.)
A benefit of spending over two decades in an industry is that one develops a perspective to separate hype from substance. This viewpoint is especially useful in an industry like software, where minor feature increments are hailed as innovation, and press releases, blogs and tweets tout routine upgrades as revolutionary. Having lived through several such hype cycles that carry a high probability of going bust, one learns to exercise caution and to appreciate genuinely path-breaking innovations (the first article in this series — written by Manohar Sreekanth — lists the technology changes that have stayed).
Innovation in software development methodologies is even harder to achieve and sustain. A paradigm shift is rare – at least in the original sense of the term (Thomas Kuhn used it to describe a fundamental change in the basic experimental practices of a scientific discipline). Inertia is difficult to overcome, especially if established methodologies seem to be getting the job done.
I’ve been privileged to witness and experience firsthand such a paradigm shift in software development, namely the shift from the Waterfall to the Agile methodology. The shift has been so complete that new entrants to the industry have little – if any – familiarity with the older methodologies. Agile is their default mode now.
Examining and reviewing this shift is both useful and important, because the promises of any established order need to be constantly reexamined as flaws and digressions inevitably creep in. Over time, unless tended carefully, practices tend to return to older routines — regression towards the mean is an iron-clad statistical law. Understanding the older practices and the change drivers that led to their evolution helps us better appreciate the advances and detect costly deviations. An appreciation of the historical developments helps practitioners not only to address flaws, but also to iterate on the methodology to adapt to changing operational environments.
A variety of forces have led to this evolution from Waterfall to Agile: the increasing role software plays in all spheres of our life, the massive changes in software architecture and the mainstreaming of Open Source software, the increasing consumerization of IT and the changing demographics of the software industry. We examine these factors in this article, and treat the consequences of these changes in a subsequent one.
From Waterfall to Agile
The previous article in this series traced the fundamental change in the technology stack used to build software applications. A parallel evolution, in the methodology of developing software, has accompanied these mammoth technological shifts.
When I entered the industry in the late 1990s, Waterfall had none of the negative labels one finds associated with it today. Terms such as ‘Software Requirements Document’ and ‘Handover to Maintenance’ were ubiquitous and carried a certain respect – passing a Quality Assurance Gate was a big milestone that invited celebration. The software development process flowed from a high perch (hence ‘Waterfall’) of analysis and design to the plains of testing and release, where software was then finally delivered to customers.
But cracks had already started to appear. Disenchantment was rising, both with long delivery cycles and with the obsessive adherence to strict development processes. The internet – which broke the traditional stack, as we saw in the previous article – was triggering foundational changes in how software was consumed, and these consumption-driven pressures were now being transmitted to how software was built. Consumers wanted their software delivered faster and better, even as it began to occupy an increasingly central place in their lives.
(This piece is the first in a series — written in collaboration with Bhushan Nigale — that explores the evolution of the software technology stack and software development methodologies in the last two decades. It examines why the “traditional” stack could not meet the needs of a new class of applications that began to emerge in the late nineties, and outlines the characteristics of the “new” stack we see today.)
One of the privileges of working in the same industry for a couple of decades is that you can look back and reflect upon the changes you’ve seen there. But this isn’t something that comes easily to us. Why are things the way they are in software? is a question we don’t ponder enough. For youngsters entering the industry, current challenges may seem more relevant to study than past trials. And for veterans who’ve seen it all, the present carries a cloak of inevitability that makes looking at history seem like an academic exercise.
But it doesn’t have to be that way. Understanding the forces that led to the evolution in software we’ve seen in these last two decades can help us make better decisions today. And understanding the consequences of these changes can help us take the long view and shape things going forward. To see how, let’s begin with the technology stack that was common two decades ago.
The traditional stack
When I started working in the enterprise software industry back in the late nineties, the software we built was deployed on large physical servers that were located ‘on-premise’. The application was a monolith, and it used an SQL-based relational database. The fat-client user interface ran on PCs or laptops. Most of this stack was built on proprietary software. Put simply, this is what the stack looked like:
This was the state of the client-server computing model used in business applications in the nineties. At SAP, where I worked, the client was based on a proprietary framework called SAPGui; the application server was another proprietary piece of software that enabled thousands of users working in parallel; the database layer was open (you could use options like Microsoft SQL server, Oracle DB, or IBM DB2, among others); and the infrastructure beneath was an expensive server (like IBM AS/400 or Sun SPARC) that sat in the customer’s data center.
This architecture was optimized for the needs of business applications that evolved in the nineties, and such a stack — from SAP or other vendors in that era — is still used in a majority of on-premise installations. But in the second half of the nineties a different story was unfolding elsewhere.
Internet-based applications were gaining traction as the dot-com era blossomed, fell dramatically, then picked up again (no longer bearing the ‘dot com’ label). And for those applications, the traditional stack proved woefully inadequate. The reasons included cost, availability, performance, flexibility, reliability, and speed: key demands placed by the new types of applications being built on the internet.
The internet breaks the traditional stack
The internet ushered in a scale that was unimaginable in on-premise enterprise software. Websites like Google, eBay, and Amazon had to serve a large number of concurrent users and cope with wide variations in demand. With the traditional stack, adding more capacity to an existing server soon reached its limits, and adding new servers was both expensive and time-consuming. In the new business context, infrastructure costs could no longer grow linearly with user growth: applications needed an architecture that enabled close to zero marginal cost of adding a new user; the old way of adding expensive hardware was unviable.
The internet also placed a much higher demand on availability: these applications needed to be “always on”. Initially a requirement mainly for B2C applications, this expectation caught up with the B2B world as the consumerization of IT gained speed. Soon ‘continuous availability’ turned into a competitive differentiator for businesses that moved (partially or fully) to the web. Five nines (99.999 percent availability) or even six nines became the benchmarks, and a new architecture was needed to achieve this level of availability without driving up costs. Again, the old way of installing expensive servers for failover was simply too expensive and inefficient.
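To put these benchmarks in perspective, it helps to translate each availability level into an annual downtime budget. The short calculation below is a simple illustration of the arithmetic, not a claim about any particular system:

```python
# Translate availability targets ("nines") into allowed downtime per year.
MINUTES_PER_YEAR = 365.25 * 24 * 60

targets = {
    "three nines (99.9%)":   0.999,
    "four nines (99.99%)":   0.9999,
    "five nines (99.999%)":  0.99999,
    "six nines (99.9999%)":  0.999999,
}

for label, availability in targets.items():
    downtime = (1 - availability) * MINUTES_PER_YEAR
    print(f"{label}: about {downtime:.1f} minutes of downtime per year")
```

Five nines leaves roughly five minutes of downtime a year, and six nines barely half a minute; budgets of that size are hard to meet by manually failing over to expensive standby servers, which is why a new architecture was needed.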
The need to scale applications better also arose from the performance expectations placed on internet-based (and later mobile) applications. E-commerce applications saw peak usage in some periods (like Christmas or Black Friday), while others faced ad hoc spikes in demand (like those created by an ad campaign running for a few weeks). Meeting this unpredictable demand needed a different level of flexibility in resource allocation, something that the traditional stack — and its hardware-based scaling methods — could simply not offer.
Businesses that moved to the internet also had to evolve much faster than the systems of record (built on the traditional stack) that had dominated the previous era of business applications. Parts of the application that needed more frequent changes had to be deployed independently — and at a different pace — from other slow-moving parts. This was not possible with the monolithic applications built on the traditional stack: it required a new architecture that allowed teams to build and deploy smaller pieces at a faster pace. (It wasn’t just the technology stack that was inadequate — the traditional waterfall model could also not cope with this pace of change and the flexibility this new world demanded. This parallel evolution of development practices will be discussed in a separate article.)
In his best-selling book Outliers, Malcolm Gladwell popularised the notion of the ‘10,000 hour rule’. Drawing on research done by a Swedish psychologist, Gladwell constructed a narrative around the idea that mastery of any skill needs roughly 10,000 hours of practice. It was an attractive idea, one that built upon the already well-accepted notion that ‘practice makes perfect’. You now could plan your path to mastery, set up daily goals for deliberate practice that would lead, eventually, to perfection.
This is completely wrong, says David Epstein in his book Range: How Generalists Triumph in a Specialized World. Or at least that’s what the book’s blurb says. The book itself is more nuanced, and suggests that such a rule is true only in a very limited sense.
Epstein divides the world into two types of domains: kind and wicked. Kind domains are simple, one-dimensional fields with clear rules, where patterns repeat and feedback is fast and accurate. Playing the piano, playing chess, or playing golf are some examples of such domains, where deliberate practice can improve performance. Wicked domains, on the other hand, are those where patterns don’t repeat, rules are unclear or incomplete, and feedback is often delayed or inaccurate. Most real-world domains fall under this category. Research, medicine, education, management, parenting: these are complex domains where there’s no well-defined path to mastery. 10,000 hours of deliberate practice do not help here — on the contrary, expertise and specialisation in these domains may even worsen performance in certain contexts.
What matters in these wicked domains, Epstein says, is “range”. How much experience you have in fields unrelated to your work; how much of an outsider you are; how well you can integrate diverse perspectives of experts you work with; how much of a lateral thinker you are; how appropriately you choose to drop standard best practices; how well you can see connections between fields — in short, how much of a generalist you are.
One of the interesting things about developing software in the Business-to-Business (B2B) space is that you often don’t know what users need, even when you think you do. Such blind spots may not be entirely your fault.
Early in my career, I developed a software distribution tool for a CRM solution that ran on the salesperson’s laptop. (This was long before web-based applications became the norm.) To update the software on the laptop, I had assumed the availability of administrator privileges inherited from the logged-in user. But soon after the first version was released, I received worrying news: due to company policy, some customers did not give their salespeople admin privileges on their laptops. These customers hadn’t figured among those we had interviewed, but they were important. We went back to the drawing board and designed a solution that worked transparently for the logged-in user even without admin rights.
Let’s examine why this happened.
Blind Spots Caused by Poor Understanding
Simulating B2B software environments can be tricky. The environment can include a network of interdependent B2B applications — a complex landscape. Business processes can span different roles and can run for a long time. There may be company-specific policies that govern software usage. Users may work not in offices but in spaces like a factory floor or an oil rig. To understand a user’s needs you first need to simulate her environment, which is not easy in the B2B space.
As the CRM example showed, this lack of understanding leads to blind spots in our thinking.
B2B software is also complex. Not for its own sake, but simply because the reality it tries to model and automate — those real businesses — is complex. One way to manage that complexity is to break the solution into smaller modules or applications, each of which is designed, developed, and delivered independently by a small team. While such teams can be efficient, they often miss the big picture and don’t quite see how their local module is used in the context of the larger product.
This again results in blind spots. And the impact here goes beyond user experience.
Experience teaches us that software product development is more of an art than a science, and building great products is more about people than technology. Things may be black or white at the circuit level, but the higher we go up the ladder of abstraction, the more grey they become. Two teams may use the same flavour of Agile but produce very different results. The practices used to build complex software can sometimes seem random, guided more by individual (or team) preferences than solid engineering principles. Development-cycle recommendations range from weekly and biweekly to monthly. Whose advice do you follow?
Often, it’s context that gets missed. Applying software development practices without considering the context – specific to your domain or industry, work culture, and your local constraints and goals – is an invitation to failure. For instance, if you’re developing software for a nuclear-power plant, then rigidly placing “working software over comprehensive documentation” (as the Agile manifesto puts it) is perhaps not the best option. In an unstable environment with frequently changing boundary conditions, enforcing the OKR approach will only lead to frustration. And stakeholder management strategies differ widely depending on the work environment and company culture.
The following framework is based on my experience in the last three years. Until very recently I led several product teams developing cloud-native services that formed part of the SAP Cloud Platform, an enterprise PaaS solution from SAP*.
But first, here’s some context.
Context
The group was part of a larger ecosystem that included, among others: platform services being developed in other organisations; lines of business building apps on the platform; central architecture and operations teams responsible for governance and operations; and teams providing in-house tools.
So a complex stakeholder matrix influenced what the group built and how we built it. The group catered to an existing customer base, which meant a greenfield approach was not possible: often the group’s services had to fit into an existing toolchain or process these customers used. It also meant the group could not simply imagine new services into existence or target new markets independently – frequent validation with existing customers and partners in the SAP install base was key.
Despite the trappings of a large company, the group operated like a startup. Most services were incubated and shaped for the external market (and not driven solely by internal stakeholder demand). The group was responsible for the success of the products it developed, with the freedom to choose the tools and practices needed to achieve this success.
The Framework
Across the different teams in the group, we found it useful to break the product development lifecycle down into problem-solving categories. These are not phases that follow each other as in a waterfall, but simply different classes of problems we need to solve:
This structure evolved as we gained experience in conceiving and building new services. Once we had a grip on the patterns (the common problems teams were solving, the categories they fell into, the roles working on them), it was natural to put them together into a single, larger framework.
The framework categories are not phases in the product development lifecycle (even though some may appear so). The boundaries are fluid, and an overlap of activities across categories is common. Getting stakeholder buy-in may need work from design and architecture; strategy needs to be validated with customers using examples or prototypes from design. Such specifics depend on the nature of the problem space and the solution being considered.
And clearly, iteration may be needed. For instance, failing to get buy-in the first time could mean returning to the strategy question and tweaking the hypothesis before getting it validated again with customers. Issues in delivery may necessitate revisiting the question of our branching model in the CI/CD environment.
Framework Characteristics and Benefits
Problem-solving Focus
The framework highlights the nature of problems the group solves through the lifecycle, from conception to maintenance. Structuring and phrasing it in this way avoids thinking only of deliverables or features. Instead, the idea is to regularly ask: What problem are we trying to solve here? Has it been solved by someone else?
Framing activities as “problems to solve” gives everyone in the team a better handle on what they are trying to achieve. Without this focus, it’s easy to get lost in the details of achieving a “key result” (to use OKR terminology) without really understanding – and perhaps questioning – its purpose.
For instance, reusing a central service can be managed better when the task (say, setting up a CI/CD pipeline) is framed as a concrete problem to be solved. Understanding that problem may reveal that the central service is unsuited for the team’s purpose (perhaps due to the technology stack used). On the other hand, framing it as just a task for the Scrum team runs the risk of the assigned team member merely following a routine to complete the task. A simple shift in perspective – towards looking at the problem – can lead to radically different results.
The focus on problem-solving also reduces waste. When asked to create a report for someone in senior management (not uncommon in a large company), the group can instantly see that this does not really solve any problem. Understanding this helps the group spend less time on such activities.
This change in mindset is best approached holistically, across the product development lifecycle. And following the framework is a way to introduce this culture within the teams.
Product and Engineering Together
The framework spans topics that typically come under Product Management and Engineering: there is no separation between the two.
There is also no strict mapping of responsibilities (of Product and Engineering) to problem areas – both have a role to play across the lifecycle. For instance, architects can play an active role in strategy definition (they are sometimes the source of ideas that lead to new services), and product managers can have a say in delivery architecture (to ensure it meets the business goals defined).
In his book Inspired: How to create tech products customers love, Marty Cagan refers to this separation between product and engineering as a missed opportunity: “The little secret in product is that engineers are typically the best single source of innovation; yet, they are not even invited to this party…”
Achieving this synergy is easier with both Product and Engineering in the same organisation, under the same head. In this case, Product and Engineering are closely knit – even sitting in the same rooms, part of the same meetings – which leads to better communication and understanding across the two roles. Defining the right pricing metrics, for instance, needs close collaboration between the product manager, product owner, and architect, and this is not a one-time activity: several iterations are needed to get them right. Treating this problem like a project spanning organisational boundaries would be suboptimal in comparison.
Even when Product and Engineering are in different organisations, a common framework can act as a scaffolding holding together the two silos, thus enabling better collaboration.
Big-picture View
If you are responsible for a medium-sized portfolio of products or services (say, eight “teams of 10”), dozens of tasks are seeking your attention at any point in time — how do you judge what to take up next, what to parallelise, what to postpone?
The framework shows a way out, by presenting a big-picture view of the product development lifecycle. Across a portfolio of services, it can also be used as an assessment tool to see which problem areas are still open and where good progress has been made. The group has had cases where buy-in from a stakeholder was excellent for one service but other services were struggling – the assessment brought this into the open, which helped the teams talk to each other and rethink their approach toward the stakeholder.
Leading several product and engineering teams building services (each at a different lifecycle stage), I found that the framework helped me get a handle on the problems I needed to focus on at any given moment. The big-picture view is essential for this; without it, I could easily get lost in the details.
I also used it to conduct business reviews of the services in my area. Each review was structured along these categories, and the teams came up with a summary (for the service they were building) of each problem area. Presented this way, it was easy to see which team was getting something right and who was struggling with a problem area. The resulting transparency across the teams created many opportunities to learn from one another.
The big-picture view benefits all roles in the team, not just the lead. Developers gain insight into the business value of the code they are writing. This alignment with the business goals is key for a well-rounded product (and is easy to miss unless there is an explicit focus on it). The developers also get to see what else — beyond coding — is needed to get their product or service to the market. We sometimes asked developers to join our customer interview sessions: teams with this practice showed a better understanding of the product/service they were building.
Used in this manner, I’ve found that the framework is a highly effective communication tool.
Summary
The framework offers a big-picture view of the activities broadly needed to conceive a new product, develop it, and take it to the market. It adopts a problem-solving approach to structure these activities, and includes both Product and Engineering domains.
Approaching the software lifecycle as a set of problems to solve radically changes the nature of conversations in your teams. The rigour this introduces leads to better decisions, an improved grasp on managing priorities, and a change in mindset that leads to less waste. Viewing product management and engineering as parts of a single frame breaks down the barriers that often exist between them. You begin to see how both roles can contribute throughout the lifecycle, resulting in a better product.
The framework can be used as a blueprint when starting the journey to build a new product, but one doesn’t have to follow it religiously to benefit from it. Think of it as a guideline, something to internalise and use as a checklist. Use it also to structure the portfolio of products you are building. The framework is an effective communication tool, a common way to present all the products and services in your area and to align all teams with the big picture and the business goals. Use this structure also to assess, across different problem areas, the relative strengths of your product portfolio.
And depending on your context, tweak it.
* While this piece refers to my recent experiences at SAP, the views expressed here are mine.