Hacker News

One of the problems I have found is that we don't speak the same language when communicating with the business; we use terms like "Technical Debt", "Test Coverage", even "Minimum Viable Product".

In my opinion the universal variable across all of these is risk, and it's easy for everyone to grasp what we mean when we say the word "risk".

There is delivery risk: will we get this out in a timely fashion for the market?

There is operational risk: will this fall over if we get 10x the users, or if someone looks at it funny?

There is market-fit risk: are we building the right thing for the customer?

Framing these conversations with the business as exercises in risk analysis and management is a fundamental part of leadership, in my opinion.



This is a common theme but I don't believe it's true. It mirrors all of those twee blog posts that come up with Yet Another Metaphor for technical debt, as if people with MBAs would then just "get it".

It's not really about finding the right metaphor; it's about lacking a common, shared unit of account.

Risks can't really be factored into decision-making unless you can measure them. They're not risks otherwise; they're just black swans waiting to happen (or not).

Technical debt and code coverage "as risks" can't be factored in either. Instead of trying to cargo-cult the way "the business" talks, or coming up with yet another metaphor, we should be coming up with better ways to measure these things so that they can be plugged into an Excel spreadsheet.

This is done incredibly badly right now. Most measurable code metrics that proxy things we care about are downright terrible at proxying them (e.g. test coverage). In place of working metrics, most businesses (in my experience at least) rely on guesswork and trust in high-level executives.
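Part of why test coverage proxies so badly is that it's trivially gameable. A toy illustration (the function and test here are hypothetical, made up for the example):

```python
# Toy illustration: 100% line coverage, zero verification.
# Both the function and the test are hypothetical examples.
def apply_discount(price, percent):
    # Bug: treats percent=10 as a multiplier of 10, not 10%.
    return price - price * percent

def test_apply_discount():
    # Executes every line of apply_discount, so a coverage tool
    # reports 100% -- but nothing is asserted, so the bug survives.
    apply_discount(100, 10)

test_apply_discount()
print(apply_discount(100, 10))  # -900, clearly wrong for a "10% discount"
```

The coverage metric is maxed out while the code is plainly broken, which is the sense in which the metric fails to proxy the thing we actually care about.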


> Technical debt and code coverage "as risks" can't be factored in either.

They can be measured. Lots of companies don't, which is mind-boggling.

When a release goes sideways and has to be rolled back, you figure out why. Ah, you launched a feature that revealed an intersection of edge cases in your testing? That's n developers * m hours * p dollars of blended dev salary down the drain.

When your feature delivery slows to a crawl over time, you dig in – your programmers aren't getting worse over time... are they? No, you find that a ticket that took your average developer x hours to deliver at the beginning of development now takes 2x or 3x. Make a value stream map and you'll find that your test suite has become so sluggish that your developers can't iterate quickly, that QA now measures regression time in days not hours, and that as a result your developers are taking on less work to compensate.

That's x dev hours * y blended rate in ongoing waste, plus factor in the value of missed sales because of missed features if you want to really put a point on it.
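A back-of-envelope sketch of that ongoing-waste calculation. Every number here is an illustrative assumption, not a figure from the comment:

```python
# Back-of-envelope estimate of ongoing waste from a slowed dev/QA loop.
# All numbers below are made-up assumptions for illustration.
devs = 10                      # team size
blended_rate = 75.0            # dollars per dev hour (salary + overhead)
baseline_hours = 4.0           # hours per ticket early in the project
current_hours = 10.0           # hours per ticket now (a 2-3x slowdown)
tickets_per_dev_per_week = 5

extra_hours = (current_hours - baseline_hours) * tickets_per_dev_per_week * devs
weekly_waste = extra_hours * blended_rate
print(f"extra dev hours per week: {extra_hours:.0f}")
print(f"waste per week: ${weekly_waste:,.0f}")
print(f"waste per year (~48 working weeks): ${weekly_waste * 48:,.0f}")
```

With these assumed numbers the slowdown burns 300 extra dev hours a week, roughly a million dollars a year, before even counting missed sales. The point isn't the exact figures; it's that the calculation is simple once you've measured the slowdown.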

> It's not really about finding the right metaphor it's about lacking a common, shared unit of account.

Name the local currency you get paid in – dollars, Euros, pesos. That is the shared unit of account. If you don't care about it, start walking up the org chart. You won't have to go far before you realize that's what actually matters, and that engineers who can translate technical risks and inefficiencies in their world to dollar values are highly valued.


People get mad when you start counting up how much salary gets chewed up by the hours of meetings about why you aren't moving faster.

That daily status meeting that pulls in twelve people for an hour to hem and haw about nothing? That costs $600 every day, which is roughly $150k a year.


Only $600 for a dozen people? That's only about three people based upon the billable rates I'm familiar with.


>Name the local currency you get paid in – dollars, Euros, pesos. That is the shared unit of account.

Good luck trying to accurately measure technical debt in dollars.


Accurately is the hard part, but you don't have to be perfect here – just as accurate as next year's sales forecast is.

"I can't outrun a bear, I just have to outrun you" ;)


To take the metaphor, mangle it, and run with it: the problem is that technical debt isn't being chased by a bear, it's walking through the woods in bear country. It only becomes truly quantifiable when the bear charges you. Until then it's a "could be a problem". We probably don't even know the exact details of what sort of bear is going to come maul us, so it's hard to say "if we don't stop this, a brown bear is going to come out and get us".


I don't disagree, which is why my original post talked about pricing technical debt retrospectively. You don't know when the bear will charge, but past bear charges can be highly instructive if you learn from them.

My prod ETL process is an unloved mess, and we've had enough maulings occur to learn from it:

* I learned the problem – data models drift in the production app and changes aren't reflected / properly tested in the ETL or the analytics environment. And we don't have enough monitoring in place to catch the problem when it happens in the wild.

* I learned the cost of inaction – my analytics environment is responsible for $10mm of CARR, and I know the impact to customers when it goes down. Heck, I know how much customer credit has been given out due to SLA breaches, so there's a quantifiable price today.

* I learned the price to fix it - we've estimated the effort and run cost of the new solution.

Now I've got something that I can work with: customer acquisition says they've got $4m in the near-term pipeline? OK, great, but if we onboard them with a broken system we risk spending an additional $x in customer credits and risk damaging our reputation. And since it costs $y to fix it, and $x > $y, let's fix it.
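That decision sketches out as a simple comparison. The $4m pipeline is from the comment above; the credit exposure and fix cost are hypothetical placeholders for the $x and $y:

```python
# Fix-vs-defer decision from the comment above.
# pipeline_value comes from the comment; the other figures are hypothetical.
pipeline_value = 4_000_000      # near-term pipeline
expected_credits = 250_000      # assumed SLA-credit exposure if shipped broken ($x)
fix_cost = 150_000              # assumed cost to fix the ETL process ($y)

if expected_credits > fix_cost:
    decision = "fix first"
else:
    decision = "ship and accept the risk"
print(decision)
```

Reputation damage isn't priced in here, so this comparison understates the case for fixing; it's a floor, not the full argument.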

I have plenty of other unloved systems that are in bear country but haven't mauled me (yet), but even then you can start thinking about risks. One system is small but critical for multiple products, so my exposure is "hey every single customer is getting service credits" – expensive enough to force the monitoring/refactoring work that it always needed. One of them serves logins for ~7k users but only for one customer so my financial exposure is bounded. We don't worry about bear spray when walking around there ;)


I've never really seen tech debt cause an immediate, hugely-costly single event where you could say "see, that was the risk we were taking."

I've seen it frequently slow feature development down, though.

But... I've also seen a lot of rewrites fail to improve feature development speed.

So until we, as a discipline, can quantify and predict development speed w.r.t. shitty vs good code, it's going to be a tough conversation that'll rely on persuasion and gut estimates.

We normally can't even predict how expensive (time-consuming) doing that rewrite that we want to do would be! Let alone the benefit!


I'd like to second this line of thinking.

I've built a software business over the last 15 years using this exact communications theme, and whilst my little business is a sample size of 1, we've certainly found that framing conversations with customers in terms of "risk" has kept us all on the same page.

I'd also add that by adopting this communication style, one can then look at opportunities in a new light as well. On that note, I was lucky to read a book called "IT Risk" (by Westerman & Hunter, HBS Press [0]) back when I first started the company, and it gave me an interesting perspective on risk.

In a nutshell, once you identify and minimise/eliminate all the usual risks (many of which you identified in your list), you can then reorient your business in such a way as to start actively taking risks which stand to improve your overall offering.

This in turn allows you to build a strategic moat of sorts, because whilst your competitors are still scrambling to address the usual risks, you're actively taking on opportunistic risks which at times reap tremendous rewards.

[0] https://www.amazon.com/Risk-Turning-Business-Competitive-Adv...


If you are lucky enough to work with people who are willing to explain this and take the information in objectively, that's great. Sometimes, I have been that lucky person.

On the other hand, in a non-trivial number of cases, you'll be working with people who get all the upside when things go right and successfully blame the tech side when they don't. Their interests don't align with the interests of the business either, but since the business side consists of their bros, they all instinctively align on that.

So you keep talking about "risk", and that gets portrayed upstairs as "I am doing my best, but so-and-so here is throwing technical minutiae at me which is slowing us down."

They are used to being graded on a curve where your position gets better if everyone else does worse, and to taking advantage of that system.

This is among the many reasons I never graded on a curve myself, but you can't do that with upper management in business.

I say this as a person who believes business cost/benefit calculations trump everything else. However, decisions must be made by people who understand the tradeoffs and are accountable.


I am not versed enough in car things, but it seems to me this would be something maintenance shops nailed down decades ago. How do they convey that your car is lacking maintenance even if you can still drive it to work?

Perhaps we should go with "needs repair" or something like that?

"Risk" feels like insurance territory (which goes along with "there is always some risk"), and a lot of people beautify the notion of taking risks to get higher gains.


The risk for a shoddy car is that you end up killing someone or get written up and charged with a violation. The risk for shoddy software (in most cases) is that you have an outage and suffer some financial or reputation damage, but that won't put you out of business.

There's laws against bad cars, there's no laws against bad software.


> There's laws against bad cars, there's no laws against bad software.

There's often privacy/data protection obligations, but they seem to be impossibly difficult to get the courts to pay attention to. If the average business owner would find themselves in legal shit every time an external party got access to their data (i.e. just being the victim of a ransomware attack puts you at risk of losing your home), they would probably pay more attention.


Risk-taking is very culturally dependent; in some contexts people avoid taking risks as their default option.

That said, I feel that car shops just work because, well, it's the law, especially in some countries.

In the end "needs repair" is the same as saying "mitigating accident risk".


More than needs repair - I think a pretty good analogy for tech debt is deferred maintenance: if you don't change your oil or your tires, the chances of expensive/bad problems increase. If you've got dents in the body, you will need to address them before a respray.

And if you have a 15 year car with a clapped out motor, you are not going to achieve modern safety and fuel efficiency by swapping the worn-out motor with a NOS replacement, so there's a point where investing money in the old solution isn't going to move you into the future you need.


And then we can scathingly label the most hated parts 'beyond economical repair'!


I don't disagree, but at my last company, I was always very explicit about these things in business terms.

The real enemy I ran into was the SVP's desire to please the CEO, who wanted to please the board, who wanted to please the investment PR racket and make sure we could tell Gartner that X feature would be ready by Y date to ensure we were included in their Magic Quadrant. The answer to the bosses had to be "yes, it will be released by Y date". I saw that pattern repeated, realized "Agile" was just a word for waterfall, and quit for a startup.


"How a plan becomes policy"

http://web.mnstate.edu/alm/humor/ThePlan.htm

This poem is my favorite description of this pattern, because it focuses on the bad communication that creates the problem. Regardless of the intent of the people involved, gradually filtering out important information at each level as people try to please their superiors guarantees a GIGO mess for the people at the top. As the poem says, "this ... is how shit happens".

Adopting something like the airline industry's "no-blame" culture that focuses on getting accurate reports by explicitly not focusing on blame might help avoid the natural tendency to eventually fall into this pattern.


http://www.art.net/~hopkins/Don/unix-haters/tirix/embarrassi...

> I wrote a note in sgi.bad-attitude about the "optimist effect", which I believe is mostly true. In condensed form:

> Optimists tend to be promoted, so the higher up in the organization you are, the more optimistic you tend to be. If one manager says "I can do that in 4 months", and another only promises it in 6 months, the 4 month guy gets the job. When the software is 4 months late, the overall system complexity makes it easy to assign blame elsewhere, so there's no way to judge mis-management when it's time for promotions.

> To look good to their boss, most people tend to put a positive spin on their reports. With many levels of management and increasing optimism all the way up, the information reaching the VPs is very filtered, and always filtered positively.


I agree with you, and I think it's just as frustrating when the decision makers don't document and communicate risk "down the chain", instead packaging up tasks for the implementers and expecting them to have the same sense of priorities, or to rediscover all the nuances of the design decisions during development, etc.


Businesses frequently don't understand the need to fix tech debt / testing / maintenance / security. They think it's just code for the engineers wanting to "sit around and do nothing" (the actual phrasing I was told), because it doesn't add new features they can sell and "contributes nothing to the company."


You forgot the most important one: the risk of running out of money.


Middle and upper management are mostly concerned about risk, because the environment they are in rewards being “done” more than trying something that could have a bigger payoff.

That is, in that environment, the downside of accepting risk tends to outweigh the upside of achieving a goal at the cost of running over schedule.


Very much this; not least because, having organizational power, they can manipulate the situation to evade the downside. As discussed in the classic Gervais Principle essay: https://www.ribbonfarm.com/2011/10/14/the-gervais-principle-...


There is another thread on the front page about every engineer having to try out consulting.

This is one of the ways to learn how to use language that business gets the point across.


Didn't quite parse your second paragraph. Missing a few words?


Rephrasing it: with consulting, one is mostly on the frontline and has to deal with all areas, so one grows into using language that expresses technical issues in ways that get buy-in from the business, especially if it is a one-person consulting gig.



