Alex Nauda
@alexnauda
Followers
309
Following
5K
Media
14
Statuses
85
quality / time / money / SAFETY / MAINTAINABILITY / PREDICTABILITY pick two. CTO @nobl9inc
Boston, MA
Joined April 2009
Finding 2: Anything involving data validation in testing is going to require special attention
This kind of discovery happens ALL THE TIME in legacy codebases. Is it possible to do old work in them? Sure. But it's super slow -- less than half the speed of work where your continuity of knowledge is intact in the team. It's also a separate specialty requiring specific skill.
Turns out that this interior wall once had a door in it, closed over by a previous owner. THIS WAS A LEGACY WALL. And what I couldn't see was the header (the board above the top of the door) blocking any wire I might fish through. I have no outlet there to this day, just a plate.
Sticking with the house repair metaphor... I once started a small DIY wiring project to install a new outlet, which involved running romex wire down an interior wall from the attic. I cut a hole for the receptacle, shined a bright light, and looked for it from the attic. Nothing.
Why is legacy code a problem? It's like working on an old house where you have a limited view behind the walls. It's not like gutting a house, which is more like new work. In old work, you make discoveries as you go, and you constantly have to redefine the scope of work.
We have a thing in software systems that we call "legacy" that can be hard to define. What is legacy code? My definition is, it's code that was written by someone else who is no longer available to help plan and execute projects.
Tech debt as Taylor Swift songs, a thread. 🙃🧵
@jeanqasaur If you post about tech debt as Taylor Swift dresses or tech debt as a dating strategy, you’ll be at full posting power
To start shedding unattainable software standards, let's:
🛑 Stop thinking of software as homogeneously represented by a small number of unrepresentative companies
🗯 Start being more honest about "real software process"
🛠 Demand more solutions to the real problems!!
end/
Of course it's all of these, and more. How can you detect these at scale? How can you detect that a system is suffering under growth and changing conditions? Or simply which services in your portfolio are janky from the start? At scale.
Did this SLO "predict" the outage? Well, not directly. Primary contributing factor to the outage: a hard storage ceiling. SLO that failed: batch run duration. These are not related. BUT WAIT. ARE THEY??
In preparation for our incident retrospective, I'm reviewing our SLOs for this service. To my surprise, we could have seen this coming. The SLO for this service had exhausted its error budget 6 days BEFORE the outage. We *should* have seen this coming.
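For readers who haven't worked with error budgets, here is a rough sketch of the arithmetic behind "exhausted its error budget". It is not Nobl9's implementation. The `SloWindow` structure, the 99% target, and the event counts in the usage note are all hypothetical, chosen only to show how a budget runs out.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Counts for one SLO over a rolling window (hypothetical structure)."""
    target: float      # e.g. 0.99 means 99% of events must be good
    total_events: int  # all events observed in the window (e.g. batch runs)
    bad_events: int    # events that missed the threshold (e.g. ran too long)


def error_budget_remaining(window: SloWindow) -> float:
    """Return the fraction of the error budget still unspent.

    1.0 means untouched, 0.0 means exactly spent, negative means overspent.
    """
    # The budget is the number of bad events the target tolerates.
    allowed_bad = (1.0 - window.target) * window.total_events
    if allowed_bad == 0:
        # No tolerance at all: any bad event overspends the budget.
        return 1.0 if window.bad_events == 0 else float("-inf")
    return 1.0 - window.bad_events / allowed_bad


def is_exhausted(window: SloWindow) -> bool:
    """True once the budget is fully spent or overspent."""
    return error_budget_remaining(window) <= 0.0
```

With an assumed 99% target over 672 hourly runs (28 days), the budget tolerates about 6.7 slow runs, so `is_exhausted(SloWindow(0.99, 672, 7))` is true while three slow runs would leave roughly half the budget unspent.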
So a few days ago, we (Nobl9) had this outage. Quick recap: We offer hourly batch data export of SLO time series data, in delimited formats, to cloud object storage, and ultimately to a data warehouse (and we offer a Snowflake integration downstream of that). It broke.