Coming from an old-school engineering company, this article frankly reads as a little insane to me. I assumed "Code Yellow" was a catastrophe on par with a tornado hitting a factory or a massive security breach. Instead, the examples are things like not hitting artificial growth metrics and needing to launch advertising. It's bad enough to blow up teams' planned work, but this is what you demand employees (not founders, mind you) sacrifice their personal lives for (i.e. get paid less per hour and not see their families)? The author not only acknowledges the time he is stealing from employees but also says that stressing them out is the point. If you're 8 years into a company and lurching from (artificial) crisis to crisis to "sweat the teams", something is seriously wrong.
I work in public health so our outages are critical to safe treatment of our patients.
We have the concept of a Major Incident/Major Event (because the term Code Yellow is already co-opted to mean any administrative fault across a hospital which might impact the flow of patients).
These are all-hands-on-deck moments. They may be called by any member of staff who discovers an event which will impact our service, though it will be ratified by a Major Incident Manager (MIM). While the event is underway the MIM is God, except when it affects staff welfare. If a member of staff says that they cannot attend a war room for whatever reason, then the MIM will move on to the next person up the chain, even calling in directors if they feel it is required.
What is being described above is tech-bro shittery. Missing a sales target should keep the C-Suite up at night, sure, but calling a major event over it, calling in techs and devs and 'sacrificing the L and the B in Work Life Balance'? The C-Suite should be making the strategic decisions to reverse the decline, not suddenly dragging everyone into a meeting to fix their lack of foresight and work towards an end that the average tech/dev cannot influence.
I've been a major incident manager. It works great to have a hardass like me giving orders on a conference bridge: stuff gets fixed really fast! People pull out laptops at little league games and get it done. The outage is solved, and business goes on. I look like a genius.
The problem is that, if a major incident is the only process in your organization that results in things getting done in a reasonable amount of time, there is a large temptation to declare them for everything. Somebody wasted 6 months failing to renew our contract with the payment processing vendor because other stuff was being prioritized above it? Open a major incident and we'll get them to call the vendor CEO at home on Sunday. We're a big account, so they'll answer. It's the end of the fiscal year and if we don't decomm this expensive system like was promised, a VP somewhere will lose their bonus? Declare a major incident and ElevenLathe will get it done in a few hours, even if it is Thanksgiving.
I would page on-call people at any hour of the day for real emergencies without hesitation, but always asked for orders in writing to page people (not executives, but real people) who weren't on call, or for nonsense like this. This got annoying so they started assigning other incident managers to the nonsense.
Not all catastrophes are immediate in nature. Most organizations have an incident response protocol which works well enough for an immediate catastrophe. Non-immediate, gradual threats can represent a much greater risk, because there's no inflection point (beyond "way too late") where the threat becomes so imminent as to trigger that response. Code Yellows are a mechanism to artificially force that inflection point before it is too late.
Of course, but if it's not immediate, why does it warrant forcing people into a Zoom meeting on a Saturday while you're at your kid's Tae Kwon Do competition? If it's not immediate, why do you have to "encourage the team to sacrifice the ‘L’ and ‘B’ from Work-Life-Balance"?
Either it's an immediate catastrophe which requires immediate action, or it can be handled by giving it absolute priority during regular business hours.
I've had code yellows before, and they were never "immediate meeting on Saturday". They are essentially projects that take precedence and get rid of the "whose job is this" question. I think code yellows work great... this guy is not using them how I would expect.
The etymology is not green/yellow/red. It's just not-Yellow or yes-Yellow. See Steven Levy's In the Plex (2011), p. 186:
“A Code Yellow is named after a tank top of that color owned by engineering director Wayne Rosing. During Code Yellow a leader is given the shirt and can tap anyone at Google and force him or her to drop a current project to help out. Often, the Code Yellow leader escalates the emergency into a war room situation and pulls people out of their offices and into a conference room for a more extended struggle.”
> The etymology is not green/yellow/red. It's just not-Yellow or yes-Yellow.
Um, no.
Today, Google has Code Reds, Code Yellows, Code Purples, and Code Greens... and this is after standardizing to remove other made-up terms like Code Mauve.
Not sure why it's getting downvoted. I definitely experienced Code Yellows, Code Reds and Code Purples at Google. Red is (obviously) worse than Yellow and IIRC was a total code freeze for a period of time. IIRC there was a Code Red around memory at some point because the supply chain was literally so backed up that Google couldn't get enough DIMMs, and reasonably sized services had to stop deploying because there wasn't enough compute capacity.
Purple is/was "developer experience is so bad we need to stop developing new functionality and make the current functionality usable."
- Code red: the situation is actively causing business harm.
- Code yellow: the situation will cause irreparable business harm if not addressed in the next 3-6 months.
- Code purple: the situation will cause business harm if not addressed in the next year.
- Code green: things are not at risk of causing problems, but we still want to make sure we make progress.
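If it helps, here is roughly how I'd sketch that ladder in Python. This is only my reading of the list above, not an official definition: the names `Code` and `classify`, the two parameters, and the hard cutoffs at 6 and 12 months are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class Code(Enum):
    RED = "red"
    YELLOW = "yellow"
    PURPLE = "purple"
    GREEN = "green"


def classify(harm_now: bool, months_until_harm: Optional[float]) -> Code:
    """Map a situation onto the ladder described above.

    harm_now: the situation is already causing business harm.
    months_until_harm: rough estimate of when harm lands if nothing
        is done; None means no foreseeable harm.
    The 6- and 12-month cutoffs are assumed from "3-6 months" and
    "the next year"; real declarations are judgment calls.
    """
    if harm_now:
        return Code.RED
    if months_until_harm is None:
        return Code.GREEN
    if months_until_harm <= 6:
        return Code.YELLOW
    if months_until_harm <= 12:
        return Code.PURPLE
    return Code.GREEN
```

Note what is not an input here: whether it's Saturday. The color only says how much existing work gets preempted.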
At Google, all of these priority codes need senior VP signoff, which is to say that it is actually an existential threat to one of our main product areas (e.g. Search).
I only remember the RAM crunch (2020 or maybe 2021) being a code yellow, but it's possible it was downgraded after the first month or two.
Code reds don't always have to be met with a total code freeze, but they generally do preempt all work outside of incident response.
The point of a code yellow should not be to punish the team, and an appropriately-declared code yellow should be met with significant introspection from leadership about how we got into this mess and what we need to do to prevent us from getting into this mess in the future. It's a blunt tool that allows the organization to dictate that it's going to drop its existing commitments on the floor because they are simply less important than fixing the systemic problem.
You don't need a code yellow to try multiple things in parallel, or to ship a prototype without worrying about scale. A startup certainly doesn't need a code yellow to empower individuals to wear multiple hats. And if your team is spending 50-75% of its time on keep-the-lights-on work, then your systems are being held together by duct tape, and this is simply not sustainable.
There's definitely a "delusions of grandeur" thing that happens with some startups. "Come join our incredible journey to change the world by micro-optimizing ad placement on image boards."
It doesn't matter if it's literally stealing -- it's the mentality that is important here.
Sure, you could always quit -- but the thing is, if you're pushing for these kinds of policies, you may very well push your best talent to accepting that option!