
To your first point, yes we're moving slowly towards a more general awareness that most employees are paid market (replacement) rate, not their share of value generated. As the replacement rate drops, so will wages, even if the generated value skyrockets. Unsurprisingly, business owners and upper management love this.

To the second point, the race to the bottom won't be evenly distributed across all markets or market segments. A lot of AI-economy predictions focus on the idea that nothing else will change or be affected by second and third order dynamics, which is never the case with large disruptions. When something that was rare becomes common, something else that was common becomes rare.


The anecdote about the 16-pin religion and the reluctance to use more pins is so good. It's often assumed that (later) successful companies were always making fantastic decisions in the earlier days, when in reality there were a few bizarre and harmful assumptions that were holding it back and needed to be forced out in order for rationality to prevail.


To be fair, packaging used to be very expensive in the US. I remember one of Asianometry's(?) videos touching on a Japanese businessman traveling to Texas in the seventies(?) and learning how expensive lead frames were there, while he could manufacture and ship them from overseas at a fraction of the cost. Sadly I can't find that specific episode anymore :(


I think it might be this one?

https://www.youtube.com/watch?v=nNpuiJitKwk


This one is about the history of packaging technology. The one I remember was about a particular Japanese company, similar to "YKK: Japan’s Zipper King".


The reluctance to use more pins is very understandable.

At the time, Intel was primarily a memory manufacturer, and they had vertically integrated the complete workflow for anything that could fit into a 16-pin DIP. Anything that didn't required them to outsource testing and packaging, or to purchase expensive new machines. When CPUs were still being pushed against the wishes of upper management ("A computer has only one CPU but lots of memory chips, so the memory is a better business"), it was a hard sell to invest lots of money in an uncertain market.


It's the "impact" style of technical write-ups: sell the problem and the scale, then present the solution, which is thus presented and understood through the lens of business and customer success.

Generously, this writing style is supposed to show the business value of teams and individuals, for promotions or other recognition. But yeah, it can be frustrating to read this style.


Oh lordy, the "two crews" bifurcation fully written down. What a fantastic way to ship until it becomes far too expensive to ship anything good.

Look, when we break the feedback loop back to the people who wrote the software in the first place, they get happier for a bit, you make some other people sadder for a bit, and then slowly your feature crew never want to be interrupted or bothered again and your customer crew can't get enough resources to fully fix anything.

Worse, your feature crews aren't learning anything beyond how to get those lines out the door, which will somehow get slower and more expensive as time goes on. Why? Because you removed the one fitness function of good software development, which is to fully re-incorporate the negative feedback into the source of development.

A real CTO leadership handbook would say clearly "it's your responsibility to help your developers improve, especially while shipping, and they're not always going to be happy about it."


> It allows your feature team to remain 100 percent focused on the future, undistracted by customer support work.

AKA "it allows your feature team to be completely oblivious to the horrors they unleash, and keep at it until the ship is solidly planted in the iceberg"

And that's not even talking about the conflicts it creates when merging work between sales-supported feature teams and customer-rep-supported maintenance teams. Given that the "customer crew" is described as something you grow out of, there's no question who wins the arbitration.

> It provides another career path for individual engineers, especially junior engineers, to learn and level up on your team.

"Senior staff doesn't want to fix shit so we have juniors do it"


Further, I'm not sure what efficiency it provides overall. Is dedicating 20% of your team to support _that_ much different than the entire team spending 20% of their time on support?

We've actually found our quality goes up massively when we force our engineers to deal with the problems in the features they ship, directly with customers. We still have dedicated front line support (that rotates weekly), but they run off a playbook for common support needs then delegate everything else out.

It really sucks when you get pulled in to support a feature you launched, but it really makes you want to build your next features better. Better internal documentation, better customer documentation, better UX/requirements, better edge case handling, etc, etc.


Would putting some percentage of the team on 'support' for a week or two help reduce task switching and allow deep work? Maybe everyone in the team would spend 2 weeks per quarter or something like that doing support.

I (n=1) would prefer to be answering support tickets for 2 week blocks, and know when the blocks are in my calendar, so that I can plan work around them, rather than trying to debug something while I am being pinged about unrelated stuff all day.


It's pretty hard to be fully hands-off of customers. That being said, we don't expect immediate replies unless (1) you're the front-line support for the week or (2) something is on fire. Generally, replying within 24 hours is acceptable.

It's a bit of a drag, but most people just deal with their occasional support needs at natural context switches. First thing in the morning, before they head out, in-between meetings, etc, etc


Hand-off needs to be really carefully defined and managed efficiently. Client tickets rarely respect arbitrary calendars, and the context switching alone can be really expensive. The best I've seen is a primary/secondary setup where you move from copilot to pilot, so you're not coming into everything cold.


> Is dedicating 20% of your team to support _that_ much different than the entire team spending 20% of their time on support?

Yes, it's [much] worse. Because nobody wants to be the support crew, you end up with the 20% most junior, least outspoken people. Then the other 80% care less about what support requirements will come out of the code they're writing because it's not their problem.

It's the perfect scenario for the aggressive prima donna who thinks their code is golden and everyone else's is dogshit.

I feel strongly that your front-line support should be full-time (not rotating) front-line customer support. That should be their job. If I reach out to a company for support I don't want my first contact to be with someone who writes code 95% of the time and this is their one week answering Zendesk tickets. I want it to be someone whose entire job is fielding customer issues and resolving them quickly and efficiently.


That's why you rotate everyone, not just those that "volunteer"... This way, you're spreading knowledge to everyone, e.g. if I'm forced to deal with an issue on code you wrote, I'm forced to learn about it.

Of course, I might have to ping you and get you to help me with it, so it's less efficient. Then again, if you leave the company, I have some knowledge about the feature, so... There's tradeoffs for sure.


I believe you're referencing the Engineering Management principle of "share shit work evenly".


Further down the article:

> The Microsoft blog post referenced above recommends swapping some team members between the two crews every week.

This would hopefully mitigate the worst of the effect you describe, since everyone eventually gets exposed to the consequences of poor feature development.


I don't know about you, but it's rare that I've neatly wrapped up my tasks at the end of any given week. Single-day tasks are rare; there is always carry-over work, including over the weekends.

The only thing worse than a feature that got rushed out the door Friday afternoon because you had a completely different role come Monday is one that was 80% done then passed off to someone else because you had a completely different role come Monday.


In my company, we rotate every 5 months. So every 6th month, I get put into the customer-facing team for 1 month. Every other month, a different team member is on the customer-facing team.

This is still annoying, but gives you enough time to work on features, and enough time to try and crack some customer cases (though I could even see being in the customer-facing team for more than 1 month, as sometimes, this is not enough to debug the issue and provide a fix).

I've got to admit, as much as I dislike being on the customer team, it's certainly less annoying than working on features while having constant customer-issue interruptions.


Maybe when you know you're due to start support the next week, you stop feature work sometime the previous week and do small maintenance/backlog tasks and/or documentation. Like a cooldown period before task switching.


Related topic, but at every company I've worked at that had a platform team (as in a third-crew support team that manages tools/practices/common code for a discipline), that team ended up infested with over-engineering.

They tend to attract the kind of people who have disdain for delivering features and fixing bugs and who like to over-abstract problems. Instead of fixing bugs, they try to create increasingly complex abstractions to prevent the bugs from happening in the first place, with obvious results.


That has been the fate of every platform team I’ve worked with in recent years.

Then they become gatekeepers, refusing to allow anything on their platform unless it conforms to their ideal vision. The catch is that their vision won’t be ready to use for 6-12 months, so you can’t deploy. Now your biggest problems aren’t engineering, it’s constant politicking to get around the platform team.

Add to this the concept of “architects” who don’t code but jump from team to team critiquing their work and you have a recipe for getting nothing done. One half of engineering is coding and trying to ship, and the other half of engineering is gatekeeping and trying to prevent anyone from shipping.


> Then they become gatekeepers, refusing to allow anything on their platform unless it conforms to their ideal vision

As the owner of a platform team, this very common attitude of platform teams kills me. Yes, we have a long-term vision that we're working towards, but our main goals are to accelerate developers AND produce more robust systems. Outside of totally egregious violations of company standards, my team is expected to focus on how to get things done. That means being flexible, working side-by-side with other teams, etc., to make sure that a) they're able to deliver what they need and b) we help them build it in such a way that it can eventually be aligned with our utopian long-term vision.


That’s exactly how every platform team starts. There is inherent tension between accelerating developers and building their own systems, though.

In my experience, the platform teams developed an idea that their conceptualized system would accelerate everything once it was done, but working with product teams was a distraction from getting it done. They also didn’t like the idea of deploying something now and then having to rework it later when their ideal system was ready. So they defaulted to gate keeping, delaying, and prioritizing internal work over requests from the product teams.

The only way to get things done was to leverage management chains to put pressure on the platform team to prioritize getting your thing deployed. This was constant no matter how much headcount the platform team received because with every new hire they developed new ideas and goals to add to their internal roadmap.

It’s not supposed to work like this, but it plays out this way in many companies.


> It’s not supposed to work like this, but it plays out this way in many companies.

Absolutely, and I've been on both sides. We go much more with a carrot approach than a stick approach, and have no ability to "block" any product team from doing things. Our goal is to ship things that are useful and lower the effort required for product teams to ship their products, which is handling basically everything except product-specific features. However, product teams don't have to use the platform, but then they own the operational burden of whatever custom stuff they're using. When that happens, we still work with them to minimize that or bake that capability into the platform and eventually take it over if it's useful to the wider org.

"Success" of the platform team really depends on serving the product teams, so blocking or being a barrier goes very much against that. We try to provide opinionated golden paths, but also try to build a properly abstracted stack of capabilities so teams can also extend/consume at a lower level if that better suits their needs.


There probably isn't any middle ground in practice then. If the product teams have control, then the tech debt just keeps building as they keep prioritizing new features over longer term maintainability. I see it already happening at my startup where product has a lot more influence than engineers in terms of what goes on the roadmap (there's practically zero time devoted to lowering tech debt.)


I think this is where a large portion of the tech consulting market comes from. Someone in the business gets absolutely fed up dealing with IT and trying to get something they need into production. Next, they go find a budget, call a couple of firms, get some proposals, pick one, and do it themselves.


That's actually the "premise" of Google's SRE book


Argh! PTSD: this was exactly what happened at my last start-up. Two from the engineering team and one from the R&D team started a platform team, and it became a pre-PMF product with the slickest pipelines, DevOps, and cloud-cost optimization, ready to scale to infinity. But with no customers, a broken front-end, and an under-funded R&D team, as all the effort was put into the essential SaaS Platform. It truly set the company back one year while burning two.


That is actually usually not that bad (if there is, you know, revenue). What is really bad is when those teams start to roll out a lot of custom code that other teams need to use. If they are just configuring standard tools for everyone else, it is usually fine (as long as they are not going too crazy with it).


The "platform" team at my company has rolled out a completely custom query language that we have had to learn and write so they don't have to make new endpoints to access different combinations of data

And they haven't documented anything

"There are integration tests, those are documentation go read those"

Good times


That's really the best, when not even intellisense can help you.


Yes, this is exactly what I mean.


I wonder though if there aren't more forces at play. For instance, the business problems some systems try to solve really are so large and complex that you might need some kind of overseeing function in your company.

Also, I have a hunch that a team dedicated to providing helper "libraries" rather than "frameworks" could provide a lot of value without so much downside. If you can call a library function without it imposing a whole framework on the rest of your codebase, it's more self-contained and can't spill its abstractions all over the place.


If your org starts a platform team, it is really important to have this concept drilled in early. Buffet, not framework.

I clearly remember having some discussions with platform people in my last job and asking them "why should I use your solution instead of getting an open source one that is likely better tested and used by more people", and the answer was usually "we can help if you run into any problems". Well, the "help" has to be planned and prioritized in the next sprint and will probably only come next quarter. So now the devs on my team need to make PRs to the platform people's code and beg for reviews; how is that better than using the open source one?


This was the first place I worked at. The platform team became more and more insular and detached, and more and more convoluted. As a result, things got harder to add on, and soon they were telling the implementation teams that the features the clients were requesting couldn't possibly be needed. Million-dollar contracts, but no, you don't need to be able to put links into text blocks, that's a stupid feature and the client can't possibly want it.


Insular architect *waves hands*: These are not the features you are looking for.


>Look, when we break the feedback loop back to the people who wrote the software in the first place, they get happier for a bit, you make some other people sadder for a bit, and then slowly your feature crew never want to be interrupted or bothered again and your customer crew can't get enough resources to fully fix anything.

This is the PM's job: one or a few people who decide the vision of how all of the features fit together, based on feedback from working with customers. Customers (esp. non-technical ones) will definitely not have a coherent product vision and only want immediate fixes regardless of what else may be planned. Customers may also not communicate with one another, and their feedback can conflict.

If you put this burden on developers' shoulders, they now have to manage all of that communication in addition to having the technical skills to know the code base and maintain it well, on top of every developer needing the same coherent vision to make thoughtful decisions. That's now two to three jobs in one, depending on whether your developers also manage infrastructure, as many roles require these days.


What you're describing is exactly the opposite of every actually successful team I've seen, and describes every mediocre team I've seen. Silos are death and not just in a code base. Good developers understand the product. Mediocre ones churn out tickets mindlessly.


I'm okay with that knowing those developers are doing two jobs for the pay of one. And most products turn into that once the original developers leave.

It's not like you can't learn the product through the PM either.


I think you're conflating "doing two jobs" with "not being allowed to just type JavaScript into a computer all day in isolation and being expected to actually communicate and think about things other than data structures and algorithms."

If you're a true senior software engineer, as most of us claim to be, coding is a small part of your job, not your entire job.

You should be learning the product through the PM for sure, and I don't think a senior engineer should be doing first-level support, but especially in small companies talking to customers is good and should be expected from basically everyone who is working on the product.


Let's flip this around and see if it still fits:

"the PM can't be expected to sit in meetings all day, they need to learn the coding side of it too so they know the potential limitations of the features they want to suggest"

But if a PM does have a technical question, they don't need to go google stuff and figure it out - they ask a developer.

Likewise, when a developer has a product question, why can't they rely on a PM to answer that for them? Why must we also be expected to be in customer meetings and putting in extra effort, when PMs definitely won't put in effort to learn the technical side?


Yes, it still fits when you flip it around, speaking as an engineer turned technical PM. PMs should absolutely be technical and have enough depth of understanding about the product they can figure things out for themselves, as well as write code.

That's not going to prevent the PM from asking questions to the developers though. I ask questions all the time, because I want to validate my mental model with others and verify my understanding. Asking questions is a /good/ thing.

The part where you are missing the boat is acting like customers are a distraction or an enemy. Customers are /the point/, the /only/ point, really at the end of the day. Every role in every business is customer-facing to some degree.


> But if a PM does have a technical question, they don't need to go google stuff and figure it out - they ask a developer.

In a good organization they first try to figure it out themselves versus distracting a developer (thus costing possibly hours of productivity due to breaking someone's flow). The same way a developer would first try to answer their own question before they start badgering another developer.


Unless you're working 80 hours a week you're not doing two jobs. You're doing one job.


Giving away flexibility for free is a collectively dumb move on our part. If someone knows you can take on coding tasks and customer interviewing vs. just coding, you are more valuable to them and they should pay more for it.

They've already gotten away with adding infrastructure and architecture (aka system design) rolled into one developer position. And putting it behind long and stressful interview processes. I'm not doing PM stuff on top of all that and not getting the pay and prestige for it.


Why do you assume it's for free? The compensation of a software engineer varies widely. The same experience can get you anywhere from $100k to $1+m.

Perpetually doing less in fear of not getting paid enough for doing more is how you get paid a pittance while complaining about it constantly. Doing more and then finding a way to get paid more is how you get paid more and be happy.


In my experience, both are needed. Product owners and developers who understand the product. It's possible to have both, they're not mutually exclusive.


Yes, but it's the Product Owner's responsibility to understand the requirements from both customers and the business and communicate them clearly to engineering.

Having engineers handle "support calls" doesn't make much sense; they are not equipped to manage product feedback or understand the business implications.


You're right, I didn't mean there shouldn't be PMs but rather that the PMs shouldn't be the sole people concerned with product.


One thing that we did to account for this was to shift the teams every sprint or every few sprints. It allowed folks to get more experience and still get feedback, since if they built a buggy feature they'd have to fix it, etc.

People seemed much happier with that, because they also didn't get tired of 'always fixing bugs' or never getting the feedback, which you insightfully mentioned.


Developers must run and maintain the software they build. It's as simple as that.


The best development teams WANT this.

They will readily take on the responsibility to get the autonomy. The problem is many companies give the former without the latter...


And take the calls.

Don't like being paged at 3am? Write robust software and test it.


Well, what's the time horizon? A PE backed outfit, or a CTO looking to move on within a year or so, would be well advised to follow this guidance. Lots of success now, and the problems deferred to later.


The book mentions having a rota:

> Engineers rotate between the crews on a regular basis. The Microsoft blog post referenced above recommends swapping some team members between the two crews every week.

In my experience this works well. With my current and previous clients, each team had a "hero of the week", whose responsibility was second-line support and monitoring. If nothing came up, the hero would work on their tasks as usual.

If something does come up, the hero of the week is tasked with solving it or pulling in someone who knows how to solve it. This both forces engineers to accept accountability for writing shoddy code and exposes them to the wider codebase when pulling on threads. It also solves the issue where no one, or always the same person, takes responsibility for handling bugs.


This just sounds like having a point developer. The challenge is that too many companies expect this without giving up a feature-dev headcount. Any work they get done aside from point duty is a bonus and unplanned.


Isn't this just called on-call? That's very different from a separate team.


Maybe - though I associate being "on-call" with being expected to respond outside of normal business hours which was not the case in the teams I worked in.


This may put me and my peers out of work (in a good way). SRE is a consequence of this function being lost, IMO. Pattern: developers don't like it? Give it to Ops/SRE.

Take away the escape, we will all be better for it.


I call them “shiny team and shitty team”.


It is not a problem if you measure and reward the infra team for their ability to enable the feature team, using metrics such as change lead time and deployment frequency, as well as the stability metrics that the infra team might want to pursue.


Because a lot of tech workers today aren't actually job hopping and instead get very cozy in a job and a team and a career trajectory, which feels unfairly ripped away during layoffs for reasons that don't feel connected to their personal performance.


Only solution: sting operation


A few extra considerations picked up over many years of hard lessons:

1. Rate limits don't really protect against backend capacity issues, especially if they are statically configured. Consider rate limits to be "policy" limits, meaning the policy of usage will be enforced, rather than protection against overuse of limited backend resources.

2. If the goal is to protect against bad traffic, consider additional steps besides simple rate limits. It may make sense to perform some sort of traffic prioritization based on authentication status, user/session priority, customer priority, etc. This comes in handy if you have a bad actor!

3. Be prepared for what to communicate or what action(s) to perform if and when the rate limits are hit, particularly from valuable customers or internal teams. Rate limits that will be lifted when someone complains might as well be advisory-only and not actually return a 429.

4. If you need to protect against concertina effects (all fixed windows, or many sliding windows expiring at the same time), add a deterministic offset to each user/session window so that no large group of rate limits can expire at the same time.
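
A minimal sketch of point 4, in case it's useful: a fixed-window counter where each key's window boundary is shifted by a deterministic offset derived from hashing the key, so no large group of windows expires at the same instant. The in-memory dict and the specific limits are placeholders; a real version would use Redis or whatever shared store you already have.

    import hashlib
    import time

    WINDOW_SECONDS = 60
    LIMIT_PER_WINDOW = 100

    _counters = {}  # (key, window_index) -> count; stand-in for a shared store

    def _window_offset(key):
        # Deterministic per-key offset in [0, WINDOW_SECONDS), derived from a
        # hash of the key, so different keys' windows expire at different times.
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big") % WINDOW_SECONDS

    def allow(key, now=None):
        now = time.time() if now is None else now
        # Shift this key's window boundaries by its deterministic offset.
        window_index = int((now - _window_offset(key)) // WINDOW_SECONDS)
        count = _counters.get((key, window_index), 0)
        if count >= LIMIT_PER_WINDOW:
            return False  # over the limit; return 429 or shed per your policy
        _counters[(key, window_index)] = count + 1
        return True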

Hope that helps someone!


> add a deterministic offset to each user/session window so that no large group of rate limits can expire at the same time

Did you mean non-deterministic (like jitter)?


GP meant deterministically add jitter.

Long ago I was responsible for implementing a “rate limiting algorithm”, but not for HTTP requests. It was for an ML pipeline, with human technicians in a lab preparing reports for doctors and, in dire cases, calling them directly. Well, my algorithm worked great: it reduced a lot of redundant work while preserving sensitivity to critical events. Except some of the most common and benign events had a rate limit of 1 per day.

So every midnight UTC, the rate limit quotas for all patients would “reset” as the time stamp rolled over. Suddenly the humans in the lab would be overwhelmed with a large amount of work in a very short time. But by the end of the shift, there would be hardly anything left to do.

Fortunately it was trivial to add a random but deterministic per patient offset (I hashed the patient id into a numeric offset).

That smoothly distributed the work throughout the day, to the relief of quite a few folks.


> Rate limits don't really protect against backend capacity issues

Yes and no, there's a little more nuance here. You're correct that the business signing up X new users, each with new rate ~limits~ allocations, does not in and of itself scale up your backend resources, i.e. it's not naively going to vertically scale a Postgres database you rely on. But having a hard rate limiter in front is like setting the "max" value on your autoscaler: it prevents autoscaling costs from skyrocketing out of control when the source of the traffic is malicious, the result of a bug, or otherwise "bad"; instead, a human is put in the loop to gauge whether the traffic is "good" and the rate limit should therefore be increased.

> Rate limits that will be lifted when someone complains might as well be advisory-only and not actually return a 429

How does one set an "advisory-only" rate limit that's not a 429? You can still return a 429 with a body containing directions on how to ask for a rate limit increase. I don't think of 4xx as meaning that the URL will never return anything other than a 4xx, rather that it will continue to return 4xx without human intervention. For example, if you're going to write blog.example.com/new-blog-entry, it's a 404 before you publish it, and then it returns a 200 once the blog post is published.
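
Something like this is what I have in mind, as a rough sketch (Flask only as an example; the allow_request() stub and the increase URL are made up, and the real limiter would be whatever you already run):

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    def allow_request(key):
        # Stand-in for your actual rate limiter.
        return False

    @app.route("/api/things")
    def things():
        if not allow_request(request.remote_addr):
            resp = jsonify({
                "error": "rate_limited",
                "detail": "Quota exceeded for the current window.",
                # Hypothetical URL; point at however your org handles increases.
                "request_increase": "https://example.com/request-rate-limit-increase",
            })
            resp.status_code = 429
            resp.headers["Retry-After"] = "60"
            return resp
        return jsonify({"ok": True})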


You could set it in a header (w3c baggage?) which is monitored by a dashboard, as an example.


What exactly do you mean by your first point?


I often see rate limits framed as a way to protect system capacity, but not enough follow-through with understanding why, even with appropriate rate limits, systems can fall over and require manual intervention to restore.

The easiest way to explain this is with a simple sequence of events: the database has a temporary issue; system capacity drops; clients start timing out or getting errors; load amplification kicks in with retries and request queueing; load is now higher than normal while capacity is lower than normal; devs work hard to get the database back in order; the database looks restored but now the system has 3x the load it did before the incident; other heroic efforts are needed to shed load and/or upscale capacity; whew, it's all working again! In the post-mortem there are lots of questions about why rate limiting didn't protect the system. Unfortunately, the rate limit values required to restore the saturated system are far too low for normal usage, and the values needed for normal operation are too high to prevent the system from getting saturated.

Fundamentally, there's really no way for a rate limiter (which only understands incoming load) to balance the equation load <= capacity. For that, we need a back-pressure mechanism like circuit breaking, concurrency limiting, or adaptive request queueing. Fortunately, rate limiting and back-pressure work well together and don't have to know about each other.
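
To make the distinction concrete, here's a toy back-pressure sketch (assuming an asyncio-style single-threaded service; the limit value is arbitrary): it bounds work in flight, which tracks actual capacity, whereas a rate limit only sees the incoming request rate.

    MAX_IN_FLIGHT = 50  # arbitrary; bound it by what the backend can absorb
    _in_flight = 0

    class Overloaded(Exception):
        pass

    async def handle(work):
        # Run `work` (an async callable, e.g. a database call) under a
        # concurrency limit, shedding load instead of queueing when saturated.
        global _in_flight
        if _in_flight >= MAX_IN_FLIGHT:
            raise Overloaded("too many requests in flight; caller should back off")
        _in_flight += 1
        try:
            return await work()
        finally:
            _in_flight -= 1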


A rate limit is most often some arbitrarily configured static value, e.g. each org can make 10 req/s. It's much harder to configure rate limits against some dynamic value like system load, so most go with the static-value approach.

Like the OP said, this doesn't protect you from going over your system capacity; you can still have 10 million orgs all requesting 10 req/s, which can take down your system while abiding by your rate limits.


Maybe that per-customer rate limits don't guarantee the whole backend won't go over capacity? Though I guess many APIs will have global rate limits as well for these cases.


> protect against backend capacity issues

That's our primary use case, so I am also curious to hear more.


Answered above!


Great advice!

Ideally, you can provide isolation between users on the same "tier" so that no one user can crowd out others.


One aspect of this type of problem I missed from the article is whether the data mutations were applied evenly across transaction time. Data sets like these tend to be very active for recent transactions, while the updates fall off quickly as the data ages. If that's the case, applying a single query caching solution may not be a good fit and may always suffer from major tuning/balance issues.

If the data is in fact updated with clear hot/warm/cold sets, caching the cold sets should be extremely effective, the warm set moderately effective, and it may not even be worth caching the hot set at all, given the complexity proposed. Additionally, you should be able to offload the cold sets to persistent blob storage, away from your main database, and bulk load them as needed.

Finally, it can be faster and simpler to keep track of deltas to cold sets (late mutations that happen to "invalidate" the previously immutable data), by simply storing those updates in a separate table, loading the cold set data, and applying the delta corrections in code as an overlay when queried. Cron jobs can read those deltas, and fold them back into the cold set aggregations, making clean validated cold set data again.
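
A rough sketch of that overlay approach, with hypothetical store/table names (blob_store.get_aggregate and the cold_set_deltas table aren't real, just illustrative):

    from collections import defaultdict

    def load_cold_aggregate(blob_store, period):
        # Frozen per-key totals for an old period, e.g. pulled from blob storage.
        # Hypothetical helper; returns {key: amount}.
        return blob_store.get_aggregate(period)

    def load_deltas(db, period):
        # Late mutations recorded in a small side table instead of rewriting
        # the cold aggregate. Hypothetical query; returns [(key, change), ...].
        return db.fetch_all(
            "SELECT key, amount_change FROM cold_set_deltas WHERE period = %s",
            (period,),
        )

    def query_period_totals(blob_store, db, period):
        totals = defaultdict(float, load_cold_aggregate(blob_store, period))
        for key, change in load_deltas(db, period):
            totals[key] += change  # apply corrections as an overlay at read time
        return dict(totals)

A cron job can then read the same deltas table and fold the corrections back into the stored aggregate, after which the deltas can be cleared.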

Great article, BTW! There are entire database technologies and products dedicated to addressing these use cases, particularly as the data sets grow very large.


Notion has been a solid A- for shared to-dos, bookmark lists, recipes, trip planning details, etc, which covers most of our shared lives. The main issue with Notion is that mobile syncing can be very slow when we're out in the world, so using it to distribute a shopping list while we're at a grocery store, for example, has been unreliable.

For actual scheduled events, we just share our personal Google calendars with each other and invite each other to mutual events. No issues there.

The largest technical issue currently is just organizing personal media in a reasonable way. We're in the Apple ecosystem, so shared albums have generally worked for photos and videos, but it's clunky at best and doesn't cover other media or documents.


I recommend Russell Ackoff's writings as somewhat related and more to do with how systems of people and processes work (or don't). Here's a great place to start: https://thesystemsthinker.com/a-lifetime-of-systems-thinking...

