Hacker News | uji's comments

AWS/Amazon might be great for customers, but it's a horrible place to work. Having worked at AWS for 3 years, I found almost all services half-baked, with tech debt in every part of the code. But hey, customers never see any issues? That's because there is an army of on-callers manually running commands and fixing issues.

I used to work on one of the DB services and we used to get 20+ pages (sev2) every day. Due to the insane number of pages every day, we had daily on-call rotations.


High Level Features:

1. VMs are dual-stack.

2. Subnets can have either GUA or ULA addressing along with IPv4.

3. Subnets are /64 and VMs get a /96 address range.

4. ULA addressing is useful for intra-VPC traffic.

5. GUA addressing is useful if an internet-facing connection is required.

6. VMs can have multiple IPv6 NICs.
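
For a sense of scale behind item 3, here is a minimal sketch using Python's standard `ipaddress` module. The ULA prefix `fd20:0:0:1::/64` is a made-up example, not a range assigned by any provider, and picking the first /96 is only for illustration.

```python
import ipaddress

# Hypothetical ULA subnet (ULA space is fd00::/8); the exact prefix is an assumption.
subnet = ipaddress.ip_network("fd20:0:0:1::/64")

# A /64 splits into 2**(96 - 64) = 2**32 possible /96 ranges.
ranges_per_subnet = 2 ** (96 - subnet.prefixlen)

# Take the first /96 as an example of what a single VM might be handed.
vm_range = next(subnet.subnets(new_prefix=96))

print(ranges_per_subnet)           # 4294967296
print(vm_range)                    # fd20:0:0:1::/96
print(vm_range.num_addresses)      # 4294967296
print(vm_range.subnet_of(subnet))  # True
```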


Looks like pretty good news. I have worked at AWS before, and AWS is very famous for making money from open source products without contributing upstream.

One very good example is Amazon's Redis offering. Amazon figured out that Redis asynchronous replication didn't work at scale, so instead of fixing the issues upstream they chose to develop their own Redis in house and monetize it.

https://aws.amazon.com/memorydb/

"Enhanced version" means patches made by AWS. https://aws.amazon.com/elasticache/redis-details/


Disclaimer: ex-AWS as well, worked very closely with MemoryDB, but not paid to shill!

You're entitled to your opinion, but your line of reasoning for how MemoryDB for Redis came to exist, and for why it isn't in upstream Redis, is not factual. MemoryDB's architecture uses Amazon's homegrown log replication services, as pointed out by Werner Vogels in his blog post about MemoryDB[0]. This architecture is fundamentally incompatible with upstream Redis. The real reason why MemoryDB for Redis exists is far less juicy: MemoryDB came about by meeting customers where they were. Customers love Redis, especially for its non-caching features, but replication is a headache with all the existing solutions today.

Also as far as I know one of the lead committers to Redis is from AWS.

0 - https://www.allthingsdistributed.com/2021/11/amazon-memorydb...


“meeting customers where they were”

I’m a total AWS fanboy, but even I cringe when I read that superficial, customer-centric sound bite. You know who meets people where they are? FOSS maintainers.

“Also as far as I know one of the lead committers to Redis is from AWS”

Conveniently cryptic. What does "from AWS" mean? Do they still work there? Did they specialize in log replication?


Disclosure: I work for AWS, but I don't work on Redis or managed services based on Redis.

Madelyn Olson, at present an AWS employee, is one of the members of the Redis core team [1]. The invitation was extended because she had "been actively involved in Redis development for several years, contributing numerous changes throughout Redis, including bug fixes and features."

[1] https://redis.com/blog/redis-core-team-update/


Aren't FOSS maintainers famously "my way or the highway"?


Incompatible or not isn't the point. Not releasing is what it's about. And yes, they can. But that doesn't make it right.


What's the point of releasing a chunk of code that relies on internal AWS infra?


I think it’s the fact that the implementation details don’t change the optics; AWS had the resources and talent to both monetize AND be a Good Samaritan by contributing upstream. The contributions could have been RFCs, GitHub issues, etc. With that said, I can imagine the rumored culture of AWS engineering doesn’t support the wandering souls who typically nurture thoughts of community service for the upstream.


Disclosure: I work for AWS.

There were contributions sent upstream for Elasticsearch bugfixes and enhancements. Some of those PRs are still open, for example [1].

A sampling of additional PRs can be found in this blog post [2].

Further contributions upstream to Apache Lucene have been growing over the years. The new Approximate Nearest Neighbor support in Elasticsearch 8.0 comes from work that was sponsored by Amazon in upstream Apache Lucene [3].

[1] https://github.com/elastic/elasticsearch/pull/64513

[2] https://aws.amazon.com/blogs/opensource/stepping-up-for-a-tr...

[3] https://issues.apache.org/jira/browse/LUCENE-9004


Well isn’t it convenient that it requires AWS infra. And there is absolutely no way they could have designed it differently.


AWS had proprietary infrastructure that could solve the problem at hand; it's not reasonable to expect them to invest an incredible amount of effort to solve it again. AWS customers do not net-benefit from that duplicative effort.


So... Redis had a choice of licenses: BSD (improvements do not need to be contributed back) or GPL (improvements do need to be contributed back).

For whatever reason they chose BSD. And now Amazon made some improvements and is not contributing back.

Not sure why anyone is surprised.


GPL wouldn't prevent Amazon from forking and then providing managed instances of the project without releasing the source code of the fork. AGPL would, though.

But yes, this is the reason I won't work for free on my own or others' BSD or MIT licensed projects.


BSD and MIT don't grant patent rights; this might be one reason Facebook uses them, though they did try to get predatory in the only way Facebook can: https://news.ycombinator.com/item?id=14779881

ianal, but imo, Apache License v2, Mozilla Public License v2, and xGPLs v3 are better at protecting the rights of the consumers (including contributors).


Sometimes fixing upstream is hard / not possible, when maintainers don't want to accept others' proposals/vision, or are cautious to change architecture with breaking changes.


So they don't release the result as OSS because upstream wouldn't have included it?


I don't know much about this particular case. However generally speaking that is not true. They would just need to make the fork public. Whether the maintainer accepted the changes upstream is irrelevant.


Unless they actually tried to upstream it, this point doesn't really matter. From what I can see, they never actually tried or intended to upstream their changes, so this is irrelevant. They didn't even release the fork's source, let alone try to upstream it.


Ahh, the new Oracle.


No, not everything negative is the new Oracle. AWS has very little in common with how Oracle has operated historically.

Oracle didn't build their company in the style of AWS Redis, that cloning maneuver. Oracle's database was a pioneer. Oracle didn't get where they are by cloning open source and claiming it as their own. Despite the numerous bad things that can be said about Oracle's culture, that's not one of the key negatives about Oracle.


Agreed. Not an Oracle fanboy, but they didn't deserve to be brought up in this context.


They killed and re-proprietarized OpenSolaris.


Had OpenSolaris developed a sufficient community outside of Sun/Oracle, this would have been much less of an issue. Which is why community is a big deal. A large company can always decide that some project is no longer interesting and, if they're the only ones supporting it, it's going to die. They certainly have no obligation to support it.


At this point in AWS' life, enterprise sales is king. Not surprising that there are shades of Oracle / Microsoft in them. Maybe Google hired Oracle's #2, Thomas Kurian, to head GCP for similar reasons. Like it or not, an Oracle-sized shadow looms large over BigCloud.


It surprises me when people lump Oracle (or AWS for that matter) in with the likes of Microsoft.


The process to make MS-OOXML an ISO standard was a dirty one https://en.wikipedia.org/wiki/Standardization_of_Office_Open... and 2008 wasn't that long ago


Ballmer era Microsoft is a very different beast to Nadella era Microsoft


I don't see Microsoft jumping to fix all the interop issues their software has with ODF.


Why? They're all enterprise software companies. They participate in open source to greater or lesser degrees. They seem absolutely part of the same category. This isn't either positive or negative commentary but just observation of what they do as businesses.


In that Microsoft is way worse or way better? I’m not clear, because they’re all pretty scummy.


You are right, Microsoft could be a more difficult entity to deal with.


A side question: is there anything licensed AGPL that AWS uses? If so, does AWS release the code, or has it found a way around the license?


Yes, but AFAIK it isn’t modified.



Still waiting in my region (they only support asia-east1, asia-south1, europe-west2, us-west2). Hopefully some year.


Yes, you can use /64 subnets to define your broader-level firewall rules.
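
As a rough illustration of matching on a whole /64 instead of on individual VM addresses, here is a small sketch. The prefix (taken from the 2001:db8::/32 documentation range), the sample addresses, and the `allowed` helper are all hypothetical; this is not any cloud provider's firewall API.

```python
import ipaddress

# Hypothetical /64 subnet used as the source range of one broad allow rule.
ALLOWED_SOURCE = ipaddress.ip_network("2001:db8:0:1::/64")

def allowed(source: str) -> bool:
    """Return True if the source address falls inside the allowed /64."""
    return ipaddress.ip_address(source) in ALLOWED_SOURCE

print(allowed("2001:db8:0:1::a00:5"))  # True: inside the /64
print(allowed("2001:db8:0:2::1"))      # False: a different /64
```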


This is very true. It doesn't always lead to a PIP, but the whole Amazonian culture makes it difficult for the person to stay on the team or at the company.

Writing a COE is kind of an admission of guilt, and I have definitely seen promotions get delayed. During perf review, managers of other teams often raise a COE as a point against the person going for promotion.


Have worked at AWS before, and I can attest to this. Whenever we had an outage, our director and senior manager would take a call on whether to update the dashboard or not.

Having a 'red' dashboard catches a lot of eyes, so the people responsible for making this decision always look at it from a political point of view.

As dev on-call, we used to get 20 sev2s per day (a sev2 is an on-call ticket that needs to be handled within 15 minutes), so most of the time things are broken; it's just that it's not visible to external customers through the dashboard.


Wow. If I were in charge, the team running a service should not be the same team who decides whether a given service is healthy. This is pretty damaging info about the unprofessional way AWS actually appears to be run.


It's funny that you point to that as the problem. The problem is more AWS' toxic engineering culture that has engineers fearing for their jobs in a way that guides their decision making. It's bad company culture, end of story.


AWS is big. Amazon is even bigger. Disgruntled people are the ones who often cry the loudest. Just because there may be teams who act like this, doesn't mean that is the case in general.

You don't hear a lot of people praising AWS, the same way you don't hear a lot of people saying how great it is to have an iPhone. If I am happy, I have little incentive to post about it, since that should be the default state.

But the fact of the matter is simple. If you end up on a team like this, switch teams and raise complaints afterwards. Nothing stops you from doing so. There is no "toxic engineering culture" at AWS. The problem is that AWS makes you into an owner and that includes owning your career. That means if you feel something is wrong, YOU are expected to act. No one will do it for you. And there are plenty of mechanisms for you to act.

This is the greatest benefit of working at Amazon, but it's also the downfall of people who are not able to own things.


> The problem is that AWS makes you into an owner and that includes owning your career.

Firing me for correctly telling customers that their services are down is not my idea of making me an owner.


You're the owner of aspects like responsibility and risk but not the owner of aspects related to financial growth (I mean, your stock options are, but that's about it).


Doing what you think is right is not necessarily the right thing to do. This is why there is also "Disagree and Commit". There are many facets to this and I am 100% sure that you did not get fired for >correctly< telling customers... You could potentially get fired for incorrectly telling them though, if the issue was severe enough.


That sounds toxic.


>AWS makes you into an owner and that includes owning your career.

This sort of corporate jargon does not exactly instill confidence. I think I'm more concerned about Amazon's engineering culture now than I was before.


I empathize with the poster. Imagine being paid less than someone who works half as hard at another company, but more than your coworkers, to say cringe stuff like that.


"You don't hear a lot of people praising AWS"

You definitely hear a lot of people praising AWS.


This is 100% wrong, and only seeks to derail the conversation. It's a toxic way to think, and it sets off a lot of red flags for me, essentially ruining their credibility.

> Disgruntled people are the ones who often cry the loudest. Just because there may be teams who act like this, doesn't mean that is the case in general.

is right up there with "we don't know it wasn't aliens".


There are plenty of ways a work culture can make you utterly miserable yet you can't do anything about it. Perhaps you aren't confident enough, or things haven't yet reached the 'tipping point', or other options just aren't available to you for political reasons, lack of openings on other teams, lack of skills...

I think it's bigger than just "it's your problem, you own it". There are factors beyond your control.


As a customer I don't really care whether AWS has a toxic internal culture. I care about whether they have operational excellence and a high quality product. This information is showing cracks in operational excellence.


Guess what - most cloud providers are like that. My personal experience is with GCP, where stuff can be majorly on fire with no status update for hours. Cloud SLOs are lies, like a lot of other things there.


My company will update its status but puts the most vague responses up. The reason is that we don’t want to appear inept when we crash the website, for example because we ran out of disk space.

Our competitors would have a field day with that.


I think this is pretty typical, as often outsiders don't have the visibility into the issue to determine whether there's an issue.


Showing red on the EC2 or S3 dashboards literally requires approval from ajassy himself, IIRC.

The status page is entirely manually updated.


Flipping anything to red entails significant legal and business complications. For starters, you are basically admitting that customers deserve a refund for services not provided. I'm not surprised that execs must be involved in that decision. You don't want a random developer making a decision that could incur millions of dollars in potential losses when there are other strictly non-technical factors to consider.


All I see in your response is, "We don't want to tell the truth because it might cost us money."

Maybe if it started costing the company actual money, it might make the investments necessary to ensure it doesn't go down in the first place.


The point is more like "we'd better be sure of the scale of the issue before it is communicated publicly, and low-level devs on individual teams do not have that 10,000-foot view of the system".

You have all the power you need to make the company change its behavior. Vote with your dollar and move to a different platform. I'm sure you have recommendations to share.


Oh, what a pipe dream. If only capitalism worked how it was described in textbooks. It turns out there are much easier, lower-cost optimizations businesses can perform based on managing perception rather than worrying about pesky concepts like utility.


You raise an interesting point. Where I work, most of our public status dashboards update to yellow or red automatically, with only a few failure conditions requiring a manual update. It’s always made me wonder whether we’ll ever get around to implementing capitalism with some manual-update-only dashboards.


Given enough lawsuits and mistakes by devs flipping dashboards to red with a bad code change or a network provider outage, your org will have a manual public-facing dashboard as well.


Of course, and such a thing would never take place in a system of economics where there are no consequences for taking accountability for failures. Because I’m sure such a system exists. Right?


I never considered Amazon capitalistic given their exploitation of the USPS.

I considered them a private company subsidized by taxes.


This might be an oversimplification.

With any customer that has SLAs written into their contracts, they're not just going off your status page. They most likely have a direct point of contact and exact reporting will be done in the postmortem.

The status page is for customers for which there aren't significant legal or business complications and exists to provide transparency. In my opinion you do want "random" people at your company to be able to update it in order to provide very stressed out customers with the best information you have.

As an industry we should probably recognize this more explicitly and have more standard status pages that say something like "everything might be broken but we're not sure yet".


Status pages are generally so unreliable that we do our own monitoring of external cloud resources that we depend on.
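
A minimal sketch of what such in-house monitoring could look like, assuming each dependency is wrapped in a small probe function that raises on failure. The `probe_all` helper, the probe names, and the stand-in probes are hypothetical; real probes would make a cheap request against the dependency.

```python
import time
from typing import Callable, Dict

def probe_all(probes: Dict[str, Callable[[], None]]) -> Dict[str, str]:
    """Run each probe; a probe signals failure by raising an exception."""
    results = {}
    for name, probe in probes.items():
        started = time.monotonic()
        try:
            probe()
            results[name] = f"ok ({time.monotonic() - started:.2f}s)"
        except Exception as exc:
            results[name] = f"FAILED: {exc}"
    return results

def healthy() -> None:
    pass  # stand-in for e.g. a small read against a storage bucket

def broken() -> None:
    raise TimeoutError("no response within 5s")  # stand-in for a failing dependency

status = probe_all({"object-store": healthy, "queue": broken})
print(status["queue"])  # FAILED: no response within 5s
```

Run on a schedule from outside the provider, this reports what your own workloads can actually reach, independent of what the vendor's status page says.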


So... then what's the point of a status dashboard?


> So... then what's the point of a status dashboard?

Exactly. Apparently it's just a marketing tool if you believe parent comments...


Wow, so much for their "leadership principles", among them "customer obsession" and "earning trust". From what I see, this doesn't accomplish either :|


I’ve got another good FAANG principle joke:

“Don’t be evil”

buys DoubleClick


Ex-AWS employee here. I worked on one of the managed services (can't say which one because it might reveal my identity). In managed services, instances are created in an AWS VPC, so employees have access to the underlying VMs.

We have used that capability to get stats about the type of customer workloads, and devised feature products based on that.


whoa...


Having worked at Amazon for 3 years, I don't find it surprising. Amazon, in general, is known for not taking good care of its employees. My experience has been as an SWE, and it's a well-known fact there that if you and your manager don't get along, you have limited time to move to a different team or company before you get PIPed.


Taking good care vs inhumane. This is just inhumane.


That’s just a severity.

People tend to be consistent in their behavior.


A better comparison would be between Azure and GCP, both of which provide an office suite (SaaS) to customers.

Google recently released its side of the numbers (https://cloud.google.com/blog/topics/inside-google-cloud/how...)


From the link, Google has seen the following for Meet usage:

“Over the last few weeks, Meet’s day-over-day growth surpassed 60%, and as a result, its daily usage is more than 25 times what it was in January. Despite this growth, the demand has been well within the bounds of our network’s ability.”


AWS also provides an office suite to customers: WorkMail, WorkDocs, Chime messaging, etc.


Are they even in the same league as O365 and G-suite, usage wise?

Only asking because I'd never heard of these services before


Is G-suite even in the same league as O365? I have no numbers and might be incredibly biased by location, but I feel that all the "boring" office companies in Europe have moved towards Office 365 as a natural step.

That is of course not true and I'm sure there are plenty of G-suite companies but for the large ones it seems very common that they have just followed Microsoft into the cloud.

