How will you ever get the network effects needed to get sustained users with a commercial tool?
Given that Git was created because BitKeeper, a commercial tool, revoked the kernel developers’ permission to use it, aren’t we ignoring a lesson there?
Uhh, to be fair, if the goal was only to recreate git from 2005, it probably wouldn't cost $17M. I'd hazard a guess that they're recreating modern git and the emergent stuff like issues, PRs, projects, etc. I've also heard that the core devs for git are essentially paid a salary to maintain git.
Literally true if it's that one guy you're talking about.
Also, you should hear Linus talk about building git himself: what he built wasn't what you know as git today. It didn't even have commands like git pull, git commit, etc. until he handed development over.
Thinking about it for a bit, it comes down to what "what comes after Git" means, and what "Git" means there.
To build a better tool than git would probably take a few months for a tiny team of good developers, just thinking about the problem and building what is needed... So either free time or a few hundred thousand at most.
On the other hand, to replace GitHub, endless millions will be spent... For some sort of probable gains? It might even make money in the long run... But the goal is probably to flip it.
No he didn’t. He built a proof of concept demo in 7 days then handed it off to other maintainers to code for real. I’m not sure why this myth keeps getting repeated. Linus himself clarifies this in every interview about git.
His main contributions were his ideas.
1) The distributed model, that doesn’t need to dial the internet.
2) The core data structures. For instance, how git stores full snapshots of the changed files in a commit. Other tools stored diffs, which made rewinding, branch switching, and diffing super slow.
Those two ideas are important and influenced git deeply, but he didn’t code the thing, and definitely not in 7 days!
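To make the snapshot idea above concrete, here is a minimal Python sketch of content-addressed snapshot storage. The blob ID follows the scheme Git uses for blobs in SHA-1 repositories (a hash over a `blob <size>\0` header plus the content). Everything else is an assumption for illustration: `SnapshotStore`, `commit`, and `checkout` are made-up names, and real Git adds trees, commit objects, packfiles, and delta compression on top.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Object ID the way Git computes it for a blob in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class SnapshotStore:
    """Toy store (hypothetical, not Git's real layout): each commit maps
    every path to the ID of its full content, not to a diff."""

    def __init__(self) -> None:
        self.blobs = {}     # blob ID -> file content
        self.commits = []   # one {path: blob ID} mapping per commit

    def commit(self, files: dict) -> int:
        snapshot = {}
        for path, content in files.items():
            oid = blob_id(content)
            self.blobs[oid] = content  # identical content reuses the same ID
            snapshot[path] = oid
        self.commits.append(snapshot)
        return len(self.commits) - 1

    def checkout(self, commit_no: int) -> dict:
        # Direct lookup per file: no chain of diffs to replay.
        snapshot = self.commits[commit_no]
        return {path: self.blobs[oid] for path, oid in snapshot.items()}


store = SnapshotStore()
first = store.commit({"a.txt": b"one\n", "b.txt": b"same\n"})
second = store.commit({"a.txt": b"two\n", "b.txt": b"same\n"})
```

In this sketch, switching between `first` and `second` is a dictionary lookup per file, and the unchanged `b.txt` is stored only once because both commits point at the same blob ID.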
Those were not his ideas. Before Git, the Linux kernel team was using BitKeeper for DVCS (and other DVCS implementations like Monotone and Darcs existed as well). Git was created as a BitKeeper replacement after a fight erupted between Andrew Tridgell (who was accused of trying to reverse engineer BitKeeper in violation of its license) and Larry McVoy (the author of BitKeeper).
He did what needed to be done. Linux similarly has thousands of contributors and Linus's personal "code contribution" is almost negligible these days. But code doesn't matter. Literally anyone can generate thousands of lines of code that will flip bits all day long. What matters is some combination of the following: a vision, respect from peers earned with technical brilliance, audaciousness, tenacity, energy, dedication etc. This is what makes Linus special. Not his ability to bash on a keyboard all day long.
Specifically, VHS had both longer recording times and cheaper VCRs (due to Matsushita’s liberal licensing) than Betamax did. Beta only had slightly better picture quality if you were willing to sacrifice recording length per tape. Most Betamax users adopted the βII format which lowered picture quality to VHS levels in order to squeeze more recording time onto the tape. At that point Betamax’s only advantage was a slightly more compact cassette.
Also to correct another common myth, porn was widely available on both formats and was not the cause of VHS’s success over Betamax.
It depends which definition of "better" you use. VHS won the adoption race, so it was better there. While Betamax may have been technologically superior, in hindsight we can say it apparently failed to address other key aspects of the technology adoption lifecycle.
So perhaps this is a regression specifically in the arm64 code, or said differently maybe it’s a performance bug that has been there for a long time but covered up by the scheduler part that was removed?
Turns out the amd machine had huge pages enabled, and after disabling those the regression was there on amd too. So arm vs amd was a red herring.
Not a nice regression, of course, but you should not run PostgreSQL on large servers without huge pages enabled, so this regression will only hurt people who have a bad configuration. That said, I think these bad configurations are common out there, especially in containerized environments where whoever runs PostgreSQL may not have the ability to enable huge pages.
That should be obvious to anyone who read the initial message. The regression was caused by a configuration change that changed the default from PREEMPT_NONE to PREEMPT_LAZY. If you don’t know what those options do, use the source. (<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...>)
Yes, I had a good laugh at that. It might technically be a regression, but not one that most people will see in practice. Pretty weird that someone at Amazon is bothering to run those tests without hugepages.
I doubt they explicitly said "I'll run without huge pages, which is an important AWS configuration". They probably just forgot a step. And "someone at Amazon" describes a lot of people; multiply your mental probability tables accordingly.
The number of people at Amazon is pretty much irrelevant; the org is going to ensure that someone is keeping an eye on kernel performance, but also that the work isn’t duplicative.
Surely they would be testing the configuration(s) that they use in production? They’re not running RDS without hugepages turned on, right?
> The number of people at Amazon is pretty much irrelevant; the org is going to ensure that someone is keeping an eye on kernel performance, but also that the work isn’t duplicative.
I'd guess they have dozens of people across, say, a Linux kernel team, a Graviton hardware integration team, an EC2 team, and an Amazon RDS for PostgreSQL team who might at one point or another run a benchmark like this. They probably coordinate to an extent, but not so much that only one person would ever run this test. So yes, it is duplicative. And yes, they likely intend to test the configurations they use in production, but people just make mistakes.
True; to err is human. But it is weird that they didn’t just fire up a standard RDS instance of one or more sizes and test those. After all, it’s already automated; two clicks on the website get you a standard configuration and a couple more get you a 96c graviton cpu. I just wonder how the mistake happened.
No… I’m assuming that they didn’t use the same automation that creates RDS clusters for actual customers. No doubt that automation configures the EC2 nodes sanely, with hugepages turned on. Leaving them turned off in this benchmark could have been accidental, but some accident of that kind was bound to happen as soon as the tests use any kind of setup that is different from what customers actually get.
You're again assuming that having huge pages turned on always brings a net benefit, which it doesn't. I have at least one example where it didn't bring any observable benefit, while at the same time it incurred extra code complexity and server administration overhead, and necessitated extra documentation.
It is a system-wide toggle in the sense that it requires you to first enable huge pages, and then set them up, even if you just want to use explicit huge pages from within your own code (madvise, mmap). I wasn't talking about THP.
When you deploy software all around the globe, and not only on servers that you fully control, this becomes problematic. Even in the latter case it is frowned upon by admins/teams if you can't prove the benefit.
Yes, there are workloads where huge pages do not bring any measurable benefit; I don't understand why that would be questionable. Even if they don't bring runtime performance down, which they could, the extra work and complexity they incur are in a sense not optimal compared to the baseline of not using huge pages.
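As a rough illustration of the "enable, then set up" step discussed above, here is a small Python sketch that reads the huge page fields from `/proc/meminfo`-style text and estimates how many pages a memory region would need. The field names (`HugePages_Total`, `HugePages_Free`, `Hugepagesize`) are the ones Linux reports; the sample text, the 8 GiB region, and the helper names are assumptions made up for the example. The result is only a lower bound for what an admin would reserve (typically through the `vm.nr_hugepages` sysctl), since a real service usually needs some headroom.

```python
import re


def parse_meminfo(text: str) -> dict:
    """Pull the huge page fields out of /proc/meminfo-style text."""
    fields = {}
    for name in ("HugePages_Total", "HugePages_Free", "Hugepagesize"):
        match = re.search(rf"^{name}:\s+(\d+)", text, re.MULTILINE)
        if match:
            fields[name] = int(match.group(1))
    return fields


def pages_needed(region_bytes: int, hugepage_kb: int) -> int:
    """Huge pages required to back a region, rounded up (integer ceiling)."""
    page_bytes = hugepage_kb * 1024
    return -(-region_bytes // page_bytes)


# Sample text, made up for illustration; on Linux, read /proc/meminfo instead.
SAMPLE = """\
MemTotal:       65536000 kB
HugePages_Total:       0
HugePages_Free:        0
Hugepagesize:       2048 kB
"""

info = parse_meminfo(SAMPLE)
needed = pages_needed(8 * 1024**3, info["Hugepagesize"])  # hypothetical 8 GiB region
print(f"reserved: {info['HugePages_Total']}, needed for 8 GiB: {needed}")
```

With zero pages reserved, as in the sample, an explicit huge page mapping has nothing to draw from, which is the administrative dependency the comment is pointing at.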
The government doesn’t care? They’re a minority of the market? The vast majority of their computers didn’t have slots to put Nvidia GPUs in, and now none of them do?
An internal PCIe slot can be had in up to x16 at PCIe 5.0, whereas Thunderbolt 5 maxes out at x4 of PCIe 4.0.
Plus you have another Thunderbolt controller in between the CPU and the hardware, and it takes more energy to push that many bits 1m over a cable vs a few dozen cm over traces.
Also Thunderbolt is trivially disconnected, which in many critical workflows is not a positive, but an opportunity for ill-timed interruptions. Plus I don't have to buy a fucking dongle/dock for a real goddamn slot, make room for external power supplies, etc.
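For a sense of scale, the slot-versus-Thunderbolt gap mentioned above can be worked out with a few lines of Python. This is back-of-the-envelope arithmetic, not a benchmark: it uses the per-lane signalling rates for PCIe 4.0 and 5.0 (16 and 32 GT/s) with 128b/130b encoding, and ignores packet and protocol overhead as well as any extra cost of tunnelling through a Thunderbolt controller. The function name is made up for the example.

```python
# Raw per-lane signalling rate in GT/s; PCIe 3.0 through 5.0 use 128b/130b encoding.
GT_PER_S = {"4.0": 16, "5.0": 32}
ENCODING = 128 / 130


def pcie_gb_per_s(gen: str, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


slot = pcie_gb_per_s("5.0", 16)   # internal x16 Gen 5 slot
tunnel = pcie_gb_per_s("4.0", 4)  # x4 Gen 4, the figure cited for Thunderbolt 5
print(f"x16 Gen 5: {slot:.1f} GB/s, x4 Gen 4: {tunnel:.1f} GB/s, "
      f"ratio: {slot / tunnel:.0f}x")
```

Under these assumptions the internal slot has about eight times the raw bandwidth: twice the per-lane rate and four times the lanes.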