"When you have a clock you always know what time it is. When you have two you are never quite certain." -Mark Twain
I've always wanted one of those watches that syncs itself to the time signals broadcast by WWV, and I have a dream that one day all the clocks and watches in my house will show the same time. But I've resigned myself to the fact that it probably won't happen.
I used to think this quote was annoying and stupid: when you have a single clock, you trick yourself into thinking that the time is "correct", but when you have two you can at least be confident to within a margin of error. This is why International Atomic Time is derived by averaging many atomic clocks, and why NTP gets time from multiple servers.
But on second thought, it can also be interpreted like this: in a large synchronous system, it's often important to have a standard reference clock and to lock all the other clocks onto that main clock, otherwise the system can become inconsistent due to timing problems, e.g. distributing a frequency reference in a lab, distributing the clock signal on a circuit board, or distributing standard wall-clock time via NTP (the Stratum 1/2/3 hierarchy). So the quote makes perfect sense again.
Not quite the same, but I picked up a cheap GPS module with PPS output and an external antenna from AliExpress and hooked it up to a Raspberry Pi 3 running chrony.
Currently it's just set as the time server via DHCP for my computers, but I'm making a nixie clock which will use an ESP8266 running SNTP.
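For reference, SNTP is simple enough that even a minimal client fits in a few lines. Here is a hedged sketch in Python (the ESP8266 build would use its SDK's SNTP support instead); pool.ntp.org is just a placeholder server and the function name is mine:

```python
import socket
import struct
import time

NTP_UNIX_DELTA = 2208988800  # seconds between the NTP epoch (1900) and the Unix epoch (1970)

def sntp_time(server="pool.ntp.org", timeout=2.0):
    """Query an NTP server once (SNTP-style) and return Unix time as a float."""
    # First byte: LI=0, VN=4, Mode=3 (client) -> 0x23; rest of the 48-byte packet is zero.
    packet = b"\x23" + 47 * b"\x00"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 123))
        data, _ = sock.recvfrom(48)
    # Transmit Timestamp lives in bytes 40..47: 32-bit seconds + 32-bit fraction.
    secs, frac = struct.unpack("!II", data[40:48])
    return secs - NTP_UNIX_DELTA + frac / 2**32

if __name__ == "__main__":
    print("server time:", sntp_time(), "local time:", time.time())
```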
Almost ten years ago I recycled/converted a Topfield to show the time using an Arduino-like board and a network adapter. Basically, it listens for NTP broadcast messages and "keeps" time on its own in between.
Its timekeeping degrades if it doesn't receive the messages periodically, but in practice it's been quite fine, except once when the Ethernet board somehow broke down; I fixed it by replacing the board.
Sadly, it won't happen, because syncing to WWV has been superseded by NTP, Bluetooth syncing, and GPS-based smartwatches. It almost stopped being a possibility when the Trump administration proposed defunding parts of NIST, but fortunately that did not come to pass.
My Garmin watch syncs to GPS satellites every time I start an activity. My cell phone (which many of my friends and family use as a replacement for a watch) syncs to the cell towers. Even appliances are now run by microcontrollers with crystal oscillators whose frequency tolerance is on the order of 20 ppm, so their clocks drift very little. Mine are not 'smart appliances', but those are becoming pervasive, and they, too, sync via NTP.
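For a rough sense of scale, 20 ppm works out to under two seconds of worst-case drift per day (my arithmetic, not the parent's):

```python
# Back-of-the-envelope drift for a 20 ppm crystal oscillator.
tolerance = 20e-6            # 20 parts per million
seconds_per_day = 86_400

drift_per_day = tolerance * seconds_per_day
print(f"worst-case drift: {drift_per_day:.2f} s/day, "
      f"{drift_per_day * 30:.0f} s/month")
# worst-case drift: 1.73 s/day, 52 s/month
```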
Ultimately, though, all those NTP and GPS clocks have their master at NIST in Boulder, Colorado - the same clock that WWV, broadcasting from Fort Collins, 50 miles north, synchronizes against.
I would assume that all five give slightly differing estimates of the current time [a, b, c, d, e], with exact duplicates being rare and treated no differently than values that are merely close. You can either choose 'c' (the median) to ignore the outliers, or average [b, c, d], or take some weighted average of them. Maybe the weighted average of the middle three is more stable than the single median.
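A minimal sketch of that selection, assuming the five estimates are plain Unix timestamps (the sample readings are made up):

```python
import statistics

def combine_estimates(estimates):
    """Combine clock readings by taking the median, or by dropping the extremes
    and averaging the rest (a trimmed mean)."""
    s = sorted(estimates)                    # [a, b, c, d, e]
    median = s[len(s) // 2]                  # plain median: ignores outliers entirely
    trimmed_mean = statistics.mean(s[1:-1])  # average of [b, c, d]
    return median, trimmed_mean

readings = [100.01, 100.02, 100.03, 100.05, 103.90]  # one obvious outlier
print(combine_estimates(readings))  # (100.03, 100.0333...)
```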
> The best master clock (BMC) algorithm performs a distributed selection of the best candidate clock. IEEE 1588-2008 uses a hierarchical selection algorithm based on the following clock properties, in the indicated order:
> Priority 1 – the user can assign a specific static-designed priority to each clock, preemptively defining a priority among them. Smaller numeric values indicate higher priority.
> Class – each clock is a member of a given class, each class getting its own priority.
> Accuracy – precision between clock and UTC, in nanoseconds (ns)
> Variance – variability of the clock
> Priority 2 – final-defined priority, defining backup order in case the other criteria were not sufficient. Smaller numeric values indicate higher priority.
> Unique identifier – MAC address-based selection is used as a tiebreaker when all other properties are equal.
Only useful if you trust the data: Class, Accuracy, and Variance (the latter seems to be a self-measured estimate).
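The selection itself boils down to a lexicographic comparison; a rough sketch of the ordering quoted above (the field names are mine, and this is an illustration, not a full IEEE 1588 implementation):

```python
from dataclasses import dataclass

@dataclass
class ClockCandidate:
    priority1: int      # user-assigned; smaller wins
    clock_class: int    # smaller wins
    accuracy: int       # enumerated accuracy; smaller wins
    variance: int       # clock variability; smaller wins
    priority2: int      # user-assigned backup priority; smaller wins
    identity: bytes     # MAC-derived identifier, used as the final tiebreaker

def best_master(candidates):
    """Pick the best clock by comparing properties in the BMC order."""
    return min(candidates, key=lambda c: (c.priority1, c.clock_class,
                                          c.accuracy, c.variance,
                                          c.priority2, c.identity))
```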
Was hoping it would reference the Three Stooges gag:
MOE: How long has that been in the soup, froghead?
CURLY: About uh--- [looks at the three watches on his wrists]
MOE: Hey! What’s the idea of the three watches?
CURLY: That’s the way I tell the time.
MOE: How do you tell the time?
CURLY: This one runs ten minutes slow every two hours. This runs twenty minutes fast every four hours. The one in the middle is broken and stopped at two o’clock.
MOE: Well, how do you tell the time?
CURLY: I take the ten minutes on this one and subtract it by the twenty minutes on that one. Then I divide it by the two in the middle.
MOE: Well, what time is it now?
[Curly grabs a clock from the inside of his jacket pocket]
What exactly is a "financial accounting database"? I'm sort of surprised that there's a market for a DB targeting a very specific workflow. I would have thought that if you need that volume of specific transactions (major bank, I guess?) you'd want a full RDBMS for things other than simply the transactions, too. Basically, that Oracle would be all over that.
I love the pursuit of performance, but who would actually need 1 million journal entries a second (from their home page)? Is that the bottleneck somewhere?
Apologies for the stupid questions, I just... have no idea where this thing (which seems pretty technically neat) is actually intended to be used.
"1,000,000 journal entries per second on consumer-grade hardware." Of course, you could just spend a few cents more on hardware per device to not get a rock-bottom clock.
The reason is that this is not, in fact, a substitute for the clock sync service (whether that's PTP, or something that's becoming popular like chrony).
It's complementary: a failure detector for clock faults. We're not trying to synchronize clocks, merely to detect when too many clocks in the cluster no longer agree.
TigerBeetle needs to detect when the network is partitioned in a way such that the clock sync service (PTP or NTP etc.) is not functioning while the TigerBeetle cluster still has a functioning majority on the other side of the partition.
No matter the clock sync service you use, you still need some way to verify that the clock sync is in fact happening, which is what this is.
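Not TigerBeetle's actual algorithm, but the idea can be sketched roughly like this: raise a fault when we cannot find a quorum of replicas whose clocks agree with ours within some tolerance (the function, the peer-time exchange, and the tolerance value here are all hypothetical):

```python
TOLERANCE_S = 1.0  # hypothetical maximum disagreement we will accept

def clock_fault(local_time, peer_times, quorum):
    """Return True if we cannot find a quorum of replicas whose clocks
    agree with ours within TOLERANCE_S (clock sync is presumed broken)."""
    agreeing = sum(1 for t in peer_times if abs(t - local_time) <= TOLERANCE_S)
    # Count ourselves as one vote.
    return (agreeing + 1) < quorum

# Example: 5-replica cluster, quorum of 3.
print(clock_fault(100.0, [100.2, 100.1, 250.0, 251.0], quorum=3))  # False: 3 clocks agree
print(clock_fault(100.0, [250.0, 251.0, 252.0, 249.5], quorum=3))  # True: only our own clock agrees
```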
It's a database with purpose-built primitives for recording financial transactions between accounts, particularly two-phase commit transactions between different payment systems run by different operators.
Over the past decade or so, some of our team saw these primitives being re-implemented over and over again, often with ad-hoc, duct-tape solutions that are not always safe or scalable.
At the same time, the payments world is rapidly changing. Things are moving to high volume, low value digital transactions, which means that operational cost becomes important.
We extracted the design for TigerBeetle out of a real-world payment switch, so that TigerBeetle can be transplanted into these kinds of systems very easily, and in a way that's very hard to get wrong because the interface maps one-to-one to the domain.
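Purely as an illustration of what domain-level primitives for two-phase transfers look like (a sketch with field and function names of my choosing, not TigerBeetle's actual schema or API):

```python
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    debits_posted: int = 0
    credits_posted: int = 0
    debits_pending: int = 0
    credits_pending: int = 0

def create_pending_transfer(debit: Account, credit: Account, amount: int):
    """Phase one: reserve the amount on both accounts without moving it."""
    debit.debits_pending += amount
    credit.credits_pending += amount

def post_pending_transfer(debit: Account, credit: Account, amount: int):
    """Phase two (success): turn the reservation into a posted transfer."""
    debit.debits_pending -= amount
    credit.credits_pending -= amount
    debit.debits_posted += amount
    credit.credits_posted += amount

def void_pending_transfer(debit: Account, credit: Account, amount: int):
    """Phase two (failure or timeout): release the reservation."""
    debit.debits_pending -= amount
    credit.credits_pending -= amount
```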
We also solve a storage fault model that's relatively new in the research literature, something that most traditional databases were never designed for, and we do this because this makes for a safer system of record, not to mention a better operating experience.
If anything, we place the pursuit of safety much higher than the pursuit of performance. We work much harder at the safety; the performance came pretty easily with the design, right from the beginning. So the safety of TigerBeetle is what we're most excited about. The performance is a nice-to-have to go with it.
These are hopefully "the things that don't change" when it comes to databases. Nobody wanted a less convenient, or more dangerous, or slower database.
To add a little backstory, TigerBeetle's clock fault failure detector described here is only for the financial domain of the state machine, not for the correctness of the consensus protocol. This has to do with financial regulation around auditing in some jurisdictions and not with total order in a distributed systems context.
We simply need a mechanism to know when PTP or NTP is broken so we can shut down as an extra safety mechanism, that’s all. Detecting when the clock sync service is broken (specifically, an unaligned partition so that the hierarchical clock sync doesn’t work, but the TigerBeetle cluster is still up and running) is in no way required by TigerBeetle for strict serializability, it’s pure defense-in-depth for the financial domain to avoid running into bad financial timestamps.
We tried to make clear in the talk itself that leader leases are “dangerous, something we would never recommend”. And on the home page, right up front (because it’s important to us) we have: “No stale reads”.
We also want a fault-tolerant clock with upper and lower bounds for timing out financial two-phase commit payments that need to be rolled back if another bank’s payments system fails. We can’t lock people’s liquidity for years on end because of a clock fault that we could have detected otherwise.
You can imagine TigerBeetle's state machine (not the consensus protocol itself) as processing two-phase payments that look a lot like a credit card transaction with a two-phase auth/accept flow, so you want the leader to timestamp roughly close enough to true time that the financial transaction either goes through or ultimately gets rolled back after roughly N seconds.
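In other words, each pending transfer carries a timestamp and a timeout, and a later, fault-checked cluster timestamp decides whether it must be rolled back. A hedged sketch in the spirit of the illustrative model above (again, not TigerBeetle's actual code):

```python
def expire_pending_transfers(pending_transfers, cluster_time_s, timeout_s=30):
    """Roll back two-phase transfers whose reservation has outlived its timeout.

    cluster_time_s is assumed to come from the fault-checked cluster clock,
    not from the raw local clock of whichever node happens to be leader."""
    still_pending, rolled_back = [], []
    for transfer in pending_transfers:
        if cluster_time_s - transfer["timestamp_s"] > timeout_s:
            rolled_back.append(transfer)   # void: release the reserved liquidity
        else:
            still_pending.append(transfer)
    return still_pending, rolled_back
```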
Hope that clarifies the post (and gets you excited about TigerBeetle’s safety)!
P.S. We’re launching a $20K consensus challenge in early September. All the early invite details are over here [1]. Hope to see you at the live event, where we’ll have back-to-back interviews with Brian Oki and James Cowling, who authored and revised the pioneering Viewstamped Replication consensus protocol respectively.
That's a fantastic deep dive from Jane Street on all of this. One of the best internal technical interviews I've ever heard.
This talk by Jon Moore is another great talk on time, causality and synchronization, especially where cluster membership is large and dynamic: https://youtu.be/YqNGbvFHoKM