Author here. Long time lurker, but made an account now.
Wow, I did not expect this. I'm really touched. I wrote this as a small utility for my own consumption because I was unsatisfied with the existing selection at the time, so I'm both surprised and delighted to learn that people are finding it useful. Although to be completely frank, I think this library is way too small and insignificant to deserve a spot on HN's front page, but it definitely made my day. So thank you kind stranger who posted it!
OP here. Thank you for creating mio! Your project clearly deserves the attention. I just found it and thought it belonged here. A lot of people seem to share that opinion :)
I am sure this won't be the last top HN post about one of your projects.
I like it, and I took a look at the GitHub repo. I think there is actually a missed opportunity here to make it a single-header library, since only four or five files seem to go into it.
This is definitely unfortunate, but in my defense I was not aware of Rust's mio (or anything related to Rust beyond its existence) at the time of writing and naming my library. I have no emotional investment in the name, so I'm open to suggestions should anyone take issue with it.
I also think Rust's mio is sufficiently behind-the-scenes that the only people who will encounter it will already be well aware of its purpose; typical Rust users will be two levels removed from mio, if they're using it at all. Namespace collisions matter more for end-user software than foundational libraries. :)
There is also Boost.Interprocess which, despite the name, provides a very general way to access shared memory; in particular, it provides shared memory allocators and a full reimplementation of the STL that can take advantage of them.
I've always wanted to try it, but it can't handle unexpected process failure (i.e. a crashed process will leave the memory in an unknown state), which is something I always end up needing.
It's a valid point, I think. I'd probably also trust something as established as Boost more than some random guy's lib on GitHub. However, I specifically wrote mio because I prefer not to use Boost, and from what I understand, many others don't either.
Creating a new memory mapping can be pretty expensive! On both Windows and Linux, it involves taking a process-wide reader-writer lock in exclusive mode (meaning you get to sit and wait behind page faults), doing a bunch of VMA tree manipulation work, doing various kinds of bookkeeping (hello, rmap!) and then, after you return to userspace, entering the kernel again in response to VM faults just to fill in a few pages by doing, inside the kernel, what amounts to a read(2) anyway!
Sure, if you use mmap, you get to look at the page cache pages directly instead of copying from them into some application-provided buffer, but most of the time, it's not worth the cost.
There are exceptions of course, but you should always default to conventional reads.
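To make "conventional reads" concrete, here is a minimal POSIX sketch of the pread(2) path; the names are illustrative and error handling is trimmed:

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <vector>

    // Read a whole file with pread(2): the kernel copies from the page
    // cache into our buffer, but we never touch the VMA tree, take no
    // mapping lock, and pay no mapping setup/teardown cost.
    std::vector<char> read_whole_file(const char* path) {
        int fd = open(path, O_RDONLY);
        struct stat st{};
        fstat(fd, &st);
        std::vector<char> buf(static_cast<size_t>(st.st_size));
        off_t off = 0;
        while (off < st.st_size) {
            ssize_t n = pread(fd, buf.data() + off, buf.size() - off, off);
            if (n <= 0) break; // real code: handle EINTR and errors
            off += n;
        }
        close(fd);
        return buf;
    }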
Another problem with mmap is that there is no good way to handle I/O errors.
The application will get a signal (SIGBUS, on Linux at least), and no information about what the problem could possibly be. Most applications do not catch this signal and will instead just terminate.
Even if you do catch the signal, it is a real challenge to know what caused it and to keep consistent bookkeeping so that you can perform any sane action in response.
At a previous employer we started seeing this problem when scaling things in production which was no fun.
I wanted to help address this problem with my signal sharing API proposal. Unfortunately, the glibc people have the attitude that nobody should be using signals, and so they refuse to improve the signals API at all. I strongly disagree.
There's a reason for that: signals are horribly abused as a general-purpose signaling mechanism. Originally they were for signaling actual problems in the code (SIGSEGV, SIGILL), but later signals were added that have nothing to do with faults (SIGWINCH, I'm looking at you!).
I read the page you linked to, and just off the top of my head: if you try to manage paging by catching SIGSEGV, how do you determine whether it's in response to a real bug (say, dereferencing an undefined pointer)? In my opinion, by the time you get a SIGSEGV, you can't trust the program at all. While it might be nice to have a process handle page faults itself, I think a better API than signal() is required.
To distinguish the SIGSEGVs that represent crashes from ones representing faults you care about, you look at the fault address. It works perfectly well: every high performance Java or C# runtime on Linux (e.g., HotSpot, ART) does it. You can trust the program, because SIGSEGV delivery isn't magic.
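A minimal single-threaded sketch of that pattern on POSIX (the mapping bounds and the sigsetjmp-based recovery here are illustrative, not any particular runtime's implementation):

    #include <setjmp.h>
    #include <signal.h>
    #include <stddef.h>

    static const char* g_map_begin; // bounds of our mapping, set after mmap()
    static const char* g_map_end;
    static sigjmp_buf g_recover;    // real code would use thread-local state

    static void fault_handler(int sig, siginfo_t* info, void*) {
        const char* addr = static_cast<const char*>(info->si_addr);
        if (addr >= g_map_begin && addr < g_map_end) {
            // Fault inside our mapping (e.g. an I/O error surfacing as
            // SIGBUS): unwind to a known-good point instead of dying.
            siglongjmp(g_recover, 1);
        }
        // Anywhere else it's a genuine bug: restore the default disposition
        // and re-raise so the process crashes as it normally would.
        signal(sig, SIG_DFL);
        raise(sig);
    }

    void install_fault_handler() {
        struct sigaction sa{};
        sa.sa_sigaction = fault_handler;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGBUS, &sa, nullptr);
        sigaction(SIGSEGV, &sa, nullptr);
    }

    // Returns false if touching the mapped range faulted.
    bool copy_from_mapping(const char* src, char* dst, size_t n) {
        if (sigsetjmp(g_recover, 1) != 0) return false;
        for (size_t i = 0; i < n; ++i) dst[i] = src[i];
        return true;
    }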
As I detailed in the doc and on libc-alpha, you really do need some kind of synchronous exception mechanism to match how real hardware behaves, and it would behoove libc authors to make this mechanism not suck instead of pretending that synchronous faults would just go away.
Case in point, a sort-of general purpose IO library was using mmap as the OS interface, and presenting a read/write style API. We had various bugs related to this (e.g. extending or truncating a mmap'ed file is slightly tricky to do correctly etc.).
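For readers who haven't hit this: growing a mapped file portably is a three-step dance, roughly like the hypothetical sketch below (POSIX; Linux also offers mremap(2)). Getting a step out of order, or forgetting that the address can change, is where the bugs come from:

    #include <sys/mman.h>
    #include <unistd.h>

    char* grow_mapping(int fd, char* old_map, size_t old_len, size_t new_len) {
        ftruncate(fd, new_len);          // 1. grow the file itself first
        munmap(old_map, old_len);        // 2. drop the stale mapping
        void* p = mmap(nullptr, new_len, // 3. remap at the new size
                       PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        return static_cast<char*>(p);    // note: the address may change!
    }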
I ripped out the mmap codepath, fixing the mmap-related bugs, and for some benchmarks performance improved by a factor of 20.
Now, there are certainly places where mmap is awesome, e.g. if you can push the mmap semantics up to the application level, or if you need the sharing semantics, etc.
This is a valid point. My use case was very frequent reads of large files at pretty much unpredictable positions, so in theory mmap seemed justified. However, I never got around to thoroughly testing this assumption, and may indeed have been better off just using read(2) and its variants.
You seem very experienced, so I hope you don't mind a question. In my use case the files were as large as tens of gigabytes and I was creating read-only mappings of 256KB-1MB chunks in them, keeping the mmap handles around according to a cache policy and RAM usage limit. Do you think in this case using mmap could in theory introduce performance gains?
I think that this is the wrong way to use mmap. Just map the whole file at once. The operating system will automatically read the pages you access from disk. And if memory gets tight, these pages will be flushed to disk if they are dirty and then discarded before the system starts paging. These mmapped pages essentially live in the disk cache.
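Something like this hypothetical sketch, i.e. one mmap call up front plus a hint that access is random (assumes a 64-bit process, since tens of gigabytes of address space are being reserved):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    const char* map_readonly(const char* path, size_t* len) {
        int fd = open(path, O_RDONLY);
        struct stat st{};
        fstat(fd, &st);
        *len = static_cast<size_t>(st.st_size);
        void* p = mmap(nullptr, *len, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd); // the mapping keeps the file referenced
        // Tell the kernel not to bother with sequential read-ahead;
        // pages are faulted in on access and discarded under pressure.
        madvise(p, *len, MADV_RANDOM);
        return static_cast<const char*>(p);
    }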
> The operating system will automatically read the pages you access from disk. And if memory gets tight, these pages will be flushed to disk if they are dirty and then discarded before the system starts paging.
You can tell that you understand how modern OS memory management works when you realize that the OS "automatically read[ing] the pages...from disk" and "flush[ing them] to disk" on memory pressure is paging whether those pages are anonymous pages or mmaped file pages. :-)
[Edit: flushing dirty file-paged pages is analogous to swapping anonymous memory to the swapfile. Discarding clean file-backed pages is a bit like discarding anonymous pages that have been made unused through munmap, process death, etc.]
But to the GP's point: you don't need (except to conserve address space) to limit file mapping size. I think he really wants something like MADV_FREE. But it's complicated.
There is a subtle difference between anonymous and mapped read-only pages: the latter can be discarded right away because their contents were read from permanent storage to begin with. Anonymous pages need to be written to disk first, and that is significantly slower.
Random access to large files is a legitimate use case! LMDB [1] uses a similar technique, and it works well for them. But depending on the specific application, explicitly application-managed caching via O_DIRECT IO with something like threaded pread or AIO might be even better, because with this explicit model, you control the cache sizing and eviction policy, and it's certainly possible with application-level knowledge to do better than the generic kernel-level LRU/active/inactive/kinda-sorta-works-heuristic stuff can do without application-specific knowledge.
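A hedged sketch of that explicit model on Linux (O_DIRECT is Linux-specific and may require _GNU_SOURCE; the 4096-byte alignment is an assumption about the device's block size, and error handling is omitted):

    #include <fcntl.h>
    #include <unistd.h>
    #include <cstddef>
    #include <cstdlib>

    constexpr size_t kAlign = 4096; // assumed logical block size

    // Read one block, bypassing the kernel page cache entirely; the
    // application now owns caching and eviction of the returned buffer.
    char* read_block_direct(const char* path, off_t offset, size_t len) {
        int fd = open(path, O_RDONLY | O_DIRECT);
        void* buf = nullptr;
        posix_memalign(&buf, kAlign, len); // O_DIRECT needs aligned buffers
        pread(fd, buf, len, offset);       // offset/len must also be aligned
        close(fd);
        return static_cast<char*>(buf);
    }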
Another advantage of using application-managed caching is the ability to take advantage of things like huge pages (which can drastically reduce TLB miss rates), whereas with conventional mmap of conventional files, you're limited to regular 4kB (or whatever) small pages and associated management overhead. (There's no reason in principle filesystems can't use huge pages for page cache, but AFAIK, nobody does it yet.)
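For instance, an application-level cache can be backed by anonymous huge pages along these lines (Linux-specific; MAP_HUGETLB needs hugetlb pages reserved by the administrator, and MADV_HUGEPAGE is only a hint):

    #include <cstddef>
    #include <sys/mman.h>

    void* alloc_cache(size_t bytes) {
        void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (p == MAP_FAILED) {
            // No reserved huge pages: fall back to normal pages and ask
            // for transparent huge pages instead.
            p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            madvise(p, bytes, MADV_HUGEPAGE);
        }
        return p;
    }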
OTOH, kernel management of page cache allows for better integration of cache eviction with system memory pressure signals and allows for multiple users of a single file to share the memory mirroring the contents of that file.
> Do you think in this case using mmap could in theory introduce performance gains?
It depends. The right approach depends on a lot of factors, including workload and developer complexity budget. It's funny, really: the more experience you get, the less likely you are to say "$SOLUTION is the bestest evar!" and the more often you say "well, it really depends, so I can't give you an answer".
What really strikes me as needless is someone using mmap to read a 10kB ~/.myapplication.lol.ini file or something.
Oh, I missed this response (still a little overwhelmed). I'm so glad I checked again because there is some precious wisdom in there. Thank you for taking the time to write it down. And it indeed seems like I was misusing mmap...well, next time I'll know better!
The cost of the second kernel trip (on the first page fault) is often mitigated by speculative read-ahead, or the fact that a given page is often in the UBC already. And file-backed memory doesn't contribute to dirty memory. And mmap() makes it easy to use read-only memory, which catches memory corruption bugs. Plus it's easier to use huge pages, which reduces TLB pressure.
Bit of a plug here. But I used this library in a small side project (C++ publisher-subscriber library) and it worked like a charm. There are a few things you can customize with mmapping but most use cases will do fine with this. Great work.
I wonder if this could replace the cross platform memory mapping in LibreOffice. This is part of the Operating System Layer (OSL) which is at least several decades old and uses a C interface.
That would not be a good idea at all. The OSL memory mapping already works cross-platform, and swapping it out would risk an untested configuration of those low-level mechanisms.
Yes, because there is no standardised build system for C++, which makes integrating non-header-only libraries a pain (or at least somewhat more painful), especially if your aim is cross-platform code. In that case you cannot rely on a reasonable package manager being present and will have to essentially include all your dependencies in the build; this is trivial for header-only libraries.
Overstated problem. It's not that hard under most circumstances to build a static lib.
I think there are two issues. In C++, a header-only library makes sense due to templates, which don't produce object code until instantiation. In C, I have seen the same with entirely macro-based "libraries" that really simulate templates in order to build abstract data structures.
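The macro pattern looks roughly like this (a hypothetical generic vector; one invocation stamps out a complete typed container, the way a C++ template instantiation would):

    #include <stdlib.h>

    #define DEFINE_VEC(T)                                              \
        typedef struct { T* data; size_t len, cap; } vec_##T;          \
        static void vec_##T##_push(vec_##T* v, T x) {                  \
            if (v->len == v->cap) {                                    \
                v->cap = v->cap ? v->cap * 2 : 8;                      \
                v->data = (T*)realloc(v->data, v->cap * sizeof(T));    \
            }                                                          \
            v->data[v->len++] = x;                                     \
        }

    DEFINE_VEC(int)    /* expands to vec_int and vec_int_push */
    DEFINE_VEC(double) /* expands to vec_double and vec_double_push */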
The second issue, though: my theory is that there is a new generation of programmers who don't understand the traditional compile and link phases because they are used to other languages. It doesn't fit their expectation of how things should work, so they bend the tool to their expectation instead of figuring out the old way. The very fact that people on here are saying a programming language should have a "standard package manager" demonstrates the cultural divide: this sounds a little nutty to an old-time C person.
As a newer-generation C++ programmer, I can confirm I'm guilty of writing things header-only when they "probably shouldn't be". But it's not so much that I don't understand the old way. Having everything in one file is just so much easier to work with, and the price you pay for it is relatively low (for small projects like this library).
The advantage is how easy they are to integrate - just put the file in your tree and then #include. No build option or compiler issues.
The disadvantage (and the reason I'm personally not a fan) is that they bloat compile time. A normal library is compiled once and then linked repeatedly, but a header-only library is compiled every time any .cpp file that includes it is compiled.
“Compiled every time” is not really true if the header correctly contains “#pragma once” or #ifdef guards.
It may be that you have used a lot of template-based headers, which may compile nearly every time because they are literally creating new code every time a new combination of template parameters is given.
It'll be compiled once for every .cpp file that includes it. #pragma once (or ifdef guards) means it only gets included once within that one translation unit, and has no effect on any other .cpp files.
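To spell out the distinction with an illustrative header:

    // mylib.h
    #pragma once            // or the classic #ifndef/#define guard
    int mylib_answer();     // declarations are cheap to re-parse...

    // ...but any definitions living in the header, templates especially,
    // are still compiled once per translation unit that includes it:
    template <typename T>
    T mylib_twice(T x) { return x + x; }

The guard only prevents double inclusion within one translation unit; it does nothing across the many .cpp files that each include the header.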
You are correct that I'm mostly talking about template-heavy header files. There is a strong correlation between template-based header files and header-only libraries; the latter generally means the former.
> It'll be compiled once for every Cpp file that includes it.
But are you going to mmap stuff in all your .cpp files? Most of the time when I use an external library, it never leaves a single implementation file, so it being header-only does not really make things worse.
You still have to compile the library each time you recompile whichever source file includes it. If the library had its own source file, you would only have to compile it once and link it thereafter. This adds up if you have a lot of header-only libraries in your codebase, whose total LoC may very well dwarf your own code. This is where precompiled headers come in, but they introduce their own downsides.
Setting up the dependency infrastructure is a bit of work in C++. Recently things have been moving forward, with tools like CMake+Conan or vcpkg becoming more popular.
Once the infrastructure is sorted out, header-only has mainly disadvantages, like making it harder to separate interface from implementation.
This one also serializes a representation of the type structure of recorded data so that it can be safely concurrently mmapped and read either with the same code or with the generic PL/compiler that I've developed in this hobbes project.
Where unpredictable disk latency is a problem, we've got a similar header-only lib for logging into shared memory (then have another process to consume this shared memory ring buffer and dump it to disk for concurrent querying):
This pipeline works well for having lightweight C++ processes feeding large volumes of data to generic query processes that we can run out of band to look at this data in various ways (with a Haskell-like query language).
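The shape of that handoff, as a hedged single-producer sketch (names hypothetical; a real ring also needs a read index and a wrap/overflow policy, and this assumes std::atomic<uint64_t> is lock-free):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <atomic>
    #include <cstdint>
    #include <cstring>

    struct Ring {
        std::atomic<uint64_t> write_pos;
        char data[1 << 20]; // 1 MiB of log payload
    };

    Ring* open_ring(const char* name) { // e.g. "/mylog"; both processes use it
        int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
        ftruncate(fd, sizeof(Ring));
        void* p = mmap(nullptr, sizeof(Ring), PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        close(fd);
        return static_cast<Ring*>(p);
    }

    void append(Ring* r, const char* msg, size_t n) {
        uint64_t pos = r->write_pos.load(std::memory_order_relaxed);
        memcpy(&r->data[pos % sizeof(r->data)], msg, n); // wrap ignored here
        r->write_pos.store(pos + n, std::memory_order_release);
    }

The disk-latency win is that append() never blocks on I/O; only the consumer process touches the filesystem.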
We did hit a slight problem doing things this way: for some cases, the straightforward representation of data (as in memory) just used too much space, with too much time wasted in I/O. Basically, for complex market data the structures aren't trivial, and recording ~100GB/day makes it very awkward to keep around a few weeks of data for random querying.
So I also made this header-only lib to write data into these mmapped files with a simple compression method (I like to describe it as generalizing Curry-Howard to probabilities) that gives us much better throughput, much smaller files, faster query times, and still supports concurrent constant-time random-access queries:
It gives us compression ratios about the same as end-of-day gzip, but much faster, and importantly it works online and with the query use-cases we have with hobbes.
Anyway, maybe I should write up those details somewhere else, I just mean to say that this is a useful technique and you can push it very far and do many things with it in a very straightforward way.
Why, why do they make it header-only? Is it so difficult to integrate a couple of source files along with the existing headers?
We should not forget compilation times. A project I use depends on spdlog, a header-only C++ logging library. The thing adds almost two seconds per compilation unit to single-threaded build times. And since logging is used pretty much everywhere, the whole project takes forever to build (thrice the build time it would have had without spdlog; I've measured).
What benefit is so great that it is worth killing compilation times?
Proper "stb-style" header-only libs split the header into 2 parts: the declaration part that's always included and parsed but only contains the public API declarations, and the implementation part inside an #ifdef XXX_IMPLEMENTATION section which is only included and parsed in a single source file.
The same idea can also be used for C++ headers (unless it's all template interfaces).
With such stb-style headers, you can even get better compile times, because you can include all header-libs into a single implementation source file, giving the same advantages as unity-builds (merging all sources into one file).
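A skeletal stb-style header (names hypothetical):

    // mylib.h
    #ifndef MYLIB_H
    #define MYLIB_H

    int mylib_add(int a, int b); // public API: cheap to parse everywhere

    #endif // MYLIB_H

    #ifdef MYLIB_IMPLEMENTATION
    // Heavy part: parsed and compiled only in the one TU that opts in.
    int mylib_add(int a, int b) { return a + b; }
    #endif

Exactly one source file then does:

    #define MYLIB_IMPLEMENTATION
    #include "mylib.h"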
I once tried to compile nheko, a Matrix client written in C++ and Qt. It makes heavy use of Boost, and building it with 2 or more jobs crashed my system because it would eat up all my RAM (6GB available at the time). Factor in the 2 seconds per compilation unit and it takes ages to compile. I managed to compile the native WebRTC library with 8 jobs faster than this bare-bones chat client on the same computer.
Here is why I used a header only library for my last C++ project.
Earlier this year I wrote a Win32 program to solve an obscure issue for my employer.
My main concern was making it possible for programmers with limited knowledge of C++ to maintain the application. We are a C# shop and I want my coworkers to be able to maintain the application after I leave.
Here are some of the other things I did to make life easier for the maintenance programmer.
I documented how the program works, what APIs it calls, and added links to Pluralsight courses on ATL and COM. The program uses printf format strings so there is a link to a tutorial on them.
I added extensive logging, with line numbers, for every action the program takes. You can reconstruct the control flow by looking at the log file (yes, I tried doing this).
I picked a header only logging library so that it would be easier for a maintenance programmer to update it. All they have to do is update the submodule that contains the library.
The code seems clean. I'm not sure this is a great idea in practice, though. Generally the only good reason for mapping stuff out of the filesystem is performance, and VM behavior with mmap() varies wildly across systems (and filesystem backends, and drivers if it's a hardware device, and hardware if it's a framebuffer, and...). Frankly on windows this is AFAIK a mostly-unheard-of technique. No one does mapping.
This just isn't really something that can be cleanly abstracted to do what you want it to do, even if you can make the code "look" the same.
But again, it looks like a nice, clean, modern C++ library. Just IMHO misapplied.
> "Frankly on windows this is AFAIK a mostly-unheard-of technique. No one does mapping."
Unheard of by you.
In a world consumed by Electron apps and JavaScript, lots of people couldn't care less about performant IPC, sharing data across processes, and multiprocessing in general.
> Frankly on windows this is AFAIK a mostly-unheard-of technique. No one does mapping.
This must be incorrect.
Jeffrey Richter's Advanced Windows, a very widespread Win32 book, introduced this technique to a LOT of developers, back when Windows was the hot stuff for programmers. I don't think I've seen a Win32 codebase >100 kLoC without mapping. Understanding and using it is one of those rites of passage that marks the switch from junior to mid-level Win32 developer.
Mmapping is used in much fault-tolerant software as a simplified method of data persistence. Granted, it's not the only method in play, but it is a common data safety net.
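The pattern is simple enough to sketch: map live state straight onto a file and use msync(2) as the explicit durability point (illustrative POSIX code, error handling omitted):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    struct State { long counter; };

    State* open_state(const char* path) {
        int fd = open(path, O_RDWR | O_CREAT, 0644);
        ftruncate(fd, sizeof(State));
        void* p = mmap(nullptr, sizeof(State), PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        close(fd);
        return static_cast<State*>(p);
    }

    void checkpoint(State* s) {
        // Block until the dirty page is written back; without this the
        // kernel flushes "eventually", on its own schedule.
        msync(s, sizeof(State), MS_SYNC);
    }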
Flushing behavior is precisely the kind of thing that varies the most between systems. I'd be very suspicious of a "fault tolerant" system that tried to use a library like this to be "cross platform". That's almost a contradiction in terms.