I just call non-blocking waitpid in a loop (until it returns nothing) after I get a signal via signalfd.
I'm not sure why this wouldn't work for the OP, or why it would miss process terminations due to merged SIGCHLD signals. I guess using pidfd may be more straightforward, but I don't think this was an unsolved problem previously.
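For anyone following along, here is a rough sketch in C of the pattern described above. It assumes Linux with signalfd, and the function names (`open_sigchld_fd`, `drain_children`) are made up for illustration. The point is that one SIGCHLD read may stand for several exits, so the waitpid loop keeps going until nothing is left.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: block SIGCHLD and return a signalfd that becomes
 * readable while SIGCHLD is pending. Returns -1 on error. */
int open_sigchld_fd(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Hypothetical helper: consume one SIGCHLD notification from sfd, then reap
 * every child that has already exited. The read blocks here; an event loop
 * would normally poll the descriptor first.
 * Returns the number of children reaped, or -1 if the read failed. */
int drain_children(int sfd)
{
    struct signalfd_siginfo si;
    ssize_t n;

    do {
        n = read(sfd, &si, sizeof si);
    } while (n == -1 && errno == EINTR);
    if (n != (ssize_t)sizeof si)
        return -1;

    int reaped = 0;
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;           /* one zombie collected, look for more */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;
        break;  /* 0: children remain but none has exited; -1: no children */
    }
    return reaped;
}
```

Note that `waitpid(-1, ...)` collects any child of the process, which is fine for an application that owns all of its children.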
I mention this in a comment in the source: it wouldn't quite work because I intend this to be used as a library, and reaping children I didn't spawn would be problematic. Also mentioned in the source is that I could do a non-blocking waitpid on every child I have, but that strikes me as slow and not clean code.
Of course, if I want to deploy code like this in any serious way, I'll have to implement a solution other than non-blocking pidfds...
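To make the per-child alternative concrete, here is a minimal sketch in C of polling only the PIDs a library spawned itself. It is not the code from the linked source; `struct child_set`, `child_set_add`, `child_set_reap`, and the `MAX_TRACKED` cap are all hypothetical names chosen for this example. It never calls `waitpid(-1, ...)`, so children owned by the host application are left alone, at the cost of one syscall per tracked child on every pass.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_TRACKED 64          /* arbitrary cap for this sketch */

/* Hypothetical bookkeeping for the children this library spawned. */
struct child_set {
    pid_t pids[MAX_TRACKED];
    size_t count;
};

/* Remember a PID we forked ourselves. Returns 0, or -1 if the set is full. */
int child_set_add(struct child_set *set, pid_t pid)
{
    if (set->count == MAX_TRACKED)
        return -1;
    set->pids[set->count++] = pid;
    return 0;
}

/* Non-blocking waitpid on each tracked PID. Exited children are reaped and
 * removed from the set. Returns how many were reaped on this pass. */
size_t child_set_reap(struct child_set *set)
{
    size_t reaped = 0;
    size_t i = 0;

    while (i < set->count) {
        int status;
        pid_t r = waitpid(set->pids[i], &status, WNOHANG);
        if (r == 0) {           /* still running, keep tracking it */
            i++;
            continue;
        }
        if (r == -1 && errno == EINTR)
            continue;           /* retry the same entry */
        if (r > 0)
            reaped++;
        /* Reaped, or gone already (e.g. ECHILD): swap-remove the entry. */
        set->pids[i] = set->pids[--set->count];
    }
    return reaped;
}
```

Each pass costs O(number of tracked children) syscalls, which is the slowness mentioned above; it is likely acceptable for a handful of children but not for thousands.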