
What you wrote about the design flaws of select is correct, but an fd set with 10,000 sockets is just 1250 bytes, or 157 longs, and hardware is very good at sequential memory access. I agree it will be slower than kqueue or epoll for long-lived connections, because there is more work for the kernel and application to do, but for many applications select is good enough when used correctly.

I have my doubts about select causing 100% CPU utilization; I suspect you were doing other suboptimal things as well. The sample code I wrote runs well with both 100 and 10,000 connections. I have my own anecdotal evidence of an application that was barely handling just 100 mostly inactive connections, and the OPS guys suggested a limit of just 50 connections per application. After I fixed how the fd sets were created and how the result of select was processed, the same application ran just fine with 8,000 connections. We had to support Linux, Solaris, AIX and HP-UX at that time, and select/poll were available on all of them. That's why I invested time in optimizing the code instead of switching to epoll. The OPS guys still suggested a limit of 1,000 connections per application, but this time it was for availability and other non-performance reasons.



In my experience, there's definitely a point where poll() beats select() when handling lots of connections, long-lived or short-lived. But at that point you are much better off moving to epoll() or kqueue() depending upon the OS. And if you support those syscalls, there's little point falling back to select().

For most people, if your program is handling 10s or 100s of concurrent connections, select() should be just fine.

If you are going to be handling more, it's worth looking into the other syscalls to improve performance. In any case, I'd recommend abstracting away the event handling so that you can switch between different syscalls, and using benchmarking to try to replicate the traffic you want to handle. There are lots of libraries that will do this work for you, unless you need to get into the low-level stuff.

Trying to judge syscall performance by reasoning about the amount of data transferred between your program and the OS, the number of syscalls, etc. is very difficult. You are much better off measuring the actual behaviour. For example, epoll() seems like a terribly designed API to me, as it involves making many syscalls (whereas kqueue() is just one per loop). However, I found epoll() to be very high performance. I guess the cost of syscalls on Linux can be very low in some cases.


System calls on Linux that deal with per-process state try very hard not to invalidate userland memory mappings.

The kernel's real range and userland's virtual range won't overlap, so for a lot of functions kernel memory just has to be mapped/unmapped on the call, without invalidating _all_ userland mappings.

Well, okay, they will overlap. So yeah, your mappings may get invalidated, but for synchronous higher-performance system calls they _shouldn't_ be.

This lets them run in the tens to hundreds of nanoseconds.

Normally the most _expensive_ part of a Linux syscall is the TLB misses after it.

---

Your model of memory transfer size assumes data is being copied.

The Linux kernel has a lot of features that let userland, devices, and the kernel itself all share the same memory, copy-free.


This is wrong.

Kernel address space and userspace address space don't overlap at all. There's zero mucking with memory mappings, the TLB, etc. on a syscall.



