STXXL: Standard Template Library for Extra Large Data Sets (sourceforge.net)
67 points by wslh on Feb 11, 2012 | 5 comments


This is a new discovery for me; I found it through a Stack Overflow answer.

It is interesting not just for the usefulness of the library but also for the theoretical and practical aspects of the research behind it. The tutorial states the goals in its first pages: http://algo2.iti.kit.edu/dementiev/files/stxxl_tutorial.pdf


"The objectives of STXXL project (distinguishing it from other libraries):

• Make the library able to handle problems of real world size (up to dozens of terabytes).

• Offer transparent support of parallel disks. This feature although announced has not been implemented in any library.

• Implement parallel disk algorithms. LEDA-SM and TPIE libraries offer only implementations of single disk EM algorithms.

• Use computer resources more efficiently. STXXL allows transparent overlapping of I/O and computation in many algorithms and data structures.

• Care about constant factors in I/O volume. A unique library feature “pipelining” can half the number of I/Os performed by an algorithm.

• Care about the internal work, improve the in-memory algorithms. Having many disks can hide the latency and increase the I/O bandwidth, s.t. internal work becomes a bottleneck.

• Care about operating system overheads. Use unbuffered disk access to avoid superfluous copying of data.

• Shorten development times providing well known interface for EM algorithms and data structures. We provide STL-compatible interfaces for our implementations."


http://algo2.iti.kit.edu/stxxl/trunk/FAQ.html

"STXXL container types like stxxl::vector can be parameterized only with a value type that is a POD"

Unfortunately this is a significant constraint that limits the usefulness of this library.
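
To make the constraint concrete, here is a minimal sketch in standard C++ only, with no STXXL calls. It assumes the restriction comes down to elements being copied to and from disk blocks as raw bytes. `Point`, `NamedPoint`, and `round_trip_through_bytes` are hypothetical names made up for illustration, and the in-memory `block` merely stands in for an external block.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical record types for illustration; neither is part of STXXL.
struct Point {        // plain data: every member lives inside the struct
    double x;
    double y;
    int id;
};

struct NamedPoint {   // std::string owns memory outside the struct
    std::string name;
    double x;
    double y;
};

// The property that matters for byte-wise copying is trivial copyability.
static_assert(std::is_trivially_copyable<Point>::value,
              "Point can be stored as raw bytes");
static_assert(!std::is_trivially_copyable<NamedPoint>::value,
              "NamedPoint cannot be stored as raw bytes");

// Copies a value into a byte buffer and back, the way a record would travel
// to an external block and return. Assumption: this models the storage
// pattern, not STXXL's actual implementation.
template <typename T>
T round_trip_through_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::vector<unsigned char> block(sizeof(T));
    std::memcpy(block.data(), &value, sizeof(T));
    T restored;
    std::memcpy(&restored, block.data(), sizeof(T));
    return restored;
}
```

Calling `round_trip_through_bytes(Point{1.5, -2.0, 7})` returns an identical `Point`. The same call with a `NamedPoint` is rejected at compile time, because the bytes of a `std::string` member would only preserve a pointer into memory that no longer exists after a reload.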


But I think if you are using, for example, Python, you can serialize an object and store it as a string.
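
The same workaround can be sketched in C++ rather than Python: serialize the object to text, then pack the text into a fixed-size record that holds no pointers. This is only an illustration under assumptions. `FixedRecord`, `kMaxPayload`, `pack`, and `unpack` are hypothetical names, the 60-byte limit is arbitrary, and nothing here is STXXL API.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical capacity; a real application would size this for its data.
constexpr std::size_t kMaxPayload = 60;

// Fixed-size record: a length prefix plus inline storage, no pointers.
struct FixedRecord {
    std::uint32_t length;                // number of valid bytes in data
    std::array<char, kMaxPayload> data;  // serialized object, zero padded
};

static_assert(std::is_trivially_copyable<FixedRecord>::value,
              "FixedRecord can be stored as raw bytes");

// Packs already-serialized text into a record.
// Returns false, leaving `out` untouched, when the text does not fit.
bool pack(const std::string& text, FixedRecord& out) {
    if (text.size() > kMaxPayload) {
        return false;
    }
    out.length = static_cast<std::uint32_t>(text.size());
    out.data.fill('\0');
    std::copy(text.begin(), text.end(), out.data.begin());
    return true;
}

// Recovers the serialized text from a record.
std::string unpack(const FixedRecord& rec) {
    return std::string(rec.data.data(), rec.length);
}
```

The cost of this design is that every record occupies the full `sizeof(FixedRecord)` regardless of the payload, and objects larger than the limit have to be rejected or split across records.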


This looks really interesting. Thanks for posting it!



