STXXL: Standard Template Library for Extra Large Data Sets (sourceforge.net)
67 points by wslh on Feb 11, 2012 | 5 comments



This is a new discovery for me, based on a Stack Overflow answer.

It is interesting not just for the usefulness of the library but also for the theoretical and practical aspects of the research behind it. The tutorial states the project's goals in its first pages: http://algo2.iti.kit.edu/dementiev/files/stxxl_tutorial.pdf


"The objectives of STXXL project (distinguishing it from other libraries):

• Make the library able to handle problems of real world size (up to dozens of terabytes).

• Offer transparent support of parallel disks. This feature although announced has not been implemented in any library.

• Implement parallel disk algorithms. LEDA-SM and TPIE libraries offer only implementations of single disk EM algorithms.

• Use computer resources more efficiently. STXXL allows transparent overlapping of I/O and computation in many algorithms and data structures.

• Care about constant factors in I/O volume. A unique library feature “pipelining” can half the number of I/Os performed by an algorithm.

• Care about the internal work, improve the in-memory algorithms. Having many disks can hide the latency and increase the I/O bandwidth, s.t. internal work becomes a bottleneck.

• Care about operating system overheads. Use unbuffered disk access to avoid superfluous copying of data.

• Shorten development times providing well known interface for EM algorithms and data structures. We provide STL-compatible interfaces for our implementations."
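To make the "constant factors in I/O volume" point concrete, here is a rough sketch of why feeding one stage directly into the next saves block transfers. It does not use STXXL at all: `CountingDisk` and both `sum_of_squares_*` functions are hypothetical names made up for this illustration, and the "disk" is just an in-memory vector of blocks that counts accesses. The exact saving depends on the algorithm, so treat the numbers as an example rather than a general rule.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Block = std::vector<int>;
using File = std::vector<Block>;

// Hypothetical stand-in for external memory: it only counts block transfers.
struct CountingDisk {
    std::size_t reads = 0;
    std::size_t writes = 0;

    Block read_block(const File& file, std::size_t i) {
        ++reads;
        return file[i];
    }
    void write_block(File& file, const Block& block) {
        ++writes;
        file.push_back(block);
    }
};

// Materialized version: stage 1 writes the squared values to a temporary
// file, stage 2 reads that file back in order to sum it.
long long sum_of_squares_materialized(const File& input, CountingDisk& disk) {
    File temp;
    for (std::size_t i = 0; i < input.size(); ++i) {
        Block block = disk.read_block(input, i);
        for (int& v : block) v *= v;
        disk.write_block(temp, block);
    }
    long long sum = 0;
    for (std::size_t i = 0; i < temp.size(); ++i) {
        for (int v : disk.read_block(temp, i)) sum += v;
    }
    return sum;
}

// Pipelined version: each block is squared and summed as soon as it is read,
// so the intermediate result never touches the disk.
long long sum_of_squares_pipelined(const File& input, CountingDisk& disk) {
    long long sum = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        for (int v : disk.read_block(input, i)) {
            sum += static_cast<long long>(v) * v;
        }
    }
    return sum;
}
```

For an input of N blocks the materialized version performs 2N reads and N writes, while the pipelined one performs only N reads and produces the same result.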


http://algo2.iti.kit.edu/stxxl/trunk/FAQ.html

"STXXL container types like stxxl::vector can be parameterized only with a value type that is a POD"

Unfortunately this is a significant constraint that limits the usefulness of this library.
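For anyone unsure what the POD restriction rules out in practice, here is a small sketch using only standard C++ type traits. `Edge`, `NamedEdge` and `is_pod_like` are hypothetical names for this example, not part of STXXL, and the check reflects my reading of the C++11 definition of POD (trivial plus standard-layout) rather than whatever test the library itself applies.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical record: fixed size, no pointers, no user-defined constructors.
// Its bytes fully describe its value, so it can be copied to disk and back.
struct Edge {
    std::uint64_t source;
    std::uint64_t target;
    double weight;
};

// Hypothetical record holding a std::string. The string owns heap memory,
// so copying the struct's raw bytes would not capture the label's contents.
struct NamedEdge {
    std::string label;
    std::uint64_t source;
    std::uint64_t target;
};

// C++11 definition of POD: trivial and standard-layout.
template <typename T>
constexpr bool is_pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(is_pod_like<Edge>(), "Edge can be stored as raw bytes");
static_assert(!is_pod_like<NamedEdge>(), "NamedEdge cannot be stored as raw bytes");
```

So plain numeric records are fine, but anything containing `std::string`, `std::vector` or other owning members is excluded.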


But I think if you are using, for example, Python, you can serialize an object and store it as a string.
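The same workaround could look roughly like this on the C++ side: pack the serialized bytes into a fixed-size record that is itself a POD. This is only a sketch under assumptions. `SerializedRecord`, `kMaxPayload`, `pack` and `unpack` are hypothetical names, the 56-byte limit is arbitrary, and objects that serialize to more bytes than the limit would need some other scheme.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Arbitrary payload limit chosen for this example.
constexpr std::size_t kMaxPayload = 56;

// Hypothetical fixed-size record: a length plus an inline byte buffer.
struct SerializedRecord {
    std::uint32_t length;       // number of valid bytes in payload
    char payload[kMaxPayload];  // serialized object bytes, zero-padded
};

static_assert(std::is_trivial<SerializedRecord>::value &&
                  std::is_standard_layout<SerializedRecord>::value,
              "the record itself must be a POD");

// Copy an already-serialized object into a record.
// Returns false (leaving out untouched) if the bytes do not fit.
inline bool pack(const std::string& serialized, SerializedRecord& out) {
    if (serialized.size() > kMaxPayload) return false;
    std::memset(&out, 0, sizeof(out));
    out.length = static_cast<std::uint32_t>(serialized.size());
    std::memcpy(out.payload, serialized.data(), serialized.size());
    return true;
}

// Recover the serialized bytes from a record.
inline std::string unpack(const SerializedRecord& rec) {
    return std::string(rec.payload, rec.length);
}
```

The cost is wasted space for short objects and a hard upper bound on object size, which is part of why the POD restriction is a real limitation.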


This looks really interesting. Thanks for posting it!



