* With an std::list::iterator in each of your data nodes that represents its own position in the list (this is called the "intrusive style")
* Without an std::list::iterator in each of your data nodes
If you use the (more common) latter form: whenever you have a reference to your own object, you cannot do any of the linked list operations without an O(N) penalty to go and re-find your element in the list!
i.e: Say you have a list of requests, and a timeout callback pops up with a pointer to your request, you cannot use a request pointer to do an O(1) deletion from the list. This kind of operation is what lists are for, and std::list canonically cannot do it.
Any other operation you might want to do relating to the list, given such a pointer is impossible (e.g: create a new request that is immediately between the old request and its next).
All this assumes your object is within just one list. If it is within 2 lists, you have to use an extra indirection: std::list <request * >, which makes the problem worse. Even if you do find your request via one of the lists, there is no way to get an iterator for the other list. That means you cannot do any of the list operations on the other list. Again: This is the canonical thing lists were designed for, and std::list cannot do it.
Say that to solve this, you use the intrusive style with std::list. i.e: for every list this object is a member of, you hold an iterator inside your object.
Now the onus is on your to maintain these iterators, in addition to the common list operations. i.e: If you add an element, you need to both call std::list::add, and update the iterator.
Additionally, instead of paying with 2 pointers for each node, as an ordinary doubly-linked list should cost, you have to pay with an extra pointer or two (depending on how the iterator is implemented)!
If you use multiple lists with std::list<request * >, you pay with an extra pointer yet!
So if your data structure is within 2 doubly-linked lists, instead of paying the ideal 4 pointers per item, you pay those 4 + 1(indirection) + (2 for two iterators). 75% memory overhead, ruining your cache lines.
The code will be a mess too, due to the duplicate maintenance.
Use the intrusive approach exclusively. So that you don't need to use a list::add in addition to maintaining the two iterators. You only maintain the "iterators" (now called "list heads").
EDIT: almost completely forgot that std::list also uses new to dynamically allocate elements. This effectively means the cost of adding to lists is many times greater than the simple list_add function. Even if you supply your own allocator, this is unnecessarily expensive and by default means that list::add, like remove, are both not O(1) like they ought to be.
> whenever you have a reference to your own object, you cannot do any of the linked list operations without an O(N) penalty to go and re-find your element in the list!
Can you show an example of when you actually need to do this? Because when I need to do something like this it usually means that some container other than list is more fitting for the problem.
I gave an example: a request that is in multiple linked lists. e.g one by chronological order for quick timing out of oldest requests, and one of active requests waiting on the physical wire.
Now the timeout elapsed, so you have a pointer to a request that needs to be destroyed.
In that case, you typically use the very cheap O(1) list_del on each of the lists it's in. In STL style you pay O(N) for each list it is in. Or you conclude lists are worthless and another structure should be used. But no other structure would give you the incredibly cheap O(1) add and delete you get from lists.
Indeed, coming from Prolog/Erlang-style "most everything can be represented as a tail-call with a linked-list accumulator" programming, I'm very confused about what operations the GP is talking about. Adding/removing nodes at a position other than the head? Lookup by value? If you need these, you should be using a different data structure.
* With an std::list::iterator in each of your data nodes that represents its own position in the list (this is called the "intrusive style")
* Without an std::list::iterator in each of your data nodes
If you use the (more common) latter form: whenever you have a reference to your own object, you cannot do any of the linked list operations without an O(N) penalty to go and re-find your element in the list!
i.e: Say you have a list of requests, and a timeout callback pops up with a pointer to your request, you cannot use a request pointer to do an O(1) deletion from the list. This kind of operation is what lists are for, and std::list canonically cannot do it.
Any other operation you might want to do relating to the list, given such a pointer is impossible (e.g: create a new request that is immediately between the old request and its next).
All this assumes your object is within just one list. If it is within 2 lists, you have to use an extra indirection: std::list <request * >, which makes the problem worse. Even if you do find your request via one of the lists, there is no way to get an iterator for the other list. That means you cannot do any of the list operations on the other list. Again: This is the canonical thing lists were designed for, and std::list cannot do it.
Say that to solve this, you use the intrusive style with std::list. i.e: for every list this object is a member of, you hold an iterator inside your object.
Now the onus is on your to maintain these iterators, in addition to the common list operations. i.e: If you add an element, you need to both call std::list::add, and update the iterator.
Additionally, instead of paying with 2 pointers for each node, as an ordinary doubly-linked list should cost, you have to pay with an extra pointer or two (depending on how the iterator is implemented)!
If you use multiple lists with std::list<request * >, you pay with an extra pointer yet!
So if your data structure is within 2 doubly-linked lists, instead of paying the ideal 4 pointers per item, you pay those 4 + 1(indirection) + (2 for two iterators). 75% memory overhead, ruining your cache lines.
The code will be a mess too, due to the duplicate maintenance.
The alternative is much simpler: http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/i...
Use the intrusive approach exclusively. So that you don't need to use a list::add in addition to maintaining the two iterators. You only maintain the "iterators" (now called "list heads").
So your request would look like:
To add it to a list: Given a request, you can delete it in O(1) from both lists: Easy to use and optimal memory use (exactly 2 pointers per list head).Now, given some pointer from list2, to get back a request, you use:
This is slightly-boilerplatey, so I tend to write a little wrapper: EDIT: almost completely forgot that std::list also uses new to dynamically allocate elements. This effectively means the cost of adding to lists is many times greater than the simple list_add function. Even if you supply your own allocator, this is unnecessarily expensive and by default means that list::add, like remove, are both not O(1) like they ought to be.