This ownership stuff is such a blessing. I'm currently working on a contract where we're trying to fix up some code that crashes under heavy load, as well as troubleshoot some perf issues. The root cause is that it's not clear who owns what, or until when, so in some cases an object is destructed while other code still thinks it can use it. Fun.
For perf, 30% of the CPU is burned in malloc/free, because strings get copied around everywhere. The system has an arena allocator built in, and many of these strings might be able to go in there. Except no one is sure exactly what the lifetime of these things is. So everyone copies everything just to be sure. Rust would force addressing this kinda thing up front. (And this project coulda added a refcounted structure or something, but it's a few hundred kloc of C and C++ so...)
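To make the arena idea concrete, here is a rough C++17 sketch of what "put the strings in the arena" could look like. StringArena and store are made-up names, not the project's actual allocator, and the sketch assumes every stored string can be released together when the arena is destroyed (say, at the end of a request). That lifetime is exactly the thing nobody is sure about.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>
#include <vector>

// Hypothetical bump-pointer arena for strings (illustrative only).
// Everything stored here is freed at once when the arena is destroyed.
class StringArena {
public:
    explicit StringArena(std::size_t block_size = 4096) : block_size_(block_size) {}

    // Copies `text` into the arena once and returns a view of the copy.
    // The view stays valid for as long as the arena itself is alive.
    std::string_view store(std::string_view text) {
        if (text.size() > remaining_) {
            // Start a new block; the unused tail of the old one is simply wasted.
            std::size_t size = text.size() > block_size_ ? text.size() : block_size_;
            blocks_.push_back(std::make_unique<char[]>(size));
            cursor_ = blocks_.back().get();
            remaining_ = size;
        }
        char* dest = cursor_;
        if (!text.empty()) {
            std::memcpy(dest, text.data(), text.size());
        }
        cursor_ += text.size();
        remaining_ -= text.size();
        return std::string_view(dest, text.size());
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t block_size_;
    std::vector<std::unique_ptr<char[]>> blocks_;  // owns all the memory
    char* cursor_ = nullptr;                       // next free byte in the current block
    std::size_t remaining_ = 0;                    // bytes left in the current block
};
```

With something like this, one malloc covers many strings and there is no per-string free at all. The catch is the same as in the original comment: it only works if nothing keeps a view past the arena's lifetime.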
Have you considered using something like boehm-gc to profile the app to get a better handle on item lifetimes, leaks, etc? It seems fairly ideal for figuring out where things are going cactus on those leaky strings.
That's a cool idea. Honestly though, a lot of the core is rather grotty, plus there are dozens of modules loaded in any given configuration, each of which can hook into anything. It's not particularly fun. (I'm not naming anything because the software does work for some scenarios and provides a free basis for what would otherwise be expensive code... And the core devs are nice, so I don't want to sound ungrateful or rude.)
Uhh... you do realize C++ std::string has small string optimizations right? That would probably take care of most of the 30%. const string& would probably take care of the rest. No program should have performance problems due to strings.
As always, whether the allocations are problematic depends on the context of usage. Using the SGI STL's rope<T, Alloc>[0] has helped me in the past in a similar scenario. It is part of the SGI STL implementation that did not make it into the standard library specification, although it should have, imho.
Essentially, ropes are character strings represented as a tree of concatenation nodes, optimized for immutability. A rope may contain shared subtrees, so it is really a directed acyclic graph where the out-edges of each vertex are ordered. You can think of them as a kind of search tree indexed by position.
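For anyone who hasn't seen one, a stripped-down sketch of that structure in C++ follows. This is not the SGI rope API; RopeNode, make_leaf, concat, char_at and flatten are names invented here, and a real rope also rebalances and supports substring nodes, which this leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal rope node (illustrative, not the SGI implementation).
// A node is either a leaf holding text or a concatenation of two children.
struct RopeNode {
    std::size_t length = 0;                 // total characters under this node
    std::string leaf;                       // text, used only when left is null
    std::shared_ptr<const RopeNode> left;   // both set for concatenation nodes
    std::shared_ptr<const RopeNode> right;
};
using Rope = std::shared_ptr<const RopeNode>;

Rope make_leaf(std::string text) {
    auto node = std::make_shared<RopeNode>();
    node->length = text.size();
    node->leaf = std::move(text);
    return node;
}

// Concatenation allocates one small node and copies no characters.
// The children are shared, which is what turns the tree into a DAG.
Rope concat(Rope a, Rope b) {
    assert(a && b);
    auto node = std::make_shared<RopeNode>();
    node->length = a->length + b->length;
    node->left = std::move(a);
    node->right = std::move(b);
    return node;
}

// Position lookup: descend left or right depending on the left subtree's length.
char char_at(const Rope& rope, std::size_t index) {
    assert(rope && index < rope->length);
    const RopeNode* node = rope.get();
    while (node->left) {
        if (index < node->left->length) {
            node = node->left.get();
        } else {
            index -= node->left->length;
            node = node->right.get();
        }
    }
    return node->leaf[index];
}

void append_to(const RopeNode& node, std::string& out) {
    if (!node.left) {
        out += node.leaf;
        return;
    }
    append_to(*node.left, out);
    append_to(*node.right, out);
}

// Materializes the rope into one contiguous string when that is really needed.
std::string flatten(const Rope& rope) {
    assert(rope);
    std::string out;
    out.reserve(rope->length);
    append_to(*rope, out);
    return out;
}
```

Because nodes are immutable and reference counted, concat(x, x) stores the text of x once, and nobody has to decide who frees it.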
That doesn't mean that allocations are a big part of Chrome's performance, and in fact I would guess that they aren't, since avoiding allocations is a large performance win and Chrome is optimized.
Also, Chrome being a web browser means a huge part of the program is strings.
Most of the code is C, and uses APR, but with its own wrappers, e.g. "project_sprintf" and such. Some of the strings aren't small enough for that optimization, either. They're things like HTTP header strings, unseparated. I agree no one should have this problem, but if you naïvely start passing char* around and have a complicated ownership model (that's probably technically unsound, but works in practice most of the time), you can end up duplicating and getting hurt.
std::string's small string optimization is automatic; you don't have to do anything with it or about it. Bigger strings create allocations, small ones don't, and statistically most strings are small. Also, if you pass const references you can be sure that you aren't modifying the strings and aren't copying them. With C++11 you can move strings, which makes the transfer of ownership explicit.
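A small sketch of the two idioms mentioned here, borrowing through a const reference and handing over ownership with a move. The function and class names (count_colons, Request) are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Borrowing: the callee reads the caller's string and never copies it.
std::size_t count_colons(const std::string& header) {
    std::size_t n = 0;
    for (char c : header) {
        if (c == ':') {
            ++n;
        }
    }
    return n;
}

// Owning: hypothetical type that takes its string by value and moves it
// into place. The caller decides whether that costs a copy or a move.
class Request {
public:
    explicit Request(std::string raw_headers)
        : raw_headers_(std::move(raw_headers)) {}

    const std::string& raw_headers() const { return raw_headers_; }

private:
    std::string raw_headers_;
};
```

Calling Request r(std::move(buffer)) says in the source that buffer has been given away, and for a string too long for the small string optimization the mainstream standard libraries just hand over the heap pointer rather than copying the characters. Calling Request r(buffer) still works and makes the copy visible at the call site.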
But then you could reference parts of large strings, copy small sections and manipulate those, etc. Modern programs can do 10 million heap allocations per second. If heap allocations are really a problem, that is a gigantic design flaw, and if it is happening with strings, simply using std::string should make a large impact. Surely not every string passed is both huge and needs to be copied.
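As a sketch of "reference parts of large strings, copy small sections": with C++17's std::string_view you can point into one big header buffer and only copy the piece you need to keep. header_value is a hypothetical helper, and it assumes CRLF-separated "Name: value" lines and a case-sensitive name match, which is a simplification (HTTP header names are case-insensitive).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Returns a view of the value of header `name` inside `headers`, or an
// empty view if it is absent. No characters are copied; the result points
// into `headers` and is only valid while that buffer is alive and unmodified.
std::string_view header_value(std::string_view headers, std::string_view name) {
    std::size_t pos = 0;
    while (pos < headers.size()) {
        std::size_t eol = headers.find("\r\n", pos);
        if (eol == std::string_view::npos) {
            eol = headers.size();
        }
        std::string_view line = headers.substr(pos, eol - pos);
        std::size_t colon = line.find(':');
        if (colon != std::string_view::npos && line.substr(0, colon) == name) {
            std::string_view value = line.substr(colon + 1);
            while (!value.empty() && value.front() == ' ') {
                value.remove_prefix(1);  // skip spaces after the colon
            }
            return value;
        }
        pos = eol + 2;  // step past "\r\n"
    }
    return {};
}
```

If a value has to outlive the buffer, std::string copy(view) copies just that section, and a short one will usually fit in the small string buffer without touching the heap.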