Joel's article ( https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-a... ) has tons.

A classic trap for new players is that in most modern compiled languages you can add strings and get a string, but strings are in fact immutable and can't be added without making an entirely new string and disposing of the original two. This means "a" + "b" is actually a horrible way to build up a string if you have to do lots of little additions, so most languages have some other method of making a string of strings/chars (StringBuilder in Java and C#, strings.Builder in Go, etc).
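To make the trap concrete, here is a rough sketch in Python (not one of the languages named above, so treat the mapping as an assumption): `"".join(...)` and `io.StringIO` stand in for the StringBuilder-style types. The function names are made up for illustration. CPython can sometimes avoid the copy on repeated `+` as an implementation detail, so the quadratic cost is the worst case rather than a guarantee.

```python
import io

def concat_naive(parts):
    # Each + produces a new string and, in the general case, copies
    # everything accumulated so far: O(n^2) character copies overall.
    result = ""
    for p in parts:
        result = result + p
    return result

def concat_join(parts):
    # join sizes the result once and copies each piece a single time.
    return "".join(parts)

def concat_buffer(parts):
    # io.StringIO plays roughly the role of a StringBuilder here.
    buf = io.StringIO()
    for p in parts:
        buf.write(p)
    return buf.getvalue()

parts = ["a", "b", "c"] * 1000
# All three approaches build the same string; only the cost differs.
assert concat_naive(parts) == concat_join(parts) == concat_buffer(parts)
```

The builder-style versions only pay off when there are many small pieces; for two or three strings, plain `+` is fine.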




I have a counterargument for you: https://www.researchgate.net/publication/221200232_Algorithm...

Its variant, called rewrite rules, has been used in Haskell's compiler for ages and is at the heart of good vector algorithms (on par with C, including SIMD C).

I can't find a paper I read in 1998 or so in which successive calls to fputc were replaced with fputs and, with some other rules and the same approach, OpenGL code was optimized to be as fast as possible.

It is a pity that research that is twenty years old was not put into the C# compiler.

And for a bonus, look at program distillation: http://meta2012.pereslavl.ru/papers/2012_Jones_Hamilton__Sup...

They transform a quadratic concatenation algorithm (O(n^2)) into a linear one (O(n)).


> most modern compiled languages you can add strings and get a string, but strings are in fact immutable and can't be added without making an entirely new string and disposing of the original two.

In my mind, I add two strings like so:

https://gist.github.com/seisvelas/c11d200d0040a3686e47af0068...

That is, I realloc the first string to fit the second inside of it, then add the second string into the new space. But I have no idea what I'm doing. I'm posting this comment so someone can explain to me why my way is bad and why I should be creating a new string to put the others inside (which apparently is what everyone else does!)


Why are strings made immutable by default in some langs (I have seen this mainly in Java and Python)? Nothing fundamentally requires strings to be so. Has any analysis been done indicating that most string operations in software benefit from the immutable form rather than the mutable one?


At least in Python, string objects are widely used as keys to dictionaries or as options in functions. For speed and efficiency these small strings are "interned" so that there is only ever one instance of the same string.
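A small sketch of what that buys you, using only standard-library calls (`sys.intern`, plain `dict`). The variable names are illustrative, and `bytearray` is used here as a stand-in for "a mutable string", which is an assumption since Python's `str` has no mutable twin.

```python
import sys

# Two equal strings built at runtime...
a = "".join(["string", " key"])
b = "".join(["string", " key"])
# ...are mapped by sys.intern to one shared instance.
ia, ib = sys.intern(a), sys.intern(b)
assert ia is ib

# Immutable strings have a stable hash, so they work as dict keys.
counts = {ia: 1}
counts[ib] += 1
assert counts == {"string key": 2}

# A mutable buffer cannot be a key: its hash could change under the dict.
try:
    {bytearray(b"string key"): 1}
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False
assert not mutable_is_hashable

# Sharing is only safe because nobody can modify the shared value.
s = "foo"
t = s
s += "bar"            # rebinds s to a new string; t is untouched
assert t == "foo"

buf = bytearray(b"foo")
alias = buf
buf += b"bar"         # modifies the buffer in place; alias sees it too
assert alias == bytearray(b"foobar")
```

If strings were mutable, interning would mean that changing one "copy" silently changes every other user of the same text.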

Also for mutable strings you either have to allocate enough memory to fit the final result or have some kind of rope data structure. Or else you end up copying it anyway.
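Roughly what the two non-rope options look like, sketched with Python's `bytearray` as the mutable buffer (an assumption for illustration; the helper names are hypothetical). How much spare capacity `extend` keeps is an implementation detail, so when it runs out the contents may still be copied to a bigger allocation.

```python
def append_in_place(buf: bytearray, extra: bytes) -> None:
    # Grows the existing buffer; may reallocate and copy internally
    # when the current allocation is too small.
    buf.extend(extra)

def build_preallocated(parts) -> bytes:
    # Allocate the final size up front, then copy each piece once.
    total = sum(len(p) for p in parts)
    out = bytearray(total)
    pos = 0
    for p in parts:
        out[pos:pos + len(p)] = p
        pos += len(p)
    return bytes(out)

buf = bytearray(b"hello")
append_in_place(buf, b", world")
assert buf == bytearray(b"hello, world")
assert build_preallocated([b"ab", b"cd", b"ef"]) == b"abcdef"
```

Pre-allocating only works when the final length is known before writing, which is exactly the constraint described above.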



