If the length field is 4 bytes then only 3 bytes are "wasted" compared to C with...

yongjik · on Dec 10, 2021

You are thinking as a developer in 2020s, not one in 1990s (or earlier). Memory was incredibly precious: 16-bit x86 had 64KB segments, so if your data didn't fit in, it would be a lot slower. People used nibbles (4 bits) because the extra instructions dealing with bit twiddling was worth the cost.

Basically, no sane programmer in the 90s would be happy with a string type that wasted three bytes per object.

masklinn · on Dec 11, 2021

I think one of the other factors (I fear calling it "mitigating") is that fixed-width strings were a lot more common back in the days. Outside of serialization they're pretty much gone now, but we can see their mark in various oddball "string" functions of the C standard library which were never designed to operate on "C strings" (though strings.h also has a lot of functions which are just plain garbage with no redeeming features).

For instance strncmp and (especially) strncpy make very little sense with C strings, but make sense for NUL-padded strings.

Jtsummers · on Dec 10, 2021

CPU cycles were also incredibly precious. It's a tradeoff. In the 80s and 90s you also had smaller caches so iterating over a string to determine its length was more expensive (more likely to hit RAM) than "just" reading its length parameter and carrying on with your life.

yongjik · on Dec 10, 2021

Sure everything was more expensive, but not by the same factor. Main memory was smaller but also relatively faster compared to CPU. Search for "386 simm memory" and you'll see 60ns modules. Considering that 386 debuted with 12MHz clock, 60 ns is faster than one CPU clock cycle!

In other words, "reading the whole string from memory" could be a performance problem, but a less serious problem for machines of those days, compared to using a few more bytes to store the length.

fiedzia · on Dec 10, 2021

> Basically, no sane programmer in the 90s would be happy with a string type that wasted three bytes per object.

In general, maybe, but we are talking about Borland here, so business logic apps mostly. String size is not a problem there.

musicale · on Dec 13, 2021

> Basically, no sane programmer in the 90s would be happy with a string type that wasted three bytes per object.

Though it was eventually eclipsed by C/C++/Objective-C on Apple platforms, I believe Pascal was the original application programming language for the Apple Lisa and Macintosh, and produced some revolutionary software in the 1980s.

Object Pascal/Delphi certainly enjoyed a fair amount of success in the 1990s.

pjmlp · on Dec 10, 2021

Yet JOVIAL, NEWP, PL/I, PL/S, PL.8 among other Algol dialects managed it.

inkyoto · on Dec 11, 2021

> JOVIAL, NEWP, PL/I, PL/S, PL.8 among other Algol dialects managed it.

… with all of them having been developed and run on 32-bit IBM mainframes and other big irons with «lots» of memory (e.g. 512 kB of RAM would have been considered huge in early 70-s).

C, on the other hand, was developed with limitations of early PDP-11's in mind that were often equipped with 56kB of RAM, so null-terminated strings in C was a rationalised design decision and/or a trade-off. Besides, both, UNIX and C started out as a research and a professional hobby project, not a fully fledged commercial product.

Since the internetworking had been inexistent slightly less than entirely, remote code execution that stemmed from buffer overruns was not an issue, either.

pjmlp · on Dec 11, 2021

You can start with IBM 704 used for Fortran and Lisp in 1954, TT 465L Strategic Air Command Control System in 1960, B5000 in 1961, CDC 6600 in 1964, and then compare with the capabilities from a 1964 PDP-7.

Read the DoD security assessment on Multics, https://multicians.org/b2.html

Afterwards you can read from Denis' own words,

> Although we entertained occasional thoughts about implementing one of the major languages of the time like Fortran, PL/I, or Algol 68, such a project seemed hopelessly large for our resources: much simpler and smaller tools were called for. All these languages influenced our work, but it was more fun to do things on our own.

Taken from https://www.bell-labs.com/usr/dmr/www/chist.html

So thanks to their fun, the world now suffers from C strings.

inkyoto · on Dec 11, 2021

> Although we entertained occasional thoughts about implementing one of the major languages of the time like Fortran, PL/I, or Algol 68, such a project seemed hopelessly large for our resources: much simpler and smaller tools were called for […]

Precisely my point. The definition of fun is up for a personal interpretation.

jstimpfle · on Dec 10, 2021

Scanning for the string length (e.g strlen()) is asymptotically worse than reading a fixed size integer, so obviously don't do that unless it's a good memory/speed tradeoff (i.e. when you know the string is at most say, 16 bytes long).

Overall, it seems you didn't read my comment either. Or was I _that_ unclear?