
`strncpy` is commonly misunderstood. Its name misleads people into thinking it's a safely-truncating version of `strcpy`. It's not.

I've seen a lot of code where people changed `strcpy` to `strncpy` because they thought that was safety and security best practice, sometimes even creating a new security vulnerability which wasn't there with `strcpy`.

`strncpy` does two unexpected things which lead to safety, security and performance issues, especially in large codebases where the destination buffers are passed to other code:

• `strncpy` does NOT zero-terminate the copied string if it limits the length.

Whatever reads the copied string later is vulnerable to a buffer read overrun and junk characters appended to the string, unless the reader knows the exact buffer length and is strict about NOT treating it as a null-terminated string. That's unusual C, so it's rarely done correctly. It also doesn't show up in testing or normal use if the length limit is only there "for safety" and nobody enters data that large. (Both behaviours are shown in the sketch after these two points.)

• `strncpy` writes the entire destination buffer with zeros after the copied string.

Usually this isn't a safety and security problem, but it can be terrible for performance if large buffers are being used to ensure there's room for all likely input data.
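
To make those two behaviours concrete, here's a minimal standalone sketch (not code from that application):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* Behaviour 1: truncation without a terminating NUL. */
        char small[4];
        strncpy(small, "hello", sizeof small);  /* copies 'h','e','l','l', no '\0' */
        /* printf("%s", small) here would read past the end of `small`. */

        /* Behaviour 2: the rest of the destination is filled with zeros. */
        char big[10000];
        strncpy(big, "hi", sizeof big);         /* writes 2 chars, then 9998 zero bytes */

        printf("%zu bytes written to store a 2-character string\n", sizeof big);
        return 0;
    }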

I've seen these issues in large, commercial C code, with unfortunate effects:

The code had a security fault because, under some circumstances, a password check would read characters past the end of a buffer due to the lack of a zero terminator which authors over the years had assumed would always be there.

A password change function could set the new password to something other than what the user entered, so they couldn't log in afterwards.

The code was assumed to be "fast" because it was C, and avoided "slow" memory allocation and a string API when processing strings. It used preallocated char arrays all over the place to hold temporary strings and `strncpy` to "safely" copy. They were wrong: It would have run faster with a clean string API that did allocations (for multiple reasons, not just `strncpy`).

Those char arrays had the slight inconvenience of causing oddly mismatched string length limits in text fields all over the place. But it was worth it for performance, they thought. To avoid that being a real problem, buffers tended to be sized "larger" than any likely value: 256, 1000, 10000 or other arbitrary lengths plucked at random depending on developer mood at the time, and mismatched between countless different places in the large codebase. `strncpy` was used to write to them.

Using `malloc`, or better, a proper string object API, would have run much faster in real use, while also being safer and cleaner code.
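
As a minimal sketch of the allocating alternative, assuming nothing beyond the standard library (the name `copy_string` is made up; it's essentially `strdup`):

    #include <stdlib.h>
    #include <string.h>

    /* Copy a string into exactly as much memory as it needs. Always
       NUL-terminated, never zero-fills a huge fixed-size buffer.
       Caller frees the result. */
    char *copy_string(const char *src)
    {
        size_t len = strlen(src);
        char *dst = malloc(len + 1);
        if (dst != NULL)
            memcpy(dst, src, len + 1);  /* the +1 copies the terminating '\0' too */
        return dst;
    }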

Even worse, sometimes strings would be appended in pieces, each time using `strncpy` with the remaining length of the destination buffer. That filled the destination with zeros repeatedly, for every few characters appended. Sometimes that caused user interactions which would take milliseconds if coded properly to take minutes instead.
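
A rough sketch of what that looked like, next to an offset-tracking version; the piece values and the 10000-byte size are made up for illustration:

    #include <stdio.h>
    #include <string.h>

    #define BUF_SIZE 10000

    int main(void)
    {
        /* Hypothetical pieces to append; in the real code these came from
           many different call sites. */
        const char *pieces[] = { "alpha", "beta", "gamma" };
        const size_t npieces = sizeof pieces / sizeof pieces[0];

        /* Roughly the anti-pattern: every append re-scans the string AND
           zero-fills the rest of the 10000-byte buffer, so the whole buffer
           is rewritten once per appended piece. */
        char slow[BUF_SIZE] = "";
        for (size_t i = 0; i < npieces; i++) {
            size_t used = strlen(slow);
            strncpy(slow + used, pieces[i], sizeof slow - used);
        }
        slow[BUF_SIZE - 1] = '\0';

        /* Tracking the end position instead: each byte of the buffer is
           written at most once, and the result is always NUL-terminated. */
        char fast[BUF_SIZE];
        size_t used = 0;
        for (size_t i = 0; i < npieces; i++) {
            size_t len = strlen(pieces[i]);
            if (len > BUF_SIZE - 1 - used)
                len = BUF_SIZE - 1 - used;  /* truncate explicitly */
            memcpy(fast + used, pieces[i], len);
            used += len;
        }
        fast[used] = '\0';

        printf("%s\n%s\n", slow, fast);
        return 0;
    }

The repeated `strlen` plus the repeated zero-fill is what turns a linear append into something closer to quadratic work.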

Ironically, even a slow scripting language like Python, using its ordinary string type, would probably have run faster than the C application. (Also, Python dictionaries would have been faster than that application's buggy C hash tables, which took O(n) lookup time, and SQLite database tables would have been faster, smaller and simpler than the slow and large C "optimised" data structures they used to store data.)



