Buffer Overflows and String Truncation

Today I will be talking about proper string buffer handling with
strlcpy(3), strlcat(3), and snprintf(3).

Many people now know to avoid the unbounded string functions (strcpy(3),
strcat(3), and sprintf(3)), which are prone to buffer overflows:

    char newstr[9];
    const char *oldstr = "rm -rf /home/ray/tmp/*";
    /* Buffer overflow in all cases. */
    strcpy(newstr, oldstr);
    strcat(newstr, oldstr);
    sprintf(newstr, oldstr);

The bounded string functions take a buffer size argument
and never write past that boundary, preventing buffer overflows.
For arrays, the buffer size can be calculated with the sizeof operator.
For malloc(3) allocated buffers, the size parameter given to malloc(3)
can be used as the buffer size.
The following examples use the sizeof idiom,
but each sizeof instance can be replaced with the buffer size.
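
For a malloc(3) allocated buffer, the idea can be sketched as follows.
This is a minimal illustration, not a definitive implementation:
make_greeting and its "hello, %s" format are hypothetical names
made up for this example.
The point is only that the size handed to malloc(3) is kept
and reused as the bound.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate bufsize bytes and format name into
 * them.  The same bufsize given to malloc(3) is reused as the bound
 * for snprintf(3).  Returns NULL on allocation failure, on a
 * formatting error, or if the result would not fit.
 */
char *
make_greeting(const char *name, size_t bufsize)
{
    char *buf;
    int n;

    if ((buf = malloc(bufsize)) == NULL)
        return (NULL);
    n = snprintf(buf, bufsize, "hello, %s", name);
    if (n < 0 || (size_t)n >= bufsize) {
        free(buf);
        return (NULL);
    }
    return (buf);
}
```

The caller owns the returned buffer and releases it with free(3).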

Note: the sizeof operator does not work with arrays passed as
function arguments:

    char real_array[BUFSIZ];
    void
    function(char array[], char array2[BUFSIZ], char *ptr)
    {
        /* sizeof(real_array) != sizeof(array) */
        /* sizeof(real_array) != sizeof(array2) */
        /* sizeof(real_array) != sizeof(ptr) */
    }
    int
    main(int argc, char *argv[])
    {
        function(real_array, real_array, real_array);
        return (0);
    }

In the example above, each parameter is a pointer to the array's
first element, not the array itself,
even when it is declared with array syntax.
Thus the sizeof operator returns the pointer size
instead of the array size.
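
One common way around this is to pass the buffer size
as a separate argument.
The sketch below assumes a hypothetical helper named copy_name;
it is not part of any library.

```c
#include <stdio.h>

/*
 * Hypothetical helper: dst is only a pointer here, so the caller
 * must supply the array size.  Returns 0 on success and -1 on
 * error or truncation.
 */
int
copy_name(char *dst, size_t dstsize, const char *src)
{
    int n;

    /* sizeof(dst) here would be the pointer size, not the array size. */
    n = snprintf(dst, dstsize, "%s", src);
    if (n < 0 || (size_t)n >= dstsize)
        return (-1);
    return (0);
}
```

The caller, which still sees the real array, computes the size:
copy_name(real_array, sizeof(real_array), oldstr).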

The bounded variants of the previous three string functions are
strlcpy(3), strlcat(3), and snprintf(3).
These functions truncate the resulting string as necessary,
preventing overflow:

    char newstr[9];
    const char *oldstr = "rm -rf /home/ray/tmp/*";
    /* Truncation in all cases. */
    strlcpy(newstr, oldstr, sizeof(newstr));
    strlcat(newstr, oldstr, sizeof(newstr));
    snprintf(newstr, sizeof(newstr), oldstr);

strncpy(3) and strncat(3) are also bounded functions,
but they suffer from an unwieldy API.
This is how to properly use strncpy(3) and strncat(3)
to always create NUL-terminated strings and to never overflow:

    char newstr[9];
    const char *oldstr = "rm -rf /home/ray/tmp/*";
    /* Truncation in all cases. */
    strncpy(newstr, oldstr, sizeof(newstr) - 1);
    newstr[sizeof(newstr) - 1] = '\0';
    strncat(newstr, oldstr, sizeof(newstr) - 1 - strlen(newstr));

Compare the above to strlcpy(3) and strlcat(3).
The strncpy(3) and strncat(3) functions are complicated,
error-prone, and strongly discouraged.
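
To illustrate how much bookkeeping strncpy(3) needs,
here is a sketch of a wrapper that both terminates the result
and reports truncation.
The name strncpy_checked is hypothetical,
and the wrapper assumes dstsize is at least 1.

```c
#include <string.h>

/*
 * Hypothetical wrapper around strncpy(3).  Returns 1 if src was
 * truncated and 0 otherwise.  dstsize must be at least 1.
 */
int
strncpy_checked(char *dst, const char *src, size_t dstsize)
{
    /* strncpy(3) does not NUL-terminate when src fills the bound. */
    strncpy(dst, src, dstsize - 1);
    dst[dstsize - 1] = '\0';
    /* strncpy(3) reports nothing, so truncation needs its own check. */
    return (strlen(src) >= dstsize);
}
```

Three separate steps are needed for what is conceptually
a single bounded copy.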

Now buffer overflows are prevented,
but a new problem arises: truncation.
Undetected truncation can be deadly:
do you really want to execute the truncated string produced above?
To aid truncation detection,
strlcpy(3), strlcat(3), and snprintf(3) return the length
the resulting string would have had if truncation had not occurred.
If this value is greater than or equal to the destination buffer size,
truncation has occurred:

    char newstr[9];
    const char *oldstr = "rm -rf /home/ray/tmp/*";
    /* Detected truncation in all cases. */
    if (strlcpy(newstr, oldstr, sizeof(newstr)) >= sizeof(newstr))
        warnx("truncation");
    if (strlcat(newstr, oldstr, sizeof(newstr)) >= sizeof(newstr))
        warnx("truncation");
    if (snprintf(newstr, sizeof(newstr), oldstr) >= sizeof(newstr))
        warnx("truncation");
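
The return value convention may be easier to see in code.
The following is an illustrative stand-in for strlcpy(3),
written for this explanation under the name sketch_strlcpy;
it is not the libc implementation.
It copies what fits and returns the full source length,
so the caller can compare that against the buffer size.

```c
#include <string.h>

/*
 * Illustrative stand-in for strlcpy(3), not the libc implementation.
 * Copies at most dstsize - 1 bytes, always NUL-terminates when
 * dstsize > 0, and returns strlen(src).
 */
size_t
sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);
    size_t n;

    if (dstsize > 0) {
        /* Copy the whole string if it fits, else dstsize - 1 bytes. */
        n = srclen < dstsize ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return (srclen);
}
```

A return value of dstsize or more means at least one byte of src,
or the terminating NUL, did not fit.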

snprintf(3) comes from the printf(3) family and inherits all its quirks.
Because of this inheritance snprintf(3)
needs more than just truncation checks.
For example, it interprets oldstr as a format string.
Percent signs (%) in the string are parsed as conversion specifications.
Below, "%o" reads an argument that was never passed,
which is undefined behavior:

    char newstr[9];
    const char *oldstr = "100%off!!!";
    /* Format string error. */
    if (snprintf(newstr, sizeof(newstr), oldstr) >= sizeof(newstr))
        warnx("truncation");

To prevent this, always pass such strings as an argument to a
"%s" format rather than as the format string itself.
This rule applies to all printf(3) functions:

    char newstr[9];
    const char *oldstr = "100%off!!!";
    /* No format string error; truncation is detected. */
    if (snprintf(newstr, sizeof(newstr), "%s", oldstr) >=
        sizeof(newstr))
        warnx("truncation");

Additionally, the printf(3) family returns a negative value
on error, so the return value must be tested for that as well:

    int i;
    char newstr[9];
    const char *oldstr = "100%off!!!";
    /* Errors and truncation are both detected. */
    i = snprintf(newstr, sizeof(newstr), "%s", oldstr);
    if (i < 0 || i >= sizeof(newstr))
        warnx("snprintf");

The above example demonstrates snprintf(3)’s last quirk:
it takes a size_t for the buffer size but returns an int.
This means that truncation detection involves
comparing an int to a size_t.
This can be avoided by using strlcpy(3) and strlcat(3),
if your format string does nothing but string concatenation.
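
When the format does more than concatenate strings,
snprintf(3) is still needed,
and the mixed int and size_t comparison can be made explicit.
The sketch below uses a hypothetical helper, format_pair,
with a made-up "%s=%d" format.
It tests the sign first and only then converts to size_t.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 0 if the formatted result fit in
 * dstsize bytes, -1 on an snprintf(3) error or on truncation.
 */
int
format_pair(char *dst, size_t dstsize, const char *key, int value)
{
    int n;

    n = snprintf(dst, dstsize, "%s=%d", key, value);
    /* Test the sign first so the cast to size_t is safe. */
    if (n < 0)
        return (-1);
    if ((size_t)n >= dstsize)
        return (-1);
    return (0);
}
```

Checking for a negative value before the cast matters:
a negative int converted to size_t becomes a very large number
and would be misreported as truncation rather than as an error.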

Here’s a recap of correct string usage:

    int i;
    char newstr[9];
    const char *oldstr = "100%off!!!";
    /* Detect all errors. */
    if (strlcpy(newstr, oldstr, sizeof(newstr)) >= sizeof(newstr))
        warnx("truncation");
    if (strlcat(newstr, oldstr, sizeof(newstr)) >= sizeof(newstr))
        warnx("truncation");
    i = snprintf(newstr, sizeof(newstr), "%s", oldstr);
    if (i < 0 || i >= sizeof(newstr))
        warnx("snprintf");

I hope that this has been helpful,
but before submitting any patches please be sure they are correct
and solve actual problems.
Thanks!