Bug 11223 - Incorrect use of isspace
Incorrect use of isspace
Status: NEW
Product: Samba 4.1 and newer
Classification: Unclassified
Component: libsmbclient
All All
: P5 normal
: ---
Assigned To: Samba QA Contact
Samba QA Contact
Depends on:
  Show dependency treegraph
Reported: 2015-04-18 02:45 UTC by M Welinder
Modified: 2015-04-27 12:17 UTC (History)
0 users

See Also:

Patch (1.61 KB, patch)
2015-04-26 09:24 UTC, Volker Lendecke
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description M Welinder 2015-04-18 02:45:31 UTC
isspace should not be called on arguments of type char, unless the
character value is known not to be negative.


isspace (isdigit, isalpha, ...) etc. are only defined for the special
value EOF and for values in the range of unsigned char.  In practice
that means -1 and 0-255.  Anything else is undefined behaviour.
Comment 1 Volker Lendecke 2015-04-26 09:24:56 UTC
Created attachment 10988 [details]

Does this fix it?
Comment 2 M Welinder 2015-04-26 20:43:54 UTC
Certainly, even if it is a bit overkill.
Comment 3 Volker Lendecke 2015-04-27 09:33:47 UTC
Do you have a simpler approach? We have quite some code that expects "char" instead of "unsigned char" in the callbacks.
Comment 4 M Welinder 2015-04-27 11:48:19 UTC
It's a really messy corner of the C and C++ standards.  You can do
it a bit simpler like this:

+static bool c_isspace(char c)
+	return isspace((unsigned char)c);

This is different from your version in how it handles non-ascii.  The above
respects the locale setting, yours ignores anything outside ascii range.

One can certainly argue over which one is right.  It basically comes down
to the question of whether it is correct to treat the same file differently
for two users with different locale settings.
Comment 5 Volker Lendecke 2015-04-27 11:58:04 UTC
What does the standard say about casting of a negative number in "char" representation, say -100, to unsigned char? Is that defined?
Comment 6 M Welinder 2015-04-27 12:10:28 UTC
That's well-defined per C99's section item #2.

       [#2]  Otherwise,  if  the new type is unsigned, the value is
       converted by repeatedly adding or subtracting one more  than
       the  maximum  value  that can be represented in the new type
       until the value is in the range of the new type.

(i.e., two's complement.)


On a two's complement machine -- anything not in the "ancient" section of
a museum -- char-to-unsigned-char conversion doesn't even involve run-time
code.  The bit patterns for (signed char)-100 and (unsigned char)-100 are
Comment 7 Volker Lendecke 2015-04-27 12:17:33 UTC
(In reply to M Welinder from comment #6)

Thanks -- but what does this mean for isspace? :-)

However, as you confirmed that my initial patch does solve the problem of undefined behaviour, I'll propose it for inclusion.