Coding standards: having more errors in code than code

August 20th, 2009

I ran LINT version 9, configured to report the violations of the rules in the MISRA C++ 2008 coding standard, on a C++ source file. LINT is perhaps the most famous tool for statically checking C and C++ source code. MISRA stands for the Motor Industry Software Reliability Association, mandating adherence to its coding standards throughout the automotive industry.

The source file I tried has several KLOC worth of code, and the output of the preprocessor takes about 1M – pretty normal for C++ where a "Hello, world!" program generates 3/4M of preprocessed output. The output of LINT takes 38M. That's 38x more errors than code.

We're not finished parsing this output so I'm not sure which rules cause most violations and whether they can be clustered somehow to compress the 38M into something resembling comprehensible narrative in contents and size. The only thing basic attempts at parsing revealed at this point is that the distribution of the violations is roughly geometric, with the majority of the errors reporting violations of a minority of the rules.

Therefore, my only way of conveying some insight into the MISRA rules enforced by LINT is to look at a toy example. My example will be a Hello, world program – 2 LOC or 3/4M worth of code depending on your perspective. I'll assume LINT is told to ignore standard libraries, so it will actually be closer to 2 LOC.

#include <iostream>
int main() { std::cout << "Hello, world" << std::endl; }

From this program, LINT will produce 4 error messages when configured to enforce MISRA C++ 2008:

  1. The "int" in "int main" violates an advisory rule to avoid using built-in types and instead use typedefs indicating the size and signedness of the type, such as int32_t, INT or signed32T. Many an automotive project uses a mixture of 2 or 3 of these conventions, which is compliant with the MISRA guidelines and presumably results from the history of merging or integrating code bases and/or teams. (I believe that in the particular case of main, the C and C++ standards both mandate the use of int; I didn't check if you can use a typedef to spell int but I'm certain that you can't have main() return an int32_t on a platform where int is 16b. Anyway, it appears that LINT doesn't bother to special-case main() – but you can do that yourself in its configuration file or right there in the source code, as you will have to do in many other cases.)
  2. The first left shift operator violates a MISRA rule disallowing the use of bitwise shift on signed types, or so it does according to LINT, which presumably checks whether the operands are of an unsigned integral type and reports an error if they are not (the other option is that it figures an output stream or a literal character array are "signed", but I can't see how they can be unless it's a signature we're talking about rather than signedness). The MISRA rule is based on the fact that the behavior of bitwise shifts on signed operands is implementation-defined and thus not portable. I do believe that no 32b machine targeted by an automotive application uses anything other than the 2's complement representation for integers. A notable share of automotive applications use signed integers to represent fixed point numbers, and I believe all of them rely on the 2's complement semantics of bitwise shifts to emulate multiplication and division.
  3. The second left shift operator is reported as violating the same rule.
  4. The two left shift operators as a whole are reported to violate the rule disallowing dependence on C operator precedence. That is, in order to correctly understand this program, a reader would have to know that (std::cout << "Hello, world") would be evaluated first and then its output would be shifted to the left by std::endl. MISRA strives to prevent confusion, based on a well-founded assumption that few programmers know the rules of operator precedence and evaluation order, and LINT enforces the rules defined based on these premises.
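
To make the fourth finding concrete, here is a minimal sketch of the same statement with and without the grouping spelled out. The helper names chained() and grouped() are made up for this illustration, and the output goes to a string stream only so that the two results can be compared. I did not check whether LINT version 9 actually accepts the parenthesized form.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: the statement from the example, unparenthesized.
std::string chained() {
    std::ostringstream out;
    out << "Hello, world" << std::endl;
    return out.str();
}

// The same statement with the left-to-right grouping made explicit.
// operator<< is left-associative, so this is how the compiler parses
// the unparenthesized form anyway.
std::string grouped() {
    std::ostringstream out;
    (out << "Hello, world") << std::endl;
    return out.str();
}
```

Both functions return the same string, so the parentheses change nothing except the number of characters on the line.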

I hope this gives some insight into the general code/errors ratio.

1. niczar, Aug 20, 2009

Those are not errors in code; those are design flaws in either the spec for the standard or LINT.

1. main(int,char**) returns an int. Not an int32_t or an int16_t depending on the platform, but an int, which happens to be an int16_t, an int32_t or an int64_t depending on the platform. This is Unix semantics. The resulting value is an error code, not an integer meant to be calculated upon and subject to overflow. At worst it's a small binary set, again not subject to overflow. It has been so since C and Unix exist. Ignoring it is just strange.

2. Next, ostream & operator<< (const char *) is not the bit shift operator. It is an operator alright. It does not, however, perform a bit shift. Therefore, it is not the bit shift operator. The bit shift operator is (for example) int operator<<(int).
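
A minimal sketch of that distinction, assuming a C++11 compiler. The two static_asserts compare the result types of the built-in shift and of the stream insertion; insert_explicitly() is a hypothetical helper that spells the insertion as the function call it really is.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Built-in shift: both operands are integers and the result is an integer.
static_assert(std::is_same<decltype(1 << 2), int>::value,
              "int << int is the built-in shift and yields int");

// Stream insertion: an overloaded function that happens to be spelled "<<".
// It returns the stream itself, which is what makes chaining possible.
static_assert(std::is_same<decltype(std::declval<std::ostream&>() << "text"),
                           std::ostream&>::value,
              "ostream << const char* returns the stream");

// Hypothetical helper: the insertion written as an explicit function call.
std::string insert_explicitly() {
    std::ostringstream out;
    std::operator<<(out, "Hello, world");  // same as: out << "Hello, world"
    return out.str();
}
```

A checker that looks only at the token "<<" cannot tell these two apart; one that looks at the operand types can.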

Furthermore, the meaning of said operator is clearly defined for signed integers. Why not disallow subtraction on unsigned types, while they're at it? That should prevent underflow!

4. This is simply crazy. This kind of blanket rule is completely, utterly idiotic. '<<' here is much more readable without parentheses, esp. considering that there is no other operator that could preempt it.

And anyone who's graduated grade school, even last of his class, knows that 1+2*3 == 1+(2*3) != (1+2)*3. And anyone who's graduated from junior high knows that 1/2/3 is fishy. And what about expressions with just one commutative operator? Are we to add parenthesis to 1+2+3?
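
The three cases can be checked at compile time. This is a small sketch with integer operands chosen for illustration (24/4/2 instead of 1/2/3, so that integer division does not collapse everything to zero); the function names are hypothetical.

```cpp
#include <cassert>

// Precedence: * binds tighter than +.
static_assert(1 + 2 * 3 == 1 + (2 * 3), "multiplication is applied first");
static_assert(1 + 2 * 3 != (1 + 2) * 3, "not evaluated strictly left to right");

// Associativity: / groups left to right, and the grouping changes the value.
constexpr int chained_division() { return 24 / 4 / 2; }    // parsed as (24 / 4) / 2
constexpr int regrouped_division() { return 24 / (4 / 2); }
static_assert(chained_division() == (24 / 4) / 2, "left-to-right grouping");
static_assert(chained_division() != regrouped_division(), "grouping matters for /");

// Integer addition without overflow gives the same value under either grouping.
static_assert((1 + 2) + 3 == 1 + (2 + 3), "grouping does not matter here");
```

If the file compiles, all of the claims hold for these particular integer operands.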

I don't know who's to blame here, if the LINT tool is really that clueless or if the MISRA standard mandates this behaviour, but I can't begin to understand how you can ignore long standing (25+ years!) C and C++ idioms. If they don't like operator overloading, why don't they remove them altogether and call it Java?

2. James, Aug 20, 2009

Is it really fair to test a standard designed without STL streams in mind against code exclusively made up of STL stream usage? Although the int thing is pretty stupid.

3. niczar, Aug 20, 2009

> Is it really fair to test a standard designed without STL streams in mind against code exclusively made up of STL stream usage?

Writing C++ without the STL (or a substitute) is quite silly. I have to deal with such code, and there is only one reason for it: the devs are morons and insist on reinventing the wheel ... a square, broken wheel. The number of bugs is staggering.

4. Anonymoose, Aug 20, 2009

@niczar in #3:
Yeah, but writing C++ with the *iostreams part of the STL* is just stupid. std::vector is great, but std::ostream (std::ofstream, std::iob, std::hex, std::dec, et cetera et cetera) is just a dangerous and bloated wrapper on top of C's FILE pointers. Stick to FILE pointers for I/O and you'll be much happier.

As for MISRA, yes, many of its rules are stupid. For example, it completely disallows the use of , which means I'm surprised your program compiled at all (since presumably #includes ). As for lint: warnings 2 or 3 are false positives, because the operator is overloaded. Warning 4 is also a false positive, because MISRA C++ Rule 5-0-2 specifically states that an unparenthesized expression is okay if all the operators in it are the same.

5. Kragen Javier Sitaker, Aug 20, 2009

I'm astonished that of your four comments so far, the first one restates what you said in your text and the others all refer to nonexistent "STL streams".

Very amusing.

The idea of MISRA is very much to define a safer subset of C (and apparently C++). It wouldn't surprise me at all if it outlawed operator overloading, for example.

It would be interesting to see what a new "safe" programming language designed without the compromises inherent in C compatibility would look like.

6. niczar, Aug 20, 2009

> others all refer to nonexistent “STL streams”.

What's your point, that iostream is part of the standard C++ lib and not the STL stricto sensu?

> It wouldn’t surprise me at all if it outlawed operator overloading, for example.

Even in standard libraries?

7. Kragen Javier Sitaker, Aug 21, 2009

> What’s your point, that iostream is part of the standard C++ lib and not the STL stricto sensu?

Yeah. It pretty much has nothing in common with the STL. Its interface is full of inheritance, which the STL shuns. It needs funky adaptors to give it an interface the STL can cope with. It was around for at least ten years before the STL was written. Ten years! That's probably longer than your entire programming career.

8. Ben, Aug 21, 2009

The standard C++ lib is a significantly different beast from the STL in my experience.

I don't agree that it is a useless bloated wrapper around FILE* operations (or other C operations for that matter). The streams are extremely useful and important because they provide compile-time type-safety for conversions! Woot!

The use of the ',' operator should be banned. It's got its uses but they are few and far between, and lead to unmaintainable code. Similarly, operator overloading has its uses, but also is quickly abused and leads to implementations which are difficult to maintain.

Lint is a big hassle to use, I will agree with that. You have to weigh the benefits–namely, it does help you detect errors–versus the problems. It depends a lot on context. If you're writing code for the airline industry, please use lint! If you're writing an app for a mobile phone, it's really not such a big deal.

9. Darren, Aug 21, 2009

> It would be interesting to see what a new “safe” programming language designed without the compromises inherent in C compatibility would look like.

It's called Ada.

10. Anonymoose, Aug 23, 2009

@#5:
> The idea of MISRA is very much to define a safer subset of C (and apparently C++). It wouldn’t surprise me at all if it outlawed operator overloading, for example.

FWIW, MISRA C++ 2008 does not outlaw operator overloading in general. It does specifically outlaw overloading of the unary & operator (5-3-3) and the binary && || and comma operators (5-2-11); it also puts [Halting-Problem-level unenforceable] constraints on the semantics of overloaded assignment operators (5-17-1). Please, never start your point by assuming that MISRA isn't written by a bunch of stupidheads, because if you assume that, you're building your entire argument on sand.
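
One commonly cited reason for singling out && and || (my illustration, not a quote from the MISRA text) is that an overloaded version is an ordinary function call, so the right-hand operand is always evaluated and short-circuiting is lost. A minimal sketch, in which Flag, counted() and the counter are all hypothetical names:

```cpp
#include <cassert>

// Hypothetical wrapper type with an overloaded && operator.
struct Flag {
    bool value;
};

bool operator&&(const Flag& lhs, const Flag& rhs) {
    return lhs.value && rhs.value;
}

int calls = 0;  // counts how often a right-hand operand was evaluated

Flag counted(bool v) { ++calls; return Flag{v}; }
bool counted_bool(bool v) { ++calls; return v; }

// Built-in &&: the right operand is skipped when the left one is false.
int builtin_evaluations() {
    calls = 0;
    bool result = false && counted_bool(true);
    (void)result;
    return calls;
}

// Overloaded &&: both arguments are evaluated before the function is called.
int overloaded_evaluations() {
    calls = 0;
    bool result = Flag{false} && counted(true);
    (void)result;
    return calls;
}
```

builtin_evaluations() reports zero evaluations of the right-hand side and overloaded_evaluations() reports one, even though the two expressions look the same.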

@#8: printf() itself is type-safe if your compiler type-checks the format string. This is done by default in GNU GCC and in any commercial compiler based on the EDG front-end. So type-safety is a boogeyman here. My biggest complaint is that C++ streams are stateful; the behavior of e.g. (std::cout << x) depends not just on the static type of x; not just on the *dynamic* type of x at runtime [due to inheritance]; but on the things which have been output to that iostream in the past [e.g., std::hex]. I could also bitch about trying to mix wide character output with "narrow" character output, but C streams have about as many problems with wchars as C++ iostreams, so that part wouldn't be fair.
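
A minimal sketch of the statefulness complaint, using a string stream so the result can be inspected. print_twice() is a hypothetical helper; the only library features it relies on are std::ostringstream and the std::hex manipulator.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: inserts the same int twice with the same expression.
std::string print_twice(int x) {
    std::ostringstream out;
    out << x << ' ';   // decimal, the default
    out << std::hex;   // sticky: changes the stream's state, prints nothing
    out << x;          // same expression as two lines up, different output
    return out.str();
}
```

For example, print_twice(255) yields "255 ff": the second insertion is affected by something that was sent to the stream earlier.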

@#8: You'll be glad to know that MISRA C++ does ban the comma operator outright (5-18-1). This is *in addition* to forbidding operator overloading on it.

11. Kragen Javier Sitaker, Aug 23, 2009

Darren: Ada is not "safe" in the same way as MISRA C. SPARK may be what you're thinking of; it's a set of constraints on Ada analogous to the constraints MISRA puts on C. Ada is what caused the Ariane 5 catastrophe.

Anonymoose: Thank you for the additional information about operator overloading. I did not make any claim about the stupidheadness of the authors of the spec, either positive or negative.

12. Andy, Aug 26, 2009

Sitaker: IMO Ada is evidently much safer than C in a lot of ways. As a consequence SPARK and MISRA are two really different beasts.

SPARK provides, among other things, a sublanguage to implement and verify design-by-contract; MISRA-C, IIRC, doesn't.

In the MISRA-C specs the authors themselves advise the use of better-suited languages in place of C/C++; I strongly believe they were considering (SPARK)Ada.

Finally, the Ariane 5 disaster. It's not clear to me how it can be relevant to the discussion. Anyway, as far as I know, it was caused by an engineering error. Programmers plugged a 16bit version of a module, used for Ariane 4, within the 32bit shuttle system and disabled the appropriate run-time check. Hence the "boom".

13. Kragen Javier Sitaker, Sep 20, 2009

Andy: Yes, Ada omits many of C's pitfalls, but it has some of its own. Thank you for the information about SPARK. I did not know that.

I'm not sure what you're talking about with regard to "a 16-bit version of a module...within the 32bit shuttle system". The proximate cause of the disaster was an uncaught integer overflow exception, thrown by code that didn't need to be running in the first place because nothing was using its output at that point in the flight, and furthermore was using integer variables whose ranges were more appropriate to the lower G-forces in the Ariane 4.

I claim that the failure was caused by Ada in the sense that I know of no implementation of any other language that might be considered in this context (C, C++, assembly, VHDL, Pascal, etc.) that has an integer overflow exception. Instead, in all of these languages, integer overflow silently gives a mathematically absurd result, which would have been a dramatic improvement. Java's checked-exception mechanism (a failed experiment) appears to me to have been a response to Ariane 5.
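
The two policies being contrasted can be sketched in C++ as follows. This is only an illustration of "raise an error" versus "carry on with some value"; the function names, the 16-bit target type and the choice of saturation as the quiet policy are my assumptions, not a description of the flight software. (Silent wraparound is not shown because converting an out-of-range floating point value to an integer type is undefined behavior in C++.)

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Checked conversion: raises an error when the value does not fit,
// roughly the policy of a language with range-checked conversions.
std::int16_t checked_narrow(double v) {
    const double lo = std::numeric_limits<std::int16_t>::min();
    const double hi = std::numeric_limits<std::int16_t>::max();
    if (!(v >= lo && v <= hi)) {  // also rejects NaN
        throw std::range_error("value does not fit in 16 bits");
    }
    return static_cast<std::int16_t>(v);
}

// Saturating conversion: never fails, clamps to the nearest representable value.
std::int16_t saturating_narrow(double v) {
    const double lo = std::numeric_limits<std::int16_t>::min();
    const double hi = std::numeric_limits<std::int16_t>::max();
    if (v != v) return 0;  // NaN: pick a fixed value
    if (v < lo) return std::numeric_limits<std::int16_t>::min();
    if (v > hi) return std::numeric_limits<std::int16_t>::max();
    return static_cast<std::int16_t>(v);
}
```

With an input of 40000.0, the first function throws and the second returns 32767. Which of the two is "safer" depends entirely on whether anything handles the exception and whether anything depends on the clamped value.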

(The only other language I know of that has an integer overflow exception is old versions of Python, which presumably means that ABC did the same thing.)

14. a nony mouse, Oct 21, 2011

niczar: while you are correct on much of what you say, the way you say it leaves much to be desired.

MISRA bans bitwise ops on signed integers because bitwise ops on signed integers are implementation dependent in C/C++. The notable case is what happens when you >> a negative number. Popular implementation choices are logical shift (zero-fill) and arithmetic shift (value divides by a power of 2). Later languages added the >>> operator to make the choice programmer explicit.
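
A minimal sketch of how each choice can be requested explicitly instead of being left to the implementation. It assumes 32-bit two's complement integers and a shift count below 32; the function names are hypothetical. (As far as I know, C++20 has since defined >> on signed values as an arithmetic shift, but older standards leave the negative case to the implementation.)

```cpp
#include <cassert>
#include <cstdint>

// Logical (zero-fill) right shift: shift the unsigned representation.
std::uint32_t logical_shift_right(std::int32_t value, unsigned count) {
    return static_cast<std::uint32_t>(value) >> count;
}

// Arithmetic right shift (floor division by 2^count) that never applies >>
// to a negative value.
std::int32_t arithmetic_shift_right(std::int32_t value, unsigned count) {
    if (value >= 0) {
        return value >> count;
    }
    // ~value is non-negative here; complementing again after the shift
    // fills the vacated high bits with ones.
    return ~((~value) >> count);
}
```

For -8 shifted by one, the logical version gives 0x7FFFFFFC and the arithmetic version gives -4.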

You mention "Are we to add parenthesis to 1+2+3?". Floating point math is not associative. Any "idiot" who has taken a scientific computing class, undergrad or grad, knows that (A+B)+C is not the same as A+(B+C).
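
A small sketch of the floating point case, assuming IEEE 754 doubles and no "fast math" compiler options. The operand values are chosen for illustration so that the small term is absorbed under one grouping and survives under the other.

```cpp
#include <cassert>

// The same three operands, grouped in the two possible ways.
double left_to_right(double a, double b, double c) { return (a + b) + c; }
double right_to_left(double a, double b, double c) { return a + (b + c); }
```

With a = 1e30, b = -1e30 and c = 1.0, the first grouping cancels the large terms and returns 1.0, while the second loses the 1.0 when it is added to -1e30 and returns 0.0.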

15. a nony mouse, Oct 21, 2011

Having been an advocate of Embedded C++ at several previous positions, I was hoping for a lot more than what MISRA C++ 2008 delivered. It was too thin, and the recommendations missed way too much.

For example, the ban on C++ exceptions without further guidance (like how to do a two-phase init correctly) left much to be desired. Not giving guidance on OO anti-patterns was another. I seem to recall it missing the rule-of-thumb to "make non-leaf classes abstract", which is precisely the thing a high-safety/high-security standard should be mentioning.

At a minimum, MISRA C++ should have been a super-set of Scott Meyers's 3 books. It should have had additional gems from Sutter and Stroustrup. Sadly, it did not.

What almost everyone misses when talking about MISRA is that it is a coding standards PROCESS, not (just) a collection of style rules. Dev teams are free to pick and choose and modify which style rules they follow. They just have to follow the process defined by MISRA to justify/document their deviations from the standard ones.

16. Yossi Kreinin, Oct 22, 2011

The MISRA standard document is mostly a collection of style rules, with a small preamble telling how you could deviate from rules (presumably since otherwise its acceptance would be largely impossible) and strongly advising you not to.

Have you advocated for Embedded C++ as a part of a committee working on its definition, or as a developer or manager inside an organization which you wanted to adopt that standard?


