Efficiency is fundamentally at odds with elegance

Q: In retrospect, wasn't the decision to trade off programmer efficiency, security, and software reliability in exchange for runtime performance a fundamental mistake?

A: Well, I don’t think I made such a tradeoff. I want elegant and efficient code. Sometimes I get it. The efficiency vs. correctness, efficiency vs. programmer time, efficiency vs. high level, etc. dichotomies are largely bogus.

An interview with Bjarne Stroustrup

Unlimited-precision symbolic computation is more elegant than floating point numbers. You simply never have any numerical stability problems. Anything algebraically correct – a valid way to solve for x given the equations involving x – is also computationally correct. You don't need to know all the quirks of floating point, and you won't need "numerical recipes" which are basically ways to deal with these quirks.

Symbolic computation is rather widely available – say, in Mathematica, Matlab/maple, etc. – but it's not used nearly as much as floating point. That's because floating point is much more efficient, and a whole lot of things can not be done in a reasonable amount of time and space with symbolic computation.

It is undeniable that this efficiency comes at a cost of correctness (in terms of increased likelihood of bugs), programmer time, and elegance. There are plenty of algebraically elegant solutions which just don't work in floating point. If you don't notice, you have a bug; if you do notice, you spend your (programmer) time looking for an alternative, and said alternative may be less elegant.

Floating point is not the lowest level we can sink to. In many cases the quest for still more efficiency brings us to the dark quagmires of fixed point arithmetic. That's when you have an integer and you implicitly assume it's divided by 2^N – the exponent is statically known so the point isn't floating at run time, so to say. So to implement a+b you just use integer addition, and a*b is an integer multiplication followed by a right shift by N. Or there are midway scenarios where you have many integers with a common, dynamically computed exponent because they all have roughly the same range (FFT is one case where this is often done.)

Fixed point is so ugly that there's not even a recipes book that I'm aware of; it just doesn't come out tasty in the slightest. The biggest trouble isn't even the very likely overflow but the loss of precision: floating point guarantees a certain number of significant bits in the mantissa, while fixed point doesn't – unless you explicitly normalize the number at some point using CLZ or similar (making it more of a floating point emulation than "true" fixed point where exponents aren't represented at runtime). You think you have a 32-bit mantissa and an implicit exponent so the number is rather precise – but those 32 bits can have most of the high bits set to zero and then it's not precise at all.

However, a mixture of 8-bit, 16-bit and 32-bit fixed point operations often beats the performance of floating point by a large margin; especially if you have SIMD instructions, because you can always fit 4x more 8-bit numbers into a register than 32-bit numbers, and a single-precision floating point multiplier is ~10x more costly in hardware than an 8-bit integer multiplier so you have less of them.

Again, this efficiency comes at a cost of correctness (in terms of increased likelihood of bugs), programmer time, and elegance.

In computer vision, you're often looking for objects of a certain class, and you have a classifier taking a rectangular image region and telling whether this region contains an object of that class. A simple and elegant object detection algorithm is to apply this classifier to every possible rectangle in the image, and then remove rectangles which mostly overlap (as in, if there are 15 similar rectangles saying there's a face in roughly the same place, make one rectangle out of them all).

This elegant algorithm is never used, because there are too many possible rectangles (every coordinate times every size). A common optimization is to use a cascade of classifiers. That is, apply a very cheap classifier with a lot of false positives but hopefully almost no false negatives to every region. The purpose is to throw away most of the rectangles so that the remaining smaller set still contains all the true positives – and a lot of false positives, of course, but much less.

This is repeated with many (possibly increasingly expensive) classifiers processing increasingly smaller sets of rectangles. The most widely deployed classifier cascade is probably the Viola-Jones face detector, currently available in most digital cameras displaying little squares around faces. As you could have noticed, it often misses a face, which is to be expected with all the little classifiers hurrying to throw rectangles away. And which is OK for a consumer application where a success rate of 90-95% is perfectly fine and an extra 1% of detection rate is not worth a $0.01 increase in price. The point is that the error rate is undeniably increased by stricter efficiency requirements.

The upshot is that object detection provides a broad family of examples where, again, runtime efficiency comes at a cost of correctness (in terms of increased likelihood of bugs – there's much more code to write – as well as the ultimate detection rate), programmer time, and elegance.

(Even the smallish sub-problem of merging overlapping rectangles provides an example where efficiency has to be bought with all those other things including elegance. A short, readable, elegant solution could use an O(N^2) nested loop where each rectangle is intersected with every other rectangle. One optimization is some sort of spatial data structure where you don't look at rectangles if they don't fall into the same bin of the spatial subdivision because then they can't intersect. That's faster, more buggy and less readable.)

Does this have anything to do with the quote by Stroustrup though? His implied point was how the std::sort template is more elegant as well as more efficient than C's qsort function fiddling with void pointers, right? Or ostream vs printf? Whereas these are all examples of "algorithmic efficiency" – is that even related to language design?

Well, the thing is, "algorithms" and "languages/code" is a continuum:

Think of all the psychic energy expended in seeking a fundamental distinction between "algorithm" and "program". — Alan Perlis

Given that it's a continuum, it is doubtful that a statement which is profoundly wrong in an "algorithmic" context could be true in a "programming" context. If the tradeoff between runtime efficiency and programmer efficiency/"elegance" is fundamental from an algorithmic point of view, then it's likely fundamental in computing in general.

For a concrete example of how blurry the line between "algorithmic efficiency" and "code efficiency" is, let's discuss corner detection. The FAST corner detector is a decision tree looking at pixels surrounding the central pixel and comparing the image intensity of the center to its surroundings. Similarly to other classifier cascades, "not a corner" is a quick decision, while "yes, a corner" is decided after all the checks are done.

The decision tree is implemented in several thousands of auto-generated C code lines with gotos. (That's one addition to the recent discussion about the utility of gotos in systems programming; add computer vision to the list of goto applications, I guess.)

Is it possible to implement the decision tree in a more elegant and readable way? Of course – but at the cost of efficiency; not asymptotic efficiency since it'd be the same decision tree, but efficiency nonetheless.

Is this goto business an "algorithmic" optimization or a "program" optimization? Consider that FAST's entire raison d'etre is being faster than, say, the Harris corner detector. Constants matter for high-resolution images processed in real time.

Consider furthermore that both FAST and Harris are O(#pixels) since they look at a finite, small number of pixels around each coordinate and execute a finite, small number of operations. Consider that which is more efficient depends on the platform – SIMD helps speed up Harris but not FAST, and different SIMD instruction sets speed it up by very different factors. (This is also true for linear classifiers vs Viola Jones and for other cases.) And consider the fact that algorithmically, they're wildly different – Harris looks at eigenvectors whereas FAST is an intensity-based decision tree, they have tunable decision thresholds with different meaning, and different sets of false positives and false negatives.

So is FAST a work in the area of "algorithms" or "programming", and is the auto-generated mountain of code essential to make it efficient an "algorithm" or a "program"? My answer is that it's both, in the sense that you can't really draw the line.

But what about std::sort and C++'s combination of efficiency and elegance? Well, C++ rather obviously does pay with programmer efficiency for runtime efficiency, without an option to opt out of the deal. Every allocation can leak, every reference can be dangling, every buffer can overflow, etc. etc. etc.

This blindingly obvious fact doesn't surprise those who realize the fundamental tradeoff between efficiency and a whole lot of other things, some of which can be collectively called "elegance". Whereas those refusing to believe in such a tradeoff manage to not even notice the consequences. For example:

The relatively small size of the C++ standard library – primarily reflecting the lack of resources in the ISO C++ standards committee – compared with the huge corporate libraries can be a real disadvantage for C++ developers compared to users of proprietary languages.

…So why do languages without corporate backing which are 2 to 3 times younger than C++, such as Perl, Python, and Ruby, have so much more libraries, both standard and non-standard, but widely used?

The best uses of C++ involve deliberate design. You design classes to represent the notions of your application, you organize those classes into hierarchies, you express your algorithms precisely and abstractly (no, that “and” is not a mistake), you use libraries, you build libraries, you devise error handling and resource management strategies and express them in code. The result can be beautiful, efficient, maintainable, etc. However, it’s not just sitting down and writing a series of statements directly manipulating characters, integers, and floating point numbers.

The thing is that actually doing something useful involves a whole lot of "direct manipulations" of characters, integers and floating point numbers – and strings, arrays, hash tables, files, sockets, windows, matrices, etc. Languages which let you "just sit down and write the series of statements" give programmers the extra productivity which results in all those extra libraries getting written.

However, equally undeniably it does cost you runtime efficiency, because you pay an overhead for built-in resource management strategies such as garbage collection, built-in error detection strategies such bounds checking, and a whole lot of other things.

It's not surprising that Stroustrup sees the problem in the fact that corporations "with the resources" invest them in what he thinks is the wrong thing, presumably because of their self-interested profit motives. Alex Stepanov who designed the STL expressed similar statements, and so did Alan Kay and every other perfectionist technologist. If you seek perfection to the point of denying the existence of most obvious tradeoffs – and tradeoffs are a pesky thing for a perfectionist because they imply that perfection is unattainable – then you're also likely to somewhat resent corporations, markets, etc. For a discussion of that, see my take on Worse Is Better vs The Right Thing.

(Of course there are plenty of perfectionists who, instead of rationalizing C++'s productivity problems, spend their time denying that Python is slow, or keep waiting for Python to become fast. It will not become fast. Also, all its combinations with C/C++ designed to remedy this inefficiency will forever be ugly. We had psyco, PyPy, pyrex, Cython, Unladen Swallow, CPython extension modules, Boost.Python, and who knows what else. Python is not designed to be efficient; it's designed for productivity and for extensibility through a necessarily ugly C FFI. The tradeoff is fundamental. Python is slow forever. Python bindings are ugly forever.)

So if the tradeoff is fundamental, should we give up on efficient resource utilization? No – if the elegant thing is to load the database table into RAM, it doesn't mean that we have enough RAM. Should we give up on programmer productivity? No – inline assembly or lock-free code which isn't obviously bug-free doesn't belong in our cold paths.

We should, however, give up on perfection. Some code will be slower than we want because we don't have time to optimize it, and some code will be uglier than we want because we have no choice but to optimize it.

A hope to defeat a fundamental tradeoff is nothing but a source of frustration, and it's a bliss to have lost such a hope.

42 comments ↓

#1 pgregory on 02.08.13 at 7:45 am

Thanks a lot for the article — I like it quite a lot. But I do not agree with you.

I think you over-extend your hatred of (obviously ugly) C++ to the *ideas* born out of it — namely, generic programming, which is what I think Bjarne was talking about.

First, no one is arguing that the “perfect” hand-written code fine-tuned for hardware can beat anything including C — and that sometimes such code is needed.

But is FAST auto-generated source code essential to making it fast? No! Can this be done by compiler, at compile-time, given concise description of the decision tree? Yes. I am not talking about C++ here — I am talking about an approach, which does not compromise speed or elegance.

Mathematical numbers and computer ones are *different* domains, it should be no surprise that mathematics does translate symbol-by-symbol into efficient numerical algorithms. But within each domain we can have both efficiency and elegance.

While we should obviously give up on “immediate and ultimate perfection”, we should clearly seek that perfection — raising both the efficiency and elegance of programs.

There is no fundamental tradeoff between efficiency and elegance, given that you do not try to equate time-agnostic algorithms and reasoning (mathematics) with CS, but instead carefully choose domains and (zero-cost) abstractions.

#2 Yossi Kreinin on 02.08.13 at 8:42 am

There is no way to have efficiency or inefficiency in mathematics in the sense that there are no resources to conserve.

As to compiling decision trees: you can – that's what FAST does, the code is not done by hand. You think you can usefully add this as a "zero-cost abstraction" – as a language feature or a library and generalize it. I propose to learn from C++'s experience which hints that it will take decades, have very limited utility, and require a lot of arcane knowledge to put to any use as a library writer (see std::max, std::accumulate, and the lack of much else in the corresponding header file, as well as the story of concepts which is not over yet.)

More to the point, try to extend your approach to fixed point arithmetic or generic object classifier cascades. There very much is a tradeoff between efficiency and elegance, you very much will need to get very dirty because of it in some cases, and that's the point I was making, C++ being a tangent.

#3 DavidM on 02.08.13 at 9:42 am

Another take on this is: don't be a scientist, be an engineer.

To an engineer there aren't good tools and bad tools, there are just tools. When doing a job they find the tool which meets the needs of the task at hand.

Lets say you are building a shed.

You show a piece of wood to a scientist and ask if the wood is good enough. He says no, and immediately tells how structural integrity can be increased, weather proofing, better screws, etc.

When you ask the engineer if the wood is good enough, he just asks "good enough for what?". He looks at the problem from the perspective of the task, not the tools.

#4 Nathan on 02.08.13 at 11:02 am

I don't agree with your conclusions on Python's efficiency, but I noticed that you left ctypes out of your list of C bindings/faster Python things. If you haven't heard of it, look it up. It's as elegant a C binding as you'll get in Python.

#5 pgregory on 02.08.13 at 11:55 am

David,

My point is that engineering is impossible without science, and it is science (and its pursuit of perfection) that enables better engineering tools. Some programmers seem to ignore this, and I find it strange.

It is not easy to come up with strictly better (more elegant and efficient) languages, same as it is hard to come up with better tools and materials. But the fact that it is hard does not mean that there is a fundamental tension between them — it merely means that currently we do not see an easy way for improvement.

If you try to look at the current software engineering from a historical perspecive, it looks gross — people try to build complex things by stitching little pieces together with little overall structure, knowledge, understanding or any scientific backing. It is obvious (to me) that this is ought to change sooner or later (as happened with any other engineering discipline), and part of that change must be better languages — which do address both expressive power and efficiency.

#6 Andrew on 02.08.13 at 1:16 pm

Great article. Stroustrup is wrong and you've argued it well. I like the perfectionist explanation for his missing this obvious fact.

#7 Yossi Kreinin on 02.08.13 at 1:45 pm

@Nathan: thanks; I've used ctypes a bit. As to disagreeing with my conclusions – so do you expect a widely adopted Python implementation to become as fast as what and at what timeframe? As fast as V8 in 5 years?

I've been having these arguments since 2005 and I think you'd have to agree that for the last 8 years I've been technically right, which is not a bad long-term prediction as far as these things go. I think it's still a good long-term prediction.

#8 Yossi Kreinin on 02.08.13 at 1:54 pm

@pgregory: there very much is a fundamental tension involved and I don't see why my argument is wrong – a cascade classifier being a very good example IMO that is rather hard to argue against in isolation though it's possible to question its significance. If computing is looked at as decision-making, how on Earth can there not be a tension between the quality of the decision and the time it takes to make the decision? And if programming is decision-making – again, how can programmer time not be inversely proportionate to any desirable quality of the program, provided that such a quality requires any mental effort and the programmer's mental resources are limited?

"Science" of any kind can not fully lift limitations imposed by scarcity. I'm not saying that there can not exist something strictly improving on something else both in efficiency and elegance but I do say that you are often force to trade one for the other and I think my examples are rather good at illustrating how inescapable this is. Also – the search for better languages rather obviously costs in elegance and programmer time! Think about the cost of not having std::string for a couple of decades of C++'s existence because they waited until a perfected enough class was available (of course they then picked something rather horrible in many ways, just not the way that counted for them). Expending efforts in search of a better thing proves my point – that search costs an effort and you trade that effort for the other desirable qualities; and then sometimes – often – your search then comes up empty.

#9 Max Lybbert on 02.08.13 at 4:10 pm

Obviously, the interviewer assumed that writing code in C++ takes longer than in Python, Ruby, Java or some other preferred language; and he figured C++ programmers only accepted the slower development time in order to get efficient code. To the extent that Stroustrup's answer dodges the question, you are right.

But I think Stroustrup's answer was really meant to suggest that he doesn't care so much about development speed, and he's not a slave to efficiency either; his real concern is in what the final code looks like.

People complain that C++ is complex because of its large feature list and assume that programmers must know intimate details of those features to write efficient code; and, of course, can't imagine any reason to program in C++ other than to write efficient code. But many of those features — such as operator overloading, exceptions, or generic programming — are orthogonal to efficiency. They exist to make it easier to express ideas elegantly. Using Stroustrup's measure, I find code that uses a C++ BigInt library that overloads operators (such as https://mattmccutchen.net/bigint/ ) more elegant than code using Java's BigInteger library ( http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html ).

To be fair, many of C++'s features — such as move semantics — are included for performance gains, and some of the features that support elegant code — such as generic programming — provide for optimization possibilities and can therefore generate more efficient binaries than the alternatives (qsort vs std::sort, as you pointed out in the blog post).

Additionally, many of the features that support elegant code — such as operator overloading and generic programming — also support novel ways to write obfuscated code. Many language designers have therefore refused to put those features in their own languages. Stroustrup and the standards committee put the features in, and then try to teach people when they should be used.

Stroustrup even recognizes that sometimes elegant solutions aren't efficient ( http://www.stroustrup.com/gdr-bs-macis09.pdf , page 15 discussing one optimization: "[M]any forms of analysis … basically boil down to 'traverse the program representation doing a lot of comparisons to decide which nodes need attention'. With node sharing, those comparisons are simple pointer comparisons. Without node sharing they are recursive double dispatch operations. This difference in run-time cost is greater than a factor of 100. … The time and space gained sharing nodes should be weighted against the overhead of building and using hash tables to be able to find existing nodes when you make a new node.").

#10 Yossi Kreinin on 02.09.13 at 1:00 am

I shouldn't have mentioned C++ and Stroustrup's quote at all, I guess, or at least I should have mentioned it more briefly, because C++ really wasn't my point.

C++ and elegance – operator overloading, exceptions, etc.: all of this is available elsewhere, but nowhere else is an attempt made to actually use this stuff in an unsafe language with manual memory management. This is exactly what I'm talking about: the attempt to not make the tradeoff – to have both nicely-looking code and very efficient code, the former through the various language features you mentioned and the latter through manual everything and automatic nothing – is what basically backfires eventually, because your readable code only looks readable until you try to find that dangling reference or figure out why there are boatloads of copying operations after all despite your best attempt to overload the Big Three or the Big Five or whatever.

If Stroustrup added all those features on top of a safe core then you could argue that he's more worried about elegance than efficiency. As it is, his remark that he doesn't want the tradeoff and my comments are closer to what actually happens, I think.

BTW I'm not sure std::sort vs qsort is correct with "good" link-time optimization for a realistically large value of "good". I think the insane amount of frontend work spent on supporting all the template craziness would be better spent to implement better LTO earlier.

#11 GD on 02.09.13 at 1:11 am

Indeed As you know, I agree with on the main point 100% and saying that there can exist a language that will be as elegant/productive as python and as efficient as 'C' is a pipe dream. What I wanted to add, is that indeed 'elegance' is kind of subjective. I find/enjoy low level optimization tricks elegant, if written properly, while other will find highly abstract generalizations more to their taste. Another thing is that the 'ugly' stuff, is what makes a problem challenging in the first place, and gives us programmers employment. To make a generalization about the difference between Computer science and programming, you can say that the "scientists" are mostly interested in the O(n) notation, while the engineers (programmers) should be interested also a lot in the constant multiple hidden by that notation ….

#12 Yossi Kreinin on 02.09.13 at 1:18 am

I think that "short" is more elegant than "long" and in that sense, since a whole lot of otherwise "neat" optimizations undeniably make the code longer – sometimes much longer – they also make it "less elegant" that way. It's possible to enjoy almost everything and find beauty in almost everything to the extent that it appears meaningful and in this sense a longer, faster program may appear beautiful; but the shorter program is IMO "objectively better" modulo efficiency and it is only efficiency which makes us tolerate the longer version – if you have an infinitely fast machine, we would all prefer the shorter version and this is what I meant by there being a tradeoff. Of course those of us who routinely face resource constraints may change their aesthetics so that the shorter version no longer appeals to them, but this is a lot like a customer support person becoming genuinely nice to customers over time: it hides but does not change the forces that shaped the habit in the first place.

#13 pgregory on 02.09.13 at 3:04 am

Yossi — thanks a lot for the discussion so far. I'll try to make my point of view a little bit more clear.

Programs = Algorithms + Data structures ©. What I am arguing is that it *is* possible to have a language which enables you to describe both algorithms and data structures in such a way that description is concise and generic (pseudocode-like!), and that the runtime will be comparable to that of lower-level hand coded version.

I am *not* speaking about whether it is easy or not to come up with appropriate algorithms. What I am saying is once you have them, good language will *not* force you to tradeoff expressiveness for efficiency or other way around.

#14 gsg on 02.09.13 at 8:31 am

That's true if the "hand-coded" version only differs from the generic version in simple mechanical ways that a compiler can handle, such as parametricity over a type. If your hand-coded version is faster because you can make some useful assumption that happens to be justified in your program but does not hold generally, it isn't.

As a simple example, you can write a binary search that's faster than std::lower_bound by assuming the buffer is power-of-two sized, and exploiting the regularity in the resulting search tree.

If you want optimisations of that nature, no compiler will provide them. You are obliged to explain them to the machine, elegance be damned – which is exactly what Yossi is getting at.

#15 nksingh on 02.09.13 at 12:26 pm

I agree with the fundamental thesis that for a given class of programming (low-level, high-level, script), there's a tradeoff between elegance and speed (and features and compatibility…).

But I think it's not a zero-sum tradeoff over time. Just like constant factors matter in your image detection cases and new algorithms and implementations might allow you to produce a higher quality result given your fixed time and CPU budget, breakthroughs in Programming Language and Runtime design can make a constant-factor difference in productivity and efficiency.

If you can hide all of the nasty memory management, data access ordering, and expression optimization bits behind a good compiler and runtime, the guy writing the image processing algorithm can stay on the more elegant side of the spectrum (but maybe has to follow more rules to get efficient code). Do you count the inelegance of the runtime against the final application, assuming the runtime is shared among many applications?

Unfortunately we're stuck with the crappy local-minimum of C++, which lacks both ease of writing efficient low-level code (due to the difficulty of reasoning about the code executed by each statement) and optimizability (due to the lack of high level invariants that the compiler can take advantage of).

#16 Nelson on 02.09.13 at 12:46 pm

Low level code with overloading and generics that is easier to understand than C++… sounds like Ada. It takes a bit longer to set up due to strong typing, but increases (programmer) efficiency over time by refusing to compile likely wrong code. Also, it's just plain easier for humans to read.

#17 Zhou Feng on 02.09.13 at 2:37 pm

That is why user should program the soft in power language with easy feature and soft safe result. Visual Basic is can do same as C++ soft without strange outs. The memory corrupt issue of C++ is known well to profession developer so he chose VB to make productive result and happy coding for boss.

#18 Loup Vaillant on 02.09.13 at 2:59 pm

"Efficiency" and "elegance" are fuel for a flamewar. I think the point of this post can be neatly summarised thus:

We can try to write a program which solves a particular problem. Or we can write a program that solves the problem, *and* does it with such and such ressources constraints (time, memory, electric power…).

Obviously, the latter is harder to do than the former, because the solution space is smaller. Also, whatever quality we want our program to maximize (such as "elegance") will be lower if your program has to come from the smaller solution space.

Well, tough luck.

On the other hand, don't forget that it's often easier to optimise an initially elegant program than to write efficient code right away. The second path often lead to less efficiency than the first. (I kow a few slow C++ programs.)

Plus, a Sufficienly Smart and Insanely Complex Compiler will sweep much (but not all) inelegance under the rug, giving the user an illusion of having it both ways. See for instance modern garbage collectors, compared to a mere stop & copy one. Much uglier, yet the user won't see a thing.

#19 Yossi Kreinin on 02.09.13 at 10:43 pm

Regarding smart compilers and runtimes: there's only so much that you can expect from these; and as to zero-overhead abstractions – they are lesser abstractions.

As a simple example – an array that knows its length and compares indexes to it is a tad larger and noticeably slower. One way around it is a system trying to elide the checks and the extra space needed to keep the length; this can be a good thing in the sense of improving somewhat on the dumber system but it's not going to take all of the overhead away. Another way around it is not doing checks and not carrying length; a result is interfaces like std::transform which are clumsy (you need to manually allocate stuff before calling it) and unsafe.

Basically abstractions are often better if they have non-zero cost that can't be fully elided by a runtime system.

I'm not saying that a tradeoff means you can't improve on something existing along all dimensions – you may be able to do that if what you're improving indeed leaves room for such improvements. All I'm saying is that at some point you'll have to start making a compromise along some dimension, and here different systems will diverge, each preferring a different tradeoff.

Given that human abilities and time are rather limited, setting limited design goals from the outset is not a bad idea.

#20 Michael Moser on 02.10.13 at 12:01 am

qsort has something to do with language design: std::sort inlines the comparsion function whereas sort has to call via function pointer.

I don't get the argument, as I don't get the 'law of leaky abstractions' : abstractions are built by omission of details; sometimes the right details for your purpose are omitted and you win; sometimes the wrong details are omitted and one looses; all that doesn't mean that abstractions are bad ;-)

#21 Max Lybbert on 02.10.13 at 1:02 am

> a result is interfaces like std::transform which are
> clumsy (you need to manually allocate stuff before
> calling it) and unsafe.

I don't know where you got that impression. Taking inspiration from the Perl program Yegge mentions at https://sites.google.com/site/steveyegge2/lisp-wins (and using C++11):

int foo[] = { 1, 2, 3, 4, 5 };
std::vector bar;
std::transform(std::begin(foo), std::end(foo), std::back_inserter(bar), [] (int i) { return i * i; });
std::copy(bar.begin(), bar.end(), std::ostream_iterator(std::cout, " "));

Sure, you can reserve space in bar to avoid piecemeal allocations (bar.reserve(5)), but that's hardly a damning requirement.

> If Stroustrup added all those features on top of a
> safe core then you could argue that he's more
> worried about elegance than efficiency.

You may not be aware that in the '70s, Stroustrup worked on a computer simulation at Cambridge using what at the time was considered a supercomputer. He wrote his program in Simula, and the overhead of garbage collection almost made the whole thing fall apart. When he stared work at Bell Labs, his goal was to create something almost as elegant as Simula, but that wouldn't bring average hardware to its knees.

Clearly computers have changed a lot in forty years; and things are less computationally expensive today. And I'll be the first in line to complain about the silly quirks in C++, or for that matter any other language I regularly program in (C#, Java, Perl, Javascript, Python). But I will tell you that your central premise is wrong: elegant code does not have to be inefficient. And efficient code does not have to be ugly.

Unfortunately, programmers continue to believe that they *must* use C++'s low level features if they want incredible performance. Yes, if you malloc up 100,000 objects, you are probably not going to remember to free each of them. But you shouldn't malloc 100,000 objects. You should use some kind of strategy that makes memory manageable; which can be RAII, memory pools a la APR ( http://apr.apache.org/docs/apr/1.4/group__apr__pools.html ), garbage collection, or something else. The fact that some programmers actually do try to manage 100,000 lifetimes shouldn't be taken as proof of anything other than, perhaps, a need for more widespread knowledge that there are better ways to do it.

#22 Michael Moser on 02.10.13 at 1:56 am

If I understand things correctly then C++ has one winning point: it says that there must be a more elegant way to build compile time abstractions than abusing the c pre-processor.

I don't know if templates are really a better way, but when they work Bjarne is right, and when I have to fight template compiler errors then Bjarne must have been wrong ;-)

#23 Yossi Kreinin on 02.10.13 at 4:16 am

back_inserter… no comments.

"But I will tell you that your central premise is wrong: elegant code does not have to be inefficient. And efficient code does not have to be ugly."

I claim that elegant code often has to be less efficient and efficient code has to be more ugly – that there's a tradeoff. It's like I said that you can buys euros but that'd cost you dollars and you said you disagree that a man can't own both dollars and euros.

It'd be more interesting to discuss my more algorithmic examples before delving into the flame war about C++ which I'm sorry I've mentioned by now, since everybody seems to ignore the algorithmic examples.

What about C++'s dangling references and overflowing buffers? Isn't that at the very least a cost in programmer time and correctness if not elegance, and isn't this cost paid for efficiency?

#24 Ivan Tikhonov on 02.10.13 at 6:20 am

Fixed-point is not that much awful. All-in-all it's just an integer.

And you can made it much more friendly, if you choose your units right. If you want units of length and choose fixed-point meters, may be just use mm and get rid of point at all?

With yummy 64-bit integers you can represent 2 light years with mm precision.

#25 Ivan Tikhonov on 02.10.13 at 6:27 am

Or 460 worth of Earth equators up to nanometer.

#26 Yossi Kreinin on 02.10.13 at 8:22 am

Physical units in fixed point are not that bad. What's horrible is algebra in fixed point – such as the determinant of a matrix. This stuff doesn't really have units and it's very easy to compute utter garbage with fixed point algebra.

#27 Max Lybbert on 02.10.13 at 8:57 pm

I realize by now that I'm not going to convince you. Although your most recent reply mentioning algorithmic elegance suggests that we may not be as far apart as I originally believed.

I think one problem is that we might be using "elegance" for different concepts. In programming, at least, I don't think of elegance as a synonym for expensive or for additional layers of complexity. Quite the contrary: I define elegant as straightforward code that does what it should with minimal fuss.

By that measure, I consider much of Hakmem ( http://en.wikipedia.org/wiki/HAKMEM ) elegant, even though (or, perhaps, especially because) the code is also concerned with efficiency. Likewise, I consider Dijkstra's smoothsort ( http://www.keithschwarz.com/smoothsort/ ) an elegant response to an inefficiency in heap sort. But I can't figure out why use of a big number library would necessarily be elegant under that definition (the example given in the original post).

Your most recent post states your position is only "that elegant code often has to be less efficient and efficient code has to be more ugly – that there's a tradeoff." I agree that efficiency and ugliness are weakly correlated, but I've read enough of Dijkstra's writings to believe that, generally speaking, efficient code is ugly because the programmer in question made the wrong tradeoffs.

Personally, I write code as straightforward as I need to. Since I don't work in the kernel, that is nearly always fast enough. If I need to improve on efficiency, I don't do it by uglifying the code to shave off a microsecond here or there; I do it by finding a better algorithm. There are many efficient algorithms that can be coded elegantly in languages that aren't designed to get in your way or require significant ceremony. This approach isn't as common as I would like, and it often requires a little more upfront design than the alternatives, but it's the best way I know to avoid being embarrassed by code six months after writing it.

#28 Yossi Kreinin on 02.10.13 at 10:22 pm

"I define elegant as straightforward code that does what it should with minimal fuss."

Efficiency tends to add a lot of fuss; classifier cascades being my most "fussy" example perhaps. It is more mathematically elegant, less buggy and quicker to write when you just apply one good classifier to every coordinate, and it is a better program in every way except for being prohibitively expensive.

As to the big number library or a symbolic math library (to be able to precisely represent pi or the square root of 2, at least until they cancel out in case they do) – it's more elegant in the sense of being able to write straightforward code relying on nothing but algebraic truths. I think it's largely the same thing.

"I've read enough of Dijkstra's writings to believe that, generally speaking, efficient code is ugly because the programmer in question made the wrong tradeoffs."

Dijkstra – a perfectionist technologist if there ever was one! The same Dijkstra that was appalled by a presentation about a debugger and asked why debuggers are even desirable and how many bugs we're going to tolerate ("Seven!")

It is sometimes possible to reach something efficient and elegant by meditating for a long, long time. This is why I included "programmer time" in the discussion, as well as correctness – other qualities that Stroustrup doesn't think are at odds with efficiency but they obviously are; if you don't have the time for the long meditation needed to come up with the correct, elegant and efficient answer, then the tradeoff is still more obvious.

The difference between our opinions is quantitative, sort of – you think it's "loosely correlated", I think it's very strongly correlated; it's obviously hard to argue about these things, and one's worldview is shaped by what one is working on, what examples such a discussion brings to one's mind, and what one's standards with respect to "acceptable efficiency", "acceptable elegance", etc. are. So I certainly don't expect to convince anyone unless they happen to be on the verge of being convinced when they stumble upon my writing…

#29 Ivan Tikhonov on 02.11.13 at 4:46 am

"Physical units in fixed point are not that bad. What's horrible is algebra in fixed point – such as the determinant of a matrix."

Well, perhaps, determinant is not that good as example, as it has units – original units^n. All in all, its absolute value even have a physical meaning.

But i got your idea, joys of math for the sake of math. :)

May be thing in that you are not solving problem, you rather convert it from one system into another.

Perhaps, if determinant was originally stated in fixed point it will be quite a hassle to convert it into pure symbolic domain.

#30 Yossi Kreinin on 02.11.13 at 5:54 am

Well, it's worse than ^N because the matrix itself can be, say, moments, and then it's really hairy units. You can of course track the units through any depth of computations, I'm just saying it's becoming increasingly less useful; scaling meters by 100 is nice, determinants of matrices with moments involving meters are less nice, especially considering their rather dynamic range and your reduced ability to say, "I don't care about small precision issues" as you can when it's a meter and you say "I don't care about precision below 1 cm".

I've worked on production code in fixed point for a couple of years – very far from math for the sake of math; I know how it can get ugly.

#31 Nathan on 02.13.13 at 4:32 pm

Wow, I was about to close this tab and reread what I had wrote, and noticed that I had made a typographical error. I meant to say that I don't disagree with you on Python's efficiency. So yes, you have been right and will likely continue to be right.

#32 Alex Conley on 02.13.13 at 6:25 pm

A nice article, but I think you are significantly misinterpreting the point of numerical algorithms. Most problems worth doing flat out don't have known analytical solutions, or if they do they are written in terms of things that can't be evaluated for most arguments anything but numerically (say, as confluent hypergeometric functions). So they just can't be done analytically. If you try to do almost anything non-trivial in Mathematica, Maple, etc. you end up doing it numerically anyways — and the problem then is that they have a tendency to chose non-optimal numerical methods. Arbitrary precision arithmetic is cute, but, speaking as a scientific programmer, if somebody found a way to make it just as fast as floating point tomorrow, it would have fairly little effect on most of the stuff I do.

#33 Yossi Kreinin on 02.13.13 at 9:56 pm

Well, I might have a wrong bias from your point of view because I'm seeing a lot of stuff "worth doing" that is as simple as solving a linear equation system which very much can be done analytically but is not always trivial numerically.

Anyway – I'm not saying floating point is the source of all headeaches, just of one headache…

#34 Eldc on 02.14.13 at 1:12 pm

Great article, thank you for writing it.

#35 Craig on 04.24.13 at 12:33 am

So what about LuaJIT with it's clean C API and FFI? Not only is it much faster than any other dynamic language, it's also nearly as fast as Java and not far that away from C.

#36 Yossi Kreinin on 04.24.13 at 8:05 am

Well, I'm not saying that X can't be more efficient and elegant than Y at the same time, just that at some point, you start needing to make a trade-off.

With me, C is a scripting language; if you want to really max out what a given chip manufacturing technology gives you, you need custom processors with custom programming interfaces making even OpenCL look pretty. But even with C the scripting language and a teeny bit of intrinsics, I sort of think I have very good chances against LuaJIT where 8-bit/16-bit data types are involved, etc.

#37 mk on 05.14.13 at 10:56 pm

" The efficiency vs. correctness, efficiency vs. programmer time, efficiency vs. high level, etc. dichotomies are largely bogus."

Such a claim could be considered plausible in a world in which C++ doesn't exist … but C++ quite falsifies it.

#38 Sanjoy Das on 07.17.13 at 4:21 am

I think there is a basic economic reason for the trade-offs seen
between different desirable qualities. Solutions that are both less elegant and less efficient evolve out; strictly worse choices are already been eliminated the set of alternatives.

It is for the same reason why computer memory is either cheap or fast by the byte, or why we usually have to trade off space for time when designing algorithms.

#39 Craig on 08.29.13 at 10:36 am

"But even with C the scripting language and a teeny bit of intrinsics, I sort of think I have very good chances against LuaJIT where 8-bit/16-bit data types are involved, etc."

The LuaJIT FFI gives you full access to C data types and under some circumstances will do optimizations for you that would require tedious manual effort in C.

#40 Aristotle Pagaltzis on 08.29.13 at 5:41 pm

Here’s someone trying to prove you wrong, Yossi: On the Cleverness of Compilers.

#41 Yossi Kreinin on 08.31.13 at 3:11 am

You know, it's amazing just how ugly Lisp can be, what with all the cars and cdrs.

As to proving me wrong… He says that efficiency comes from specialization; the only question is how much of this specialization can happen automatically, and I think we can all agree that some of the special circumstances that make optimizations possible are only known to the programmer and if he doesn't describe them to the compiler, the optimization won't happen. The only question is how significant the effect of this fact is, and how high-level and "nice" these description can be.

#42 Aristotle Pagaltzis on 09.01.13 at 1:59 pm

Yes. The reason I say Alexey is trying to prove you wrong is that while the specificity vs generality dichotomy is tautologically identical to (one axis of) efficiency, efficiency vs elegance isn’t. Your thesis is that the dichotomies are still equivalent, which history so far bears out. Alexey’s work is an attempt to show otherwise – will he succeed, who knows… so I’m mentioning this as a ray of hope more than as an argument. Wouldn’t it be nice if elegance and efficiency proved to be compatible… and a pony.

Leave a Comment