Efficiency is fundamentally at odds with elegance
Q: In retrospect, wasn't the decision to trade off programmer efficiency, security, and software reliability in exchange
for runtime performance a fundamental mistake?
A: Well, I don’t think I made such a tradeoff. I want elegant and efficient code. Sometimes I get it. The efficiency vs.
correctness, efficiency vs. programmer time, efficiency vs. high level, etc. dichotomies are largely bogus.
— An interview with Bjarne Stroustrup
Unlimited-precision symbolic computation is more elegant than floating point numbers. You simply never have any numerical
stability problems. Anything algebraically correct – a valid way to solve for x given the equations involving x – is also
computationally correct. You don't need to know all the quirks of floating point, and you won't need "numerical recipes" which are basically ways to deal with these quirks.
Symbolic computation is rather widely available – say, in Mathematica, Matlab/Maple, etc. – but it's not used nearly as much
as floating point. That's because floating point is much more efficient, and a whole lot of things can not be done in a
reasonable amount of time and space with symbolic computation.
It is undeniable that this efficiency comes at a cost of correctness (in terms of increased likelihood of bugs), programmer
time, and elegance. There are plenty of algebraically elegant solutions which just don't work in floating point. If you don't
notice, you have a bug; if you do notice, you spend your (programmer) time looking for an alternative, and said alternative may
be less elegant.
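To make this concrete, here's a small C++ illustration (mine, with arbitrary numbers picked to trigger the problem) of an algebraically valid formula that falls apart in floating point – the classic quadratic-formula cancellation:

#include <cmath>
#include <cstdio>

int main() {
    double a = 1.0, b = 1e8, c = 1.0;        // roots are roughly -1e8 and -1e-8
    double d = std::sqrt(b * b - 4 * a * c);
    double naive = (-b + d) / (2 * a);       // the textbook formula: catastrophic cancellation
    double stable = (2 * c) / (-b - d);      // algebraically identical, numerically fine
    std::printf("naive: %.17g  stable: %.17g\n", naive, stable);
    return 0;
}

The two expressions are equal on paper; only one of them is usable in floating point, and knowing which one is exactly the sort of "numerical recipe" knowledge that symbolic computation spares you.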
Floating point is not the lowest level we can sink to. In many cases the quest for still more efficiency brings us to the
dark quagmires of fixed point arithmetic. That's when you have an integer and you implicitly assume it's divided by 2^N – the
exponent is statically known so the point isn't floating at run time, so to say. So to implement a+b you just use integer
addition, and a*b is an integer multiplication followed by a right shift by N. Or there are midway scenarios where you have many
integers with a common, dynamically computed exponent because they all have roughly the same range (FFT is one case where this is often done.)
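For illustration, here's what the arithmetic boils down to in a made-up 16.16 fixed-point format (the format choice and the names are mine, not anything standard):

#include <cstdint>

typedef int32_t fix16;                        // value = raw / 2^16, i.e. "16.16" fixed point
static const int FRAC_BITS = 16;

fix16 fix_from_double(double x) { return (fix16)(x * (1 << FRAC_BITS)); }
fix16 fix_add(fix16 a, fix16 b) { return a + b; }   // plain integer addition; overflows silently
fix16 fix_mul(fix16 a, fix16 b) {
    // widen, multiply, then shift the binary point back by N bits
    return (fix16)(((int64_t)a * b) >> FRAC_BITS);
}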
Fixed point is so ugly that there's not even a recipes book that I'm aware of; it just doesn't come out tasty in the
slightest. The biggest trouble isn't even the very likely overflow but the loss of precision: floating point guarantees a
certain number of significant bits in the mantissa, while fixed point doesn't – unless you explicitly normalize the number at
some point using CLZ (count leading zeros) or similar (making it
more of a floating point emulation than "true" fixed point where exponents aren't represented at runtime). You think you have a
32-bit mantissa and an implicit exponent so the number is rather precise – but those 32 bits can have most of the high bits set
to zero and then it's not precise at all.
However, a mixture of 8-bit, 16-bit and 32-bit fixed point operations often beats the performance of floating point by a
large margin; especially if you have SIMD instructions, because you can always fit 4x more 8-bit numbers into a register than
32-bit numbers, and a single-precision floating point multiplier is ~10x more costly in hardware than an 8-bit integer
multiplier, so you have fewer of them.
Again, this efficiency comes at a cost of correctness (in terms of increased likelihood of bugs), programmer time, and
elegance.
In computer vision, you're often looking for objects of a certain class, and you have a classifier taking a rectangular image
region and telling whether this region contains an object of that class. A simple and elegant object detection algorithm is to
apply this classifier to every possible rectangle in the image, and then remove rectangles which mostly overlap (as in, if there
are 15 similar rectangles saying there's a face in roughly the same place, make one rectangle out of them all).
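In pseudocode-ish C++, the elegant version is something like this (classify and merge_overlaps are hypothetical stand-ins for the classifier and the rectangle merging):

#include <vector>

struct Rect { int x, y, w, h; };
bool classify(const Rect& r);                                     // hypothetical classifier
std::vector<Rect> merge_overlaps(const std::vector<Rect>& hits);  // hypothetical merging step

std::vector<Rect> detect(int W, int H) {
    std::vector<Rect> hits;
    for (int w = 8; w <= W; ++w)              // every size...
        for (int h = 8; h <= H; ++h)
            for (int x = 0; x + w <= W; ++x)  // ...at every position
                for (int y = 0; y + h <= H; ++y)
                    if (classify(Rect{x, y, w, h}))
                        hits.push_back(Rect{x, y, w, h});
    return merge_overlaps(hits);
}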
This elegant algorithm is never used, because there are too many possible rectangles (every coordinate times every size). A
common optimization is to use a cascade of classifiers. That is, apply a very cheap classifier with a lot of false positives but
hopefully almost no false negatives to every region. The purpose is to throw away most of the rectangles so that the remaining
smaller set still contains all the true positives – and a lot of false positives, of course, but much less.
This is repeated with many (possibly increasingly expensive) classifiers processing increasingly smaller sets of rectangles.
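Sketched in the same made-up terms as above (the stages and their ordering are illustrative, not any particular detector):

#include <cstddef>
#include <functional>
#include <vector>

struct Rect { int x, y, w, h; };              // same hypothetical Rect as above

// stages are ordered from cheapest to most expensive; each stage only sees
// what survived the previous one.
std::vector<Rect> run_cascade(std::vector<Rect> candidates,
                              const std::vector<std::function<bool(const Rect&)> >& stages) {
    for (std::size_t s = 0; s < stages.size(); ++s) {
        std::vector<Rect> survivors;
        for (std::size_t i = 0; i < candidates.size(); ++i)
            if (stages[s](candidates[i]))     // "maybe an object" – keep it for the next stage
                survivors.push_back(candidates[i]);
        candidates.swap(survivors);           // most rectangles are gone after the cheap stages
    }
    return candidates;                        // what's left still needs overlap merging
}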
The most widely deployed classifier cascade is probably the Viola-Jones face detector, currently
available in most digital cameras, displaying little squares around faces. As you may have noticed, it often misses a face,
which is to be expected with all the little classifiers hurrying to throw rectangles away. And which is OK for a consumer
application where a success rate of 90-95% is perfectly fine and an extra 1% of detection rate is not worth a $0.01 increase in
price. The point is that the error rate is undeniably increased by stricter efficiency requirements.
The upshot is that object detection provides a broad family of examples where, again, runtime efficiency comes at a cost of
correctness (in terms of increased likelihood of bugs – there's much more code to write – as well as the ultimate detection
rate), programmer time, and elegance.
(Even the smallish sub-problem of merging overlapping rectangles provides an example where efficiency has to be bought with
all those other things including elegance. A short, readable, elegant solution could use an O(N^2) nested loop where each
rectangle is intersected with every other rectangle. One optimization is some sort of spatial data structure where you don't
look at rectangles if they don't fall into the same bin of the spatial subdivision because then they can't intersect. That's
faster, more buggy and less readable.)
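For the record, the elegant O(N^2) version really is short – this sketch, with a hypothetical Rect type, is essentially all of it; the spatially-binned version is what stops fitting on a screen:

#include <cstddef>
#include <vector>

struct Rect { int x, y, w, h; };

bool overlaps(const Rect& a, const Rect& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

// for each rectangle, collect the indexes of the rectangles overlapping it
std::vector<std::vector<int> > overlap_groups(const std::vector<Rect>& rs) {
    std::vector<std::vector<int> > groups(rs.size());
    for (std::size_t i = 0; i < rs.size(); ++i)
        for (std::size_t j = i + 1; j < rs.size(); ++j)   // every pair: short, clear, O(N^2)
            if (overlaps(rs[i], rs[j])) {
                groups[i].push_back((int)j);
                groups[j].push_back((int)i);
            }
    return groups;
}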
Does this have anything to do with the quote by Stroustrup though? His implied point was how the std::sort template is more
elegant as well as more efficient than C's qsort function fiddling with void pointers, right? Or ostream vs printf? Whereas
these are all examples of "algorithmic efficiency" – is that even related to language design?
Well, the thing is, "algorithms" and "languages/code" is a continuum:
Think of all the psychic energy expended in seeking a fundamental distinction between "algorithm" and "program". — Alan
Perlis
Given that it's a continuum, it is doubtful that a statement which is profoundly wrong in an "algorithmic" context could be
true in a "programming" context. If the tradeoff between runtime efficiency and programmer efficiency/"elegance" is fundamental
from an algorithmic point of view, then it's likely fundamental in computing in general.
For a concrete example of how blurry the line between "algorithmic efficiency" and "code efficiency" is, let's discuss corner
detection. The FAST corner detector is a decision tree looking at
pixels surrounding the central pixel and comparing the image intensity of the center to its surroundings. Similarly to other
classifier cascades, "not a corner" is a quick decision, while "yes, a corner" is decided after all the checks are done.
The decision tree is implemented in several thousand lines of auto-generated C code full of gotos. (That's one addition to the
recent discussion about the utility of gotos in systems programming; add
computer vision to the list of goto applications, I guess.)
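To give a feel for the style – this is not the actual generated code, just a hand-written fragment in its spirit, with made-up pixel offsets and structure:

int is_corner(const unsigned char* p, int stride, int t) {
    int c = p[0];                                   // the central pixel
    if (p[-3 * stride] > c + t) goto brighter_top;  // offsets here are made up
    if (p[-3 * stride] < c - t) goto darker_top;
    return 0;                                       // the quick "not a corner" exit
brighter_top:
    if (p[3] > c + t) goto maybe_corner;
    return 0;
darker_top:
    if (p[3] < c - t) goto maybe_corner;
    return 0;
maybe_corner:
    /* ...the real generated detector goes on like this for thousands of lines... */
    return 1;
}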
Is it possible to implement the decision tree in a more elegant and readable way? Of course – but at the cost of efficiency;
not asymptotic efficiency since it'd be the same decision tree, but efficiency nonetheless.
Is this goto business an "algorithmic" optimization or a "program" optimization? Consider that FAST's entire raison d'etre is
being faster than, say, the Harris
corner detector. Constants matter for high-resolution images processed in real time.
Consider furthermore that both FAST and Harris are O(#pixels) since they look at a finite, small number of pixels around each
coordinate and execute a finite, small number of operations. Consider that which is more efficient depends on the platform –
SIMD helps speed up Harris but not FAST, and different SIMD instruction sets speed it up by very different factors. (This is
also true for linear classifiers vs Viola-Jones and for other
cases.) And consider the fact that algorithmically, they're wildly different – Harris looks at eigenvectors whereas FAST is an
intensity-based decision tree, they have tunable decision thresholds with different meaning, and different sets of false
positives and false negatives.
So is FAST a work in the area of "algorithms" or "programming", and is the auto-generated mountain of code essential to make
it efficient an "algorithm" or a "program"? My answer is that it's both, in the sense that you can't really draw the line.
But what about std::sort and C++'s combination of efficiency and elegance? Well, C++ rather obviously does pay with
programmer efficiency for runtime efficiency, without an option to opt out of the deal. Every allocation can leak, every
reference can be dangling, every buffer can overflow, etc. etc. etc.
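For instance, something like this typically compiles without complaint and is a genuine bug – a minimal made-up example of the kind of thing a safe language simply doesn't let you write:

#include <vector>

int& first(std::vector<int>& v) { return v[0]; }

int use() {
    std::vector<int> v(3, 7);
    int& r = first(v);       // fine so far
    v.push_back(7);          // may reallocate: r now dangles
    return r;                // undefined behavior, and nobody complains
}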
This blindingly obvious fact doesn't surprise those who realize the fundamental tradeoff between efficiency and a whole lot
of other things, some of which can be collectively called "elegance". Whereas those refusing to believe in such a tradeoff
manage to not even notice the consequences. For example:
The relatively small size of the C++ standard library – primarily reflecting the lack of resources in the ISO C++ standards
committee – compared with the huge corporate libraries can be a real disadvantage for C++ developers compared to users of
proprietary languages.
...So why do languages without corporate backing which are 2 to 3 times younger than C++, such as Perl, Python, and Ruby,
have so many more libraries, both standard and non-standard but widely used?
The best uses of C++ involve deliberate design. You design classes to represent the notions of your application, you organize
those classes into hierarchies, you express your algorithms precisely and abstractly (no, that “and” is not a mistake), you use
libraries, you build libraries, you devise error handling and resource management strategies and express them in code. The
result can be beautiful, efficient, maintainable, etc. However, it’s not just sitting down and writing a series of statements
directly manipulating characters, integers, and floating point numbers.
The thing is that actually doing something useful involves a whole lot of "direct manipulations" of characters, integers and
floating point numbers – and strings, arrays, hash tables, files, sockets, windows, matrices, etc. Languages which let you "just
sit down and write the series of statements" give programmers the extra productivity which results in all those extra libraries
getting written.
However, equally undeniably it does cost you runtime efficiency, because you pay an overhead for built-in resource management
strategies such as garbage collection, built-in error detection strategies such as bounds checking, and a whole lot of other
things.
It's not surprising that Stroustrup sees the problem in the fact that corporations "with the resources" invest them in what
he thinks is the wrong thing, presumably because of their self-interested profit motives. Alex Stepanov, who designed the STL,
expressed similar sentiments, and so did Alan Kay and every other perfectionist technologist. If you seek perfection to the
point of denying the existence of most obvious tradeoffs – and tradeoffs are a pesky thing for a perfectionist because they
imply that perfection is unattainable – then you're also likely to somewhat resent corporations, markets, etc. For a discussion
of that, see my take on Worse
Is Better vs The Right Thing.
(Of course there are plenty of perfectionists who, instead of rationalizing C++'s productivity problems, spend their time
denying that Python is slow, or keep waiting for Python to become fast. It will not become fast. Also, all its combinations with
C/C++ designed to remedy this inefficiency will forever be ugly. We had psyco, PyPy, pyrex, Cython, Unladen Swallow, CPython
extension modules, Boost.Python, and who knows what else. Python is not designed to be efficient; it's designed for productivity
and for extensibility through a necessarily ugly C FFI. The tradeoff is fundamental. Python is slow forever. Python bindings are
ugly forever.)
So if the tradeoff is fundamental, should we give up on efficient resource utilization? No – if the elegant thing is to load
the database table into RAM, it doesn't mean that we have enough RAM. Should we give up on programmer productivity? No – inline
assembly or lock-free code which isn't obviously bug-free doesn't belong in our cold paths.
We should, however, give up on perfection. Some code will be slower than we want because we don't have time to optimize it,
and some code will be uglier than we want because we have no choice but to optimize it.
A hope to defeat a fundamental tradeoff is nothing but a source of frustration, and it's a bliss to have lost such a
hope.
Thanks a lot for the article — I like it quite a lot. But I do not
agree with you.
I think you over-extend your hatred of (obviously ugly) C++ to the
*ideas* born out of it — namely, generic programming, which is what I
think Bjarne was talking about.
First, no one disputes that “perfect” hand-written code
fine-tuned for the hardware can beat anything, including C — or that
such code is sometimes needed.
But is FAST auto-generated source code essential to making it fast?
No! Can this be done by compiler, at compile-time, given concise
description of the decision tree? Yes. I am not talking about C++ here —
I am talking about an approach, which does not compromise speed or
elegance.
Mathematical numbers and computer ones are *different* domains; it
should be no surprise that mathematics does not translate symbol-by-symbol
into efficient numerical algorithms. But within each domain we can have
both efficiency and elegance.
While we should obviously give up on “immediate and ultimate
perfection”, we should clearly seek that perfection — raising both the
efficiency and elegance of programs.
There is no fundamental tradeoff between efficiency and elegance,
given that you do not try to equate time-agnostic algorithms and
reasoning (mathematics) with CS, but instead carefully choose domains
and (zero-cost) abstractions.
There is no way to have efficiency or inefficiency in mathematics in
the sense that there are no resources to conserve.
As to compiling decision trees: you can – that's what FAST does, the
code is not done by hand. You think you can usefully add this as a
"zero-cost abstraction" – as a language feature or a library and
generalize it. I propose to learn from C++'s experience which hints that
it will take decades, have very limited utility, and require a lot of
arcane knowledge to put to any use as a library writer (see std::max,
std::accumulate, and the lack of much else in the corresponding header
file, as well as the story of concepts which is not over yet.)
More to the point, try to extend your approach to fixed point
arithmetic or generic object classifier cascades. There very much is a
tradeoff between efficiency and elegance, you very much will need to get
very dirty because of it in some cases, and that's the point I was
making, C++ being a tangent.
Another take on this is: don't be a scientist, be an engineer.
To an engineer there aren't good tools and bad tools, there are just
tools. When doing a job they find the tool which meets the needs of the
task at hand.
Let's say you are building a shed.
You show a piece of wood to a scientist and ask if the wood is good
enough. He says no, and immediately tells how structural integrity can
be increased, weather proofing, better screws, etc.
When you ask the engineer if the wood is good enough, he just asks
"good enough for what?". He looks at the problem from the perspective of
the task, not the tools.
I don't agree with your conclusions on Python's efficiency, but I
noticed that you left ctypes out of your list of C bindings/faster
Python things. If you haven't heard of it, look it up. It's as elegant a
C binding as you'll get in Python.
David,
My point is that engineering is impossible without science, and it is
science (and its pursuit of perfection) that enables better engineering
tools. Some programmers seem to ignore this, and I find it strange.
It is not easy to come up with strictly better (more elegant and
efficient) languages, same as it is hard to come up with better tools
and materials. But the fact that it is hard does not mean that there is
a fundamental tension between them — it merely means that currently we
do not see an easy way for improvement.
If you try to look at the current software engineering from a
historical perspective, it looks gross — people try to build complex
things by stitching little pieces together with little overall
structure, knowledge, understanding or any scientific backing. It is
obvious (to me) that this ought to change sooner or later (as
happened with any other engineering discipline), and part of that change
must be better languages — which do address both expressive power and
efficiency.
Great article. Stroustrup is wrong and you've argued it well. I like
the perfectionist explanation for his missing this obvious fact.
@Nathan: thanks; I've used ctypes a bit. As to disagreeing with my
conclusions – so do you expect a widely adopted Python implementation to
become as fast as what and at what timeframe? As fast as V8 in 5
years?
I've been having these arguments since 2005 and I think you'd have to
agree that for the last 8 years I've been technically right, which is
not a bad long-term prediction as far as these things go. I think it's
still a good long-term prediction.
@pgregory: there very much is a fundamental tension involved
and I don't see why my argument is wrong – a cascade classifier being a
very good example IMO that is rather hard to argue against in isolation
though it's possible to question its significance. If computing is
looked at as decision-making, how on Earth can there not be a
tension between the quality of the decision and the time it takes to
make the decision? And if programming is decision-making –
again, how can programmer time not be inversely proportionate
to any desirable quality of the program, provided that such a quality
requires any mental effort and the programmer's mental resources are
limited?
"Science" of any kind can not fully lift limitations imposed by
scarcity. I'm not saying that there can not exist something strictly
improving on something else both in efficiency and elegance but I do say
that you are often forced to trade one for the other, and I think my
examples are rather good at illustrating how inescapable this is. Also –
the search for better languages rather obviously costs in elegance and
programmer time! Think about the cost of not having std::string for a
couple of decades of C++'s existence because they waited until a
perfected enough class was available (of course they then picked
something rather horrible in many ways, just not the way that counted
for them). Expending efforts in search of a better thing proves my point
– that search costs an effort and you trade that effort for the other
desirable qualities; and then sometimes – often – your search then comes
up empty.
Obviously, the interviewer assumed that writing code in C++ takes
longer than in Python, Ruby, Java or some other preferred language; and
he figured C++ programmers only accepted the slower development time in
order to get efficient code. To the extent that Stroustrup's answer
dodges the question, you are right.
But I think Stroustrup's answer was really meant to suggest that he
doesn't care so much about development speed, and he's not a slave to
efficiency either; his real concern is in what the final code looks
like.
People complain that C++ is complex because of its large feature list
and assume that programmers must know intimate details of those features
to write efficient code; and, of course, can't imagine any reason to
program in C++ other than to write efficient code. But many of those
features — such as operator overloading, exceptions, or generic
programming — are orthogonal to efficiency. They exist to make it easier
to express ideas elegantly. Using Stroustrup's measure, I find code that
uses a C++ BigInt library that overloads operators (such as https://mattmccutchen.net/bigint/ ) more elegant than
code using Java's BigInteger library ( http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html
).
To be fair, many of C++'s features — such as move semantics — are
included for performance gains, and some of the features that support
elegant code — such as generic programming — provide for optimization
possibilities and can therefore generate more efficient binaries than
the alternatives (qsort vs std::sort, as you pointed out in the blog
post).
Additionally, many of the features that support elegant code — such
as operator overloading and generic programming — also support novel
ways to write obfuscated code. Many language designers have therefore
refused to put those features in their own languages. Stroustrup and the
standards committee put the features in, and then try to teach people
when they should be used.
Stroustrup even recognizes that sometimes elegant solutions aren't
efficient ( http://www.stroustrup.com/gdr-bs-macis09.pdf , page
15 discussing one optimization: "[M]any forms of analysis ... basically
boil down to 'traverse the program representation doing a lot of
comparisons to decide which nodes need attention'. With node sharing,
those comparisons are simple pointer comparisons. Without node sharing
they are recursive double dispatch operations. This difference in
run-time cost is greater than a factor of 100. ... The time and space
gained sharing nodes should be weighted against the overhead of building
and using hash tables to be able to find existing nodes when you make a
new node.").
I shouldn't have mentioned C++ and Stroustrup's quote at all, I
guess, or at least I should have mentioned it more briefly, because C++
really wasn't my point.
C++ and elegance – operator overloading, exceptions, etc.: all of
this is available elsewhere, but nowhere else is an attempt made to
actually use this stuff in an unsafe language with manual memory
management. This is exactly what I'm talking about: the attempt to
not make the tradeoff – to have both nicely-looking code and
very efficient code, the former through the various language features
you mentioned and the latter through manual everything and automatic
nothing – is what basically backfires eventually, because your readable
code only looks readable until you try to find that dangling reference
or figure out why there are boatloads of copying operations after all
despite your best attempt to overload the Big Three or the Big Five or
whatever.
If Stroustrup added all those features on top of a safe core then you
could argue that he's more worried about elegance than efficiency. As it
is, his remark that he doesn't want the tradeoff and my comments are
closer to what actually happens, I think.
BTW I'm not sure std::sort vs qsort is correct with "good" link-time
optimization for a realistically large value of "good". I think the
insane amount of frontend work spent on supporting all the template
craziness would be better spent to implement better LTO earlier.
Indeed, as you know, I agree with you on the main point 100%, and saying
that there can exist a language that will be as elegant/productive as
python and as efficient as 'C' is a pipe dream. What I wanted to add, is
that indeed 'elegance' is kind of subjective. I find/enjoy low level
optimization tricks elegant, if written properly, while others will find
highly abstract generalizations more to their taste. Another thing is
that the 'ugly' stuff, is what makes a problem challenging in the first
place, and gives us programmers employment. To make a generalization
about the difference between Computer science and programming, you can
say that the "scientists" are mostly interested in the O(n) notation,
while the engineers (programmers) should be interested also a lot in the
constant multiple hidden by that notation ....
I think that "short" is more elegant than "long" and in that sense,
since a whole lot of otherwise "neat" optimizations undeniably make the
code longer – sometimes much longer – they also make it "less elegant"
that way. It's possible to enjoy almost everything and find beauty in
almost everything to the extent that it appears meaningful and in this
sense a longer, faster program may appear beautiful; but the shorter
program is IMO "objectively better" modulo efficiency and it is only
efficiency which makes us tolerate the longer version – if we had an
infinitely fast machine, we would all prefer the shorter version, and
this is what I meant by there being a tradeoff. Of course those of us
who routinely face resource constraints may change their aesthetics so
that the shorter version no longer appeals to them, but this is a lot
like a customer support person becoming genuinely nice to customers over
time: it hides but does not change the forces that shaped the habit in
the first place.
Yossi — thanks a lot for the discussion so far. I'll try to make my
point of view a little bit more clear.
Programs = Algorithms + Data structures ©. What I am arguing is that
it *is* possible to have a language which enables you to describe both
algorithms and data structures in such a way that description is concise
and generic (pseudocode-like!), and that the runtime will be comparable
to that of lower-level hand coded version.
I am *not* speaking about whether it is easy or not to come up with
appropriate algorithms. What I am saying is once you have them, good
language will *not* force you to tradeoff expressiveness for efficiency
or other way around.
That's true if the "hand-coded" version only differs from the generic
version in simple mechanical ways that a compiler can handle, such as
parametricity over a type. If your hand-coded version is faster because
you can make some useful assumption that happens to be justified in your
program but does not hold generally, it isn't.
As a simple example, you can write a binary search that's faster than
std::lower_bound by assuming the buffer is power-of-two sized, and
exploiting the regularity in the resulting search tree.
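A minimal sketch of what I mean, assuming n is a power of two (so every step halves the range exactly, and the loop becomes log2(n) identical, fully unrollable iterations):

#include <cstddef>

std::size_t lower_bound_pow2(const int* a, std::size_t n, int key) {
    const int* base = a;
    for (std::size_t half = n >> 1; half > 0; half >>= 1)
        if (base[half - 1] < key)    // the left half is entirely < key: go right
            base += half;
    return (base - a) + (*base < key ? 1 : 0);
}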
If you want optimisations of that nature, no compiler will provide
them. You are obliged to explain them to the machine, elegance be damned
– which is exactly what Yossi is getting at.
I agree with the fundamental thesis that for a given class of
programming (low-level, high-level, script), there's a tradeoff between
elegance and speed (and features and compatibility...).
But I think it's not a zero-sum tradeoff over time. Just like
constant factors matter in your image detection cases and new algorithms
and implementations might allow you to produce a higher quality result
given your fixed time and CPU budget, breakthroughs in Programming
Language and Runtime design can make a constant-factor difference in
productivity and efficiency.
If you can hide all of the nasty memory management, data access
ordering, and expression optimization bits behind a good compiler and
runtime, the guy writing the image processing algorithm can stay on the
more elegant side of the spectrum (but maybe has to follow more rules to
get efficient code). Do you count the inelegance of the runtime against
the final application, assuming the runtime is shared among many
applications?
Unfortunately we're stuck with the crappy local-minimum of C++, which
lacks both ease of writing efficient low-level code (due to the
difficulty of reasoning about the code executed by each statement) and
optimizability (due to the lack of high level invariants that the
compiler can take advantage of).
Low level code with overloading and generics that is easier to
understand than C++... sounds like Ada. It takes a bit longer to set up
due to strong typing, but increases (programmer) efficiency over time by
refusing to compile likely wrong code. Also, it's just plain easier for
humans to read.
That is why users should write their software in a powerful language
with easy features and safe results. Visual Basic can do the same as C++
software without strange crashes. The memory corruption issues of C++ are
well known to professional developers, so they choose VB to get productive
results and happy coding for the boss.
"Efficiency" and "elegance" are fuel for a flamewar. I think the
point of this post can be neatly summarised thus:
We can try to write a program which solves a particular problem. Or
we can write a program that solves the problem, *and* does it with such
and such resource constraints (time, memory, electric power...).
Obviously, the latter is harder to do than the former, because the
solution space is smaller. Also, whatever quality we want our program to
maximize (such as "elegance") will be lower if your program has to come
from the smaller solution space.
Well, tough luck.
On the other hand, don't forget that it's often easier to optimise an
initially elegant program than to write efficient code right away. The
second path often leads to less efficiency than the first. (I know a few
slow C++ programs.)
Plus, a Sufficiently Smart and Insanely Complex Compiler will sweep
much (but not all) inelegance under the rug, giving the user an illusion
of having it both ways. See for instance modern garbage collectors,
compared to a mere stop & copy one. Much uglier, yet the user won't
see a thing.
Regarding smart compilers and runtimes: there's only so much that you
can expect from these; and as to zero-overhead abstractions – they are
lesser abstractions.
As a simple example – an array that knows its length and compares
indexes to it is a tad larger and noticeably slower. One way around it
is a system trying to elide the checks and the extra space needed to
keep the length; this can be a good thing in the sense of improving
somewhat on the dumber system but it's not going to take all of the
overhead away. Another way around it is not doing checks and not
carrying length; a result is interfaces like std::transform which are
clumsy (you need to manually allocate stuff before calling it) and
unsafe.
Basically abstractions are often better if they have non-zero cost
that can't be fully elided by a runtime system.
I'm not saying that a tradeoff means you can't improve on something
existing along all dimensions – you may be able to do that if what
you're improving indeed leaves room for such improvements. All I'm
saying is that at some point you'll have to start making a compromise
along some dimension, and here different systems will diverge, each
preferring a different tradeoff.
Given that human abilities and time are rather limited, setting
limited design goals from the outset is not a bad idea.
qsort has something to do with language design: std::sort inlines the
comparison function whereas qsort has to call it through a function pointer.
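In code – nothing here beyond the standard APIs, just to show where the indirection sits:

#include <algorithm>
#include <cstddef>
#include <cstdlib>

int cmp_int(const void* pa, const void* pb) {   // qsort reaches this through a pointer
    int a = *(const int*)pa, b = *(const int*)pb;
    return (a > b) - (a < b);
}

void sort_both_ways(int* p, std::size_t n) {
    std::qsort(p, n, sizeof(int), cmp_int);                   // indirect call per comparison
    std::sort(p, p + n, [](int a, int b) { return a < b; });  // comparator inlined into the instantiation
}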
I don't get the argument, as I don't get the 'law of leaky
abstractions' : abstractions are built by omission of details; sometimes
the right details for your purpose are omitted and you win; sometimes
the wrong details are omitted and one looses; all that doesn't mean that
abstractions are bad ;-)
> a result is interfaces like std::transform which are
> clumsy (you need to manually allocate stuff before
> calling it) and unsafe.
I don't know where you got that impression. Taking inspiration from
the Perl program Yegge mentions at https://sites.google.com/site/steveyegge2/lisp-wins
(and using C++11):
int foo[] = { 1, 2, 3, 4, 5 };
std::vector<int> bar;
std::transform(std::begin(foo), std::end(foo), std::back_inserter(bar),
               [] (int i) { return i * i; });
std::copy(bar.begin(), bar.end(), std::ostream_iterator<int>(std::cout, " "));
Sure, you can reserve space in bar to avoid piecemeal allocations
(bar.reserve(5)), but that's hardly a damning requirement.
> If Stroustrup added all those features on top of a
> safe core then you could argue that he's more
> worried about elegance than efficiency.
You may not be aware that in the '70s, Stroustrup worked on a
computer simulation at Cambridge using what at the time was considered a
supercomputer. He wrote his program in Simula, and the overhead of
garbage collection almost made the whole thing fall apart. When he
started work at Bell Labs, his goal was to create something almost as
elegant as Simula, but that wouldn't bring average hardware to its
knees.
Clearly computers have changed a lot in forty years; and things are
less computationally expensive today. And I'll be the first in line to
complain about the silly quirks in C++, or for that matter any other
language I regularly program in (C#, Java, Perl, Javascript, Python).
But I will tell you that your central premise is wrong: elegant code
does not have to be inefficient. And efficient code does not have to be
ugly.
Unfortunately, programmers continue to believe that they *must* use
C++'s low level features if they want incredible performance. Yes, if
you malloc up 100,000 objects, you are probably not going to remember to
free each of them. But you shouldn't malloc 100,000 objects. You should
use some kind of strategy that makes memory manageable; which can be
RAII, memory pools a la APR ( http://apr.apache.org/docs/apr/1.4/group__apr__pools.html
), garbage collection, or something else. The fact that some programmers
actually do try to manage 100,000 lifetimes shouldn't be taken as proof
of anything other than, perhaps, a need for more widespread knowledge
that there are better ways to do it.
If I understand things correctly then C++ has one winning point: it
says that there must be a more elegant way to build compile time
abstractions than abusing the C preprocessor.
I don't know if templates are really a better way, but when they work
Bjarne is right, and when I have to fight template compiler errors then
Bjarne must have been wrong ;-)
back_inserter... no comments.
"But I will tell you that your central premise is wrong: elegant code
does not have to be inefficient. And efficient code does not have to be
ugly."
I claim that elegant code often has to be less efficient and
efficient code has to be more ugly – that there's a tradeoff.
It's like I said that you can buy euros but that it'd cost you dollars, and
you said you disagree that a man can't own both dollars and euros.
It'd be more interesting to discuss my more algorithmic examples
before delving into the flame war about C++ which I'm sorry I've
mentioned by now, since everybody seems to ignore the algorithmic
examples.
What about C++'s dangling references and overflowing buffers? Isn't
that at the very least a cost in programmer time and correctness if not
elegance, and isn't this cost paid for efficiency?
Fixed point is not all that awful. All in all, it's just an
integer.
And you can make it much more friendly if you choose your units
right. If you want units of length and would go with fixed-point meters, maybe
just use mm and get rid of the point altogether?
With yummy 64-bit integers you can represent 2 light years with mm
precision.
Or some 460 Earth equators at nanometer precision.
Physical units in fixed point are not that bad. What's horrible is
algebra in fixed point – such as the determinant of a matrix. This stuff
doesn't really have units and it's very easy to compute utter garbage
with fixed point algebra.
I realize by now that I'm not going to convince you. Although your
most recent reply mentioning algorithmic elegance suggests that we may
not be as far apart as I originally believed.
I think one problem is that we might be using "elegance" for
different concepts. In programming, at least, I don't think of elegance
as a synonym for expensive or for additional layers of complexity. Quite
the contrary: I define elegant as straightforward code that does what it
should with minimal fuss.
By that measure, I consider much of Hakmem ( http://en.wikipedia.org/wiki/HAKMEM ) elegant, even
though (or, perhaps, especially because) the code is also concerned with
efficiency. Likewise, I consider Dijkstra's smoothsort ( http://www.keithschwarz.com/smoothsort/ ) an elegant
response to an inefficiency in heap sort. But I can't figure out why use
of a big number library would necessarily be elegant under that
definition (the example given in the original post).
Your most recent post states your position is only "that elegant code
often has to be less efficient and efficient code has to be more ugly –
that there's a tradeoff." I agree that efficiency and ugliness are
weakly correlated, but I've read enough of Dijkstra's writings to
believe that, generally speaking, efficient code is ugly because the
programmer in question made the wrong tradeoffs.
Personally, I write code as straightforward as I need to. Since I
don't work in the kernel, that is nearly always fast enough. If I need
to improve on efficiency, I don't do it by uglifying the code to shave
off a microsecond here or there; I do it by finding a better algorithm.
There are many efficient algorithms that can be coded elegantly in
languages that aren't designed to get in your way or require significant
ceremony. This approach isn't as common as I would like, and it often
requires a little more upfront design than the alternatives, but it's
the best way I know to avoid being embarrassed by code six months after
writing it.
"I define elegant as straightforward code that does what it should
with minimal fuss."
Efficiency tends to add a lot of fuss; classifier cascades being my
most "fussy" example perhaps. It is more mathematically elegant, less
buggy and quicker to write when you just apply one good classifier to
every coordinate, and it is a better program in every way except for
being prohibitively expensive.
As to the big number library or a symbolic math library (to be able
to precisely represent pi or the square root of 2, at least until they
cancel out in case they do) – it's more elegant in the sense of being
able to write straightforward code relying on nothing but algebraic
truths. I think it's largely the same thing.
"I've read enough of Dijkstra's writings to believe that, generally
speaking, efficient code is ugly because the programmer in question made
the wrong tradeoffs."
Dijkstra – a perfectionist technologist if there ever was one! The
same Dijkstra that was appalled by a presentation about a debugger and
asked why debuggers are even desirable and how many bugs we're going to
tolerate ("Seven!")
It is sometimes possible to reach something efficient and elegant by
meditating for a long, long time. This is why I included "programmer
time" in the discussion, as well as correctness – other qualities that
Stroustrup doesn't think are at odds with efficiency but they obviously
are; if you don't have the time for the long meditation needed to come
up with the correct, elegant and efficient answer, then the tradeoff is
still more obvious.
The difference between our opinions is quantitative, sort of – you
think it's "loosely correlated", I think it's very strongly correlated;
it's obviously hard to argue about these things, and one's worldview is
shaped by what one is working on, what examples such a discussion brings
to one's mind, and what one's standards with respect to "acceptable
efficiency", "acceptable elegance", etc. are. So I certainly don't
expect to convince anyone unless they happen to be on the verge of being
convinced when they stumble upon my writing...
"Physical units in fixed point are not that bad. What's horrible is
algebra in fixed point – such as the determinant of a matrix."
Well, perhaps the determinant is not that good an example, as it has
units – the original units^n. All in all, its absolute value even has a
physical meaning.
But I got your idea: the joys of math for the sake of math. :)
Maybe the thing is that you are not solving the problem, you rather convert
it from one system into another.
Perhaps, if the determinant was originally stated in fixed point, it would
be quite a hassle to convert it into the pure symbolic domain.
Well, it's worse than ^N because the matrix itself can be, say,
moments, and then it's really hairy units. You can of course track the
units through any depth of computations, I'm just saying it's becoming
increasingly less useful; scaling meters by 100 is nice, determinants of
matrices with moments involving meters are less nice, especially
considering their rather wide dynamic range and your reduced ability to say,
"I don't care about small precision issues" as you can when it's a meter
and you say "I don't care about precision below 1 cm".
I've worked on production code in fixed point for a couple of years –
very far from math for the sake of math; I know how it can get ugly.
Wow, I was about to close this tab and reread what I had written, and
noticed that I had made a typographical error. I meant to say that I
don't disagree with you on Python's efficiency. So yes, you have been
right and will likely continue to be right.
A nice article, but I think you are significantly misinterpreting the
point of numerical algorithms. Most problems worth doing flat out don't
have known analytical solutions, or if they do they are written in terms
of things that can't be evaluated for most arguments anything but
numerically (say, as confluent hypergeometric functions). So they just
can't be done analytically. If you try to do almost anything non-trivial
in Mathematica, Maple, etc. you end up doing it numerically anyways —
and the problem then is that they have a tendency to choose non-optimal
numerical methods. Arbitrary precision arithmetic is cute, but, speaking
as a scientific programmer, if somebody found a way to make it just as
fast as floating point tomorrow, it would have fairly little effect on
most of the stuff I do.
Well, I might have a wrong bias from your point of view because I'm
seeing a lot of stuff "worth doing" that is as simple as solving a
linear equation system which very much can be done analytically but is
not always trivial numerically.
Anyway – I'm not saying floating point is the source of all
headaches, just of one headache...
Great article, thank you for writing it.
So what about LuaJIT with its clean C API and FFI? Not only is it
much faster than any other dynamic language, it's also nearly as fast as
Java, and not that far away from C.
Well, I'm not saying that X can't be more efficient and elegant than
Y at the same time, just that at some point, you start needing to make a
trade-off.
With me, C is a scripting language; if you want to really max out
what a given chip manufacturing technology gives you, you need custom
processors with custom programming interfaces making even OpenCL look
pretty. But even with C the scripting language and a teeny bit of
intrinsics, I sort of think I have very good chances against LuaJIT
where 8-bit/16-bit data types are involved, etc.
" The efficiency vs. correctness, efficiency vs. programmer time,
efficiency vs. high level, etc. dichotomies are largely bogus."
Such a claim could be considered plausible in a world in which C++
doesn't exist ... but C++ quite falsifies it.
I think there is a basic economic reason for the trade-offs
seen
between different desirable qualities. Solutions that are both less
elegant and less efficient evolve out; strictly worse choices have
already been eliminated from the set of alternatives.
It is for the same reason why computer memory is either cheap or fast
by the byte, or why we usually have to trade off space for time when
designing algorithms.
"But even with C the scripting language and a teeny bit of
intrinsics, I sort of think I have very good chances against LuaJIT
where 8-bit/16-bit data types are involved, etc."
The LuaJIT FFI gives you full access to C data types and under some
circumstances will do optimizations for you that would require tedious
manual effort in C.
Here’s someone trying to prove you wrong, Yossi: On the Cleverness of Compilers.
You know, it's amazing just how ugly Lisp can be, what with all the
cars and cdrs.
As to proving me wrong... He says that efficiency comes from
specialization; the only question is how much of this specialization can
happen automatically, and I think we can all agree that some of the
special circumstances that make optimizations possible are only known to
the programmer and if he doesn't describe them to the compiler, the
optimization won't happen. The only question is how significant the
effect of this fact is, and how high-level and "nice" these descriptions
can be.
Yes. The reason I say Alexey is trying to prove you wrong is that
while the specificity vs generality dichotomy is tautologically
identical to (one axis of) efficiency, efficiency vs elegance isn’t.
Your thesis is that the dichotomies are still equivalent, which history
so far bears out. Alexey’s work is an attempt to show otherwise – will
he succeed, who knows... so I’m mentioning this as a ray of hope more
than as an argument. Wouldn’t it be nice if elegance and efficiency
proved to be compatible... and a pony.
This just seems like survivorship bias. An algorithm / tool /
technology tends not to remain in use if something that is both more
elegant and more efficient is available. Therefore, if you survey the
well known options you'll see a negative relationship between elegance
and efficiency. It doesn't imply a general relationship nor does it
preclude the discovery of an approach that is both more elegant and more
efficient than anything available today.