If a tree falls in a forest, it kills Schrödinger's cat
Schrödinger used to have this quantum cat which was alive
and dead at the same time as long as nobody opened the box, and it was the very act of looking at the cat that made it
either alive or dead. Now, I'm not sure about this quantum stuff, but if you ask me you'd always find a dead cat upon opening
the box, killed by the act of not looking. In fact, if you open any random box nobody was looking into,
chances are you'll find a dead cat there. Let me give an example.
I recently chatted with a former employee of a late company I'll call Excellence (the real name was even worse). Excellence
was a company with offices right across the street that kept shrinking until the recent financial crisis. It then had to
simultaneously fire almost all of its remaining employees – employees carefully selected as its best during the previous
years, when others were continuously fired at a lower rate. This gave us a whole lot of great hires, including MV, the co-worker
in this story (though he was among those who guessed where things were going and crossed the street a bit earlier).
Rumor has it that, to the extent possible, Excellence used to live up to the expectations created by its name. In particular,
without being encouraged or forced by customers to comply with any Software Development Methodology such as the mighty and
dreadful CMM, they had (as CMM puts it) not only
Established, but rigorously Maintained an elaborate design, documentation and review process which preceded every coding effort.
MV had little but praise for the process, except perhaps that occasionally someone would invent something
awfully complicated that made the code incomprehensible, and it would sail through the review process because it sounded too
smart for anyone to object.
Now, in our latest conversation about how things were at Excellence, MV told me how he once had to debug a problem in a core
module of theirs, one that hadn't been changed in years. There, he stumbled upon a loop searching for a value. He noticed
that when the value was found, the loop wouldn't terminate – a continue-instead-of-a-break kind of thing. Since the right value
tended to be found pretty early in the loop, and because the loop sat at such a strategic place, the test cases everyone was
running took minutes instead of seconds to finish. Here's a dead cat in a gold-plated box for ya, and one buried quite deeply.
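For the record, here's a minimal sketch of that kind of bug in C++ (the actual code is anyone's guess; the names are made up):

```cpp
#include <vector>
#include <cstddef>

// A made-up reconstruction of the bug: the search keeps scanning after
// a match instead of stopping, so a loop that should usually exit after
// a few iterations always walks the whole (large) container.
int find_index(const std::vector<int>& values, int wanted) {
    int found = -1;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] != wanted)
            continue;                     // skip non-matches
        found = static_cast<int>(i);
        break;  // the fix; the original effectively had "continue" here,
                // so the loop ran to the end even after finding the value
    }
    return found;
}
```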
My own professional evolution shaped my mind in such a way that it didn't surprise me in the slightest that this slipped past
the reviewer(s). What surprised me was how this slipped past the jittery bodybuilder. You see, we have this Integration Manager, one of his hobbies being
bodybuilding (a detail unrelated, though not entirely, to his success in his work), and one thing he does after integrating
changes is he looks at the frame rate. When the frame rate seems low, he pops up a window with the execution profile, where the
program is split into about 1000 parts. If your part looks heavier than usual, or if it's something new that looks heavy
compared to the rest, it'll set him off.
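In code, that eyeballing could look roughly like this – a minimal sketch, assuming per-part frame timings are recorded somewhere; the names and the 1.5x "heavier than usual" threshold are made up:

```cpp
#include <cstdio>
#include <map>
#include <string>

// Compare each part's frame time against a recorded baseline and flag
// whatever got noticeably heavier, or is new and heavy.
void flag_heavy_parts(const std::map<std::string, double>& baseline_ms,
                      const std::map<std::string, double>& current_ms) {
    const double threshold = 1.5;  // "heavier than usual" ratio, a guess
    for (const auto& part : current_ms) {
        auto it = baseline_ms.find(part.first);
        if (it == baseline_ms.end()) {
            std::printf("new part: %s at %.2f ms\n",
                        part.first.c_str(), part.second);
        } else if (part.second > it->second * threshold) {
            std::printf("suspicious: %s went from %.2f to %.2f ms\n",
                        part.first.c_str(), it->second, part.second);
        }
    }
}
```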
So I asked MV how come the cat, long before it got dead and buried, didn't set off the jittery bodybuilder. He said they
didn't have one for it to set off. They were translating between the formats of different programs. Not that performance didn't
matter – they worked on quite large data sets. But to the end user, automatic translation taking hours had about the same value
as automatic translation taking seconds – the alternative was manual translation taking weeks or months. So they took
large-scale performance implications of their decisions into account during design reviews. Then once the code was done and
tested, it was done right, so if it took N cycles to run, it was because it took N cycles to do whatever it did right.
And really, what is the chance that the code does everything right according to a detailed spec it is tested against, but
there's a silly bug causing it to do it awfully slowly? If you ask me – the chance is very high, and more generally:
- Though not looking at performance followed from a reasonable assessment of the situation,
- Performance was bad, and bad enough to become an issue (though an unacknowledged one), when it wasn't looked at,
- The system in general was certainly "looked at", apparently by more eyes and from more angles than "an average system", but
that didn't help,
- So either you have a jittery bodybuilder specifically and continuously eyeballing something, or that something sucks.
Of course you can save effort using jittery automated test programs. For example, we've been running a regression testing
system for about a year. I recently decided to look at what's running through it, beyond the stuff it reports as errors that
ought to be fixed (in this system we try to avoid false positives to the point of tolerating some of the false negatives, so it
doesn't complain loudly about every possible problem). I found that:
- It frequently ran out of disk space. It was OK for it to run out of disk space at times, but it did it way too often now.
That's because its way of finding out the free space on the various partitions was obsoleted by the move of the relevant
directories to network-attached storage.
- At some of the machines, it failed to get licenses for one of the compilers it needed – perhaps because the env vars were set
to good values for most users but not all, perhaps because of a compiler upgrade it didn't take into account. [It was OK for it
to occasionally fail to get a license (those are scarce) – then it should have retried, and in the worst case reported a license
error; see the sketch after this list. However, the compiler error messages it got were new to it, so it thought something just
didn't compile. It then ignored the problem on otherwise good grounds.]
- Its way of figuring out file names from module names failed for one module which was partially renamed recently
(don't ask). Through an elaborate path this resulted in tolerating false negatives it really shouldn't have.
- ...
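As for the license failures, here's a minimal sketch of the retry it should have done, assuming a POSIX popen-style way of capturing compiler output; matching on the substring "license" is an assumption, since every tool words these errors differently – which is exactly how the system got fooled in the first place:

```cpp
#include <cstdio>
#include <string>

// Run a command, capturing its output (POSIX popen; how the real
// harness runs the compiler and captures output is unknown to me).
int run_and_capture(const std::string& cmd, std::string& output) {
    output.clear();
    FILE* pipe = popen((cmd + " 2>&1").c_str(), "r");
    if (!pipe) return -1;
    char buf[4096];
    while (fgets(buf, sizeof buf, pipe))
        output += buf;
    return pclose(pipe);
}

// Assumption: the license manager mentions "license" in its errors.
bool looks_like_license_error(const std::string& output) {
    return output.find("license") != std::string::npos;
}

int compile_with_retries(const std::string& cmd, int max_attempts) {
    std::string output;
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        int status = run_and_capture(cmd, output);
        if (status == 0)
            return 0;                        // compiled fine
        if (!looks_like_license_error(output))
            return status;                   // a genuine compile error
        // license contention: retry instead of concluding
        // "something just didn't compile"
    }
    std::fprintf(stderr, "no license after %d attempts\n", max_attempts);
    return 1;
}
```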
And I'm not through with this thing yet, which to me exemplifies the sad truth that while you can have a cat looking at other
cats to prevent them from dying, a human still has to look at that supervisor cat, or else it dies, leading to the eventual
demise of the others.
If you don't look at a program, it rots. If you open a box, there's a dead cat in it. And if a tree falls in a forest and no
one is around to hear it, it sucks.
Well, if there's one thing I'm convinced is useless, it's got to be
profilers. I turned around on debuggers, but only because they actually
became useful over time; profiling has the same inherent problem as
using a debugger, in that it seems to completely turn off people's
brains.
So you get results like someone coming to you and saying that the
performance bottleneck of an app is string concatenation. So you give a
noncommittal grunt instead of shaming them publicly in the interest of a
happy workplace, and off they go rewriting string classes, making
stringbuffers for concatenation. Then they throw a meeting about it,
and two of the would-be alpha coders get into a long argument about
whether char* or string is better to use. As if, any time, ever, the
performance of string concatenation mattered to someone who made a sane
program.
Then the boss offers you a delicious can of soda if you find out why
'the boys' have been in a big uproar the last couple of days. So you go and look
and see wtf is really happening. Oh! A parser. Fancy that. A parser
written without a scanner! And no one knows what that is, or why string
concatenation and parsers don't go together well except in throwaway
code (which, if it were that, ought to be in Perl or some language that is
actually good with strings). Likewise, ideas like tokens and grammars
are totally alien.
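For the curious, here's the quadratic shape this usually takes – a sketch, not their actual code: building a token with "s = s + c" copies the whole prefix on every character, which plain append (let alone a real scanner that just tracks offsets into the input) avoids:

```cpp
#include <string>

// Scanning an n-character input this way costs O(n^2): each
// "token = token + c" allocates a new string and copies everything
// accumulated so far.
std::string slow_token(const std::string& input) {
    std::string token;
    for (char c : input)
        token = token + c;   // copies all of token every iteration
    return token;
}

// Amortized O(1) append makes the whole scan O(n); a real scanner
// wouldn't even copy, just remember [begin, end) offsets.
std::string fast_token(const std::string& input) {
    std::string token;
    token.reserve(input.size());
    for (char c : input)
        token += c;
    return token;
}
```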
After about the third time this happens, the flaw's apparent no
matter how apathetic you are. The problem is that a profiler can help
with obvious mistakes like the one you describe, but usually performance
problems are not related to stupid mistakes – and the ones that are, are
easy for any peabrain to find anyway.
Of course we all get annoyed by quotes from CSC 101 about not
optimizing, or optimizing only by choice of algorithm. But algorithms
aside, there's always something holding you back, and usually it's
something like overbloated code size, disk access, oversimple blocking,
etc. etc. – all things you can't figure out by instruction counting, and
nowadays instruction counting is never the limiting factor anyway, so it
becomes doubly useless. Not to mention you can't optimize everything, so
hunting through each scrap is useless, especially when the people
causing problems usually cause them systemwide, not in a way that shows
up in the code they own. Such as when your newly minted CS grad is
assigned some init process that allocates a huge amount of memory,
mostly in tiny chunks via string concatenation, hopelessly fragmenting
your memory before any real work gets done.
Ultimately, I think the only real thing is testing, plus having
people who know what they are doing. It's just like having a rehearsal
for a big play. It's guaranteed to never work without rehearsal, no
matter how great the actors or the play, but with enough rehearsal even
the worst play with awful actors could at least complete without a major
mishap. Performance gains can also be astounding, but it's almost always
a case of either a waste of time, or requiring so much detailed
knowledge of every bit of C++ and your app that it's a big waste of
the effort of what's probably your best technical asset.
@antiguru: :) Seriously! Profilers (and the kind of issues they
identify) are simply overrated. Agree a hundred percent about the play
analogy; but it also has to be said that the actors need to be smart
enough (to a minimum extent at least) and have some presence of mind,
for Murphy will always be around.
@Yossi: A good post, thanks. Once, my team was asked to look into a
peculiar problem with a strange pattern, involving elements in a list. A
deeper inspection of the underlying library (which was built some five
years ago, with no fixes done to date) revealed a dead cat: an off-by-one
error. Why did it not stink till now? No one had tested it with a case
that would have failed. :D
Profilers are useless? That's the dumbest idea I've heard in a
while.
Having recently adopted two adorable kittens, I thought you were
going to make the obvious point "cats in boxes die". I tried to
transport my two kittens in a box, but they quickly escaped and crawled
underneath the brake pedal, so cats not in boxes almost made us all
die.
Instead, what I'm getting out of this is that cats need to be looked
after by humans or other cats, as long as a human is watching a cat
somewhere. I'm not watching my cat chew an electrical wire right now, I
hope that it doesn't die.
I completely empathize with the desire to automate everything. I'd
like to automate as much of my life as possible. Clearly, the important
thing to consider is that the automation process is easily made
fallible, and that someone is going to notice eventually.
Problems are better noticed by a jittery bodybuilder in the office
than by a run-over cat on the street.
@Antiguru: if performance isn't measured, it becomes crap; I'm not
saying that it can't become crap otherwise or that sometimes it's OK
when it's crap, just that it will become crap if not measured. As to
profilers – you generally need custom ones, timers at logically
meaningful places plus counters of "things" (pixels processed/basic
blocks optimized/whatever).
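A minimal sketch of what I mean by a custom profiler, with made-up names: a scoped timer around a logically meaningful stage, plus a counter of the "things" it processed, so you get time per thing rather than raw instruction counts:

```cpp
#include <chrono>
#include <cstdio>

// Times a stage for the duration of its scope and reports time per
// "thing" (pixel processed, basic block optimized, whatever).
struct StageTimer {
    const char* name;
    long long things = 0;
    std::chrono::steady_clock::time_point start;

    explicit StageTimer(const char* n)
        : name(n), start(std::chrono::steady_clock::now()) {}

    ~StageTimer() {
        using namespace std::chrono;
        double ms = duration<double, std::milli>(
                        steady_clock::now() - start).count();
        std::printf("%s: %.2f ms, %lld things, %.4f ms/thing\n",
                    name, ms, things, things ? ms / things : 0.0);
    }
};

// Usage: wrap a logically meaningful stage and count its units of work.
void process_frame(const unsigned char* pixels, int n) {
    StageTimer t("process_frame");
    for (int i = 0; i < n; ++i) {
        // ... per-pixel work ...
        ++t.things;
    }
    (void)pixels;
}
```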
@raghava: that an uncovered case didn't work is perfectly legitimate
and basically unavoidable; what is avoidable is covered cases that work
defectively, though.
@Tommy: it seems cats and death go well together. Hence the nine
lives, presumably.
from the tao:
A well-used door needs no oil on its hinges.
A swift-flowing stream does not grow stagnant.
Neither sound nor thoughts can travel through a vacuum.
Software rots if not used.
These are great mysteries.
"Premature optimization is the root of all evil." But after a certain
point, a well-designed profiler can expose contradictory assumptions
between a subroutine and its caller.
Of course sometimes dead cats were simply put on top of leaky
drainpipes; especially it seems when it isn't your code and sometimes
because it IS your code [or mine :P]
I would always hesitate before touching anything that's been running
for years [profiled badly or not]...
Well, I'm conservative enough to hesitate to touch anything that's
been running for weeks if not days, it's just that sometimes you have
to.
Not using a profiler doesn't mean never measuring performance. It just
means that you don't get sucked into a false idea of how optimization is
supposed to work, which seems to be the case for everyone I personally
know who uses one.
An argument I got into recently was over whether virtual methods are
slow. My idea was to use pure virtual methods to accomplish some tasks
in a sort of base node object. Mr. Scruples just couldn't allow the word
virtual to pass his fingers, let alone a pure virtual.
He even made up a case and executed it a billion times or so to prove
his point. It was a little faster, but not a lot. But I knew for a fact
the virtual was actually faster in a real-world app. The virtual lookup is
static data. His alternative was this dumb 'control' object, an object with
its own function pointer. So it's not static but there for every single
listener.
So I made a real-world case and ta-dah! It's obvious the virtual
method is faster in the case we actually care about, and by a margin
wider than it had been in the BS case.
The reason is that as the bits fly, none of this crap is in cache any
more; what's worse, it's not together in memory and it has some bizarre
indirection, so you wind up doing 2-3 cache misses on every single lookup
when things are not in cache. Plus, it takes up more cache, and pretty
much everything is a node, so it was also degrading the whole system
by helping choke the cache out.
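To make the two layouts concrete – a sketch with made-up names, not the actual code:

```cpp
// With virtuals, every node of a class shares one static vtable: the
// dispatch loads the vptr (inside the node you're already touching)
// and then a function pointer from a vtable that stays hot in cache
// because all nodes share it.
struct VirtualNode {
    virtual void update() = 0;
    virtual ~VirtualNode() {}
    int data[4];  // ... node payload ...
};

// With the 'control' scheme, every node drags a pointer to its own
// Control object holding its own function pointer: nodes are bigger,
// and dispatch chases loads into scattered per-node memory that is
// rarely in cache.
struct ControlNode;

struct Control {
    void (*update_fn)(ControlNode*);
    // ... more per-listener state ...
};

struct ControlNode {
    Control* control;
    int data[4];  // ... node payload ...
};

inline void dispatch(ControlNode* n) {
    n->control->update_fn(n);  // node -> Control -> function:
                               // 2-3 misses when everything is cold
}
```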
Of course not everyone just knows this off the top of their head, but you
are going to solve 100X more performance issues by knowing than by just
measuring and coming to an obvious conclusion.
Of course you need some mental model of why things could take the
time they take to make sense of measurements; and, when you write
optimized code, obviously you're not doing measurement-driven iteration
(change, measure, change, measure...) but rather have some overall plan
of an optimized implementation, having weighed alternatives in your head
rather than done measurements for N implementations.
All I'm saying is that on top of that, when you have an app coded by
reasonable people, where performance isn't then continuously measured,
it becomes crap.
One phrase I found (sadly) true is that if everything works then it's
just because you didn't look closely enough.
It's *so* common in complex systems to look for a bug and then after
some hunting you think you nailed it down and say "AH!... here it is!!
when we introduced that change this part of the program wasn't updated
correctly and .. blah blah blah" and indeed you found a true genuine
bug!
But it's another one, not the one you were chasing...
dead cats are everywhere...
Andrea
PS: With code efficiency it's even worse... especially when people
start doing "optimizations" without using a decent (i.e. passive)
profiler first. I've seen a major code rewrite because of a single
operation that was O(n^2), was taking 99.9% of CPU time, and could have
been coded as O(1) with minimal impact on the rest of the program.
After the change, computation in one use case went down from 25 minutes
to 3 seconds (and those were mostly the update of the progress bar that
had been added to inform users that the system wasn't in fact dead).
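For illustration, one common shape of the O(n^2)-that-could-be-O(1) story – a guess at the pattern, not the actual code: a per-element linear scan that a hash lookup replaces.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// O(n) scan per item, O(n^2) total.
int count_known_slow(const std::vector<std::string>& items,
                     const std::vector<std::string>& known) {
    int count = 0;
    for (const auto& item : items)
        for (const auto& k : known)
            if (item == k) { ++count; break; }
    return count;
}

// Build the set once; each lookup is O(1) expected, O(n) total.
int count_known_fast(const std::vector<std::string>& items,
                     const std::vector<std::string>& known) {
    std::unordered_set<std::string> lookup(known.begin(), known.end());
    int count = 0;
    for (const auto& item : items)
        count += static_cast<int>(lookup.count(item));
    return count;
}
```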
Yes, profilers are useless in most cases. The problem is that a good
programmer will use a correct and efficient method from the start, and a bad
programmer can't improve by only using a profiler or some other tool
instead of reading some good books.
The existence of profilers and optimizing compilers is being used as an excuse
for writing sloppy code. Today many students are taught: "write working
code first, think about speed later (usually never)".
P.S. Sorry for bad english.
@noop:
"Existence of profilers and optimizing compilers is being used as excuse
for writing sloppy code. Today many students are taught: “write working
code first, think about speed later( usually never )”."
People are taught this for a good reason. If you try to write
"correct and efficient" code from the start as you recommend, you are
making your task artificially more difficult, and therefore increasing
the likelihood of making errors.
Instead, a more robust approach would be to focus on writing correct
code first, with some decent test coverage, before reviewing your design
and code and refactoring/optimising.
This is not to say that you should purposely write "sloppy" code or
reject all thoughts of performance, just that the focus should not be on
writing optimal code in the first pass. To presume that you can write
correct and well-optimised code on the first attempt is optimistic and
rather dangerous.
Going back to profilers, I don't buy the idea that they only help
people to solve trivial optimisation problems.
A more important use is to justify where optimisation is worthwhile.
For example, imagine a programmer comes to a team meeting claiming
that some important feature was currently taking O(n^3) time when it
could be O(n log n) with another algorithm, and that we should replace
the algorithm immediately.
Someone raises their hand and asks how much time this would save. Nobody
answers, so the programmer does a profiling run and discovers that the
"slow" algorithm currently takes 50ms, which is dwarfed by the slow
network queries that are performed immediately afterwards.
Like any tool, profilers can be misused, but they certainly have some
value.