Swimming beaver

Lack of wealth through lack of empathy

A few days ago a coworker stepped into the room of our Integration Manager, with his iPhone in his hand and his face glowing with happiness, telling us he had this great new game on his phone. He asked for permission to take a picture of the Integration Manager, and did so to the sound of dramatic, though somewhat repetitive, music. He then marked the areas around the eyes and the mouth of his subject.

Then the game started. The guy pressed some buttons, delivering virtual blows to the photographed subject, who couldn't really fight back according to the rules of the "game". The face on the photograph got covered with blood and wounds (especially the eyes and the mouth), and the speaker yelled out something designed to convey pain. And the guy stood there and kept punching and laughed his ass off.

Then he talked about how the iPhone was this great thing and how iPhone app developers got rich. And I told him in a grim voice to go ahead and get rich off iPhone apps, and he sensed disrespect, to him or to the iPhone or both. But mainly it wasn't disrespect, it was desperation. I told him to go ahead and do these apps because he probably can, and I certainly can't, so I have no choice but to leave all this wealth for him to collect.

Take this punching app. Where do I fit in?

For instance, I know enough computer vision to eliminate the need to manually select the eyes and the mouth. You find the eyes using a circular Hough transform, and then you have a good idea where to look for the mouth, and from there it can't be too hard. How much value would this automation add? The happy user I observed didn't seem bothered by the need to select; worse, I suspect having to do it made him happier – this way he actually worked to deliver his punches.

And then what I couldn't do, and I couldn't do this for the life of me, is to invent this app in the first place. It's just something that I don't think I'll ever understand, something worse than, I dunno, partial differential equations which you can keep digging into, something more like the ability to distinguish between colors, which is either there or not.

Could someone PLEASE explain just how can such a profoundly idiotic activity yield any positive emotions in a human soul?

Perhaps it's because our Integration Manager is a bodybuilder, with bulging biceps and protein milkshakes and stuff, so there's no other way to beat him up for that guy? Well, he could probably beat me up, and he still enjoyed going through the process with me. So this just shows how childish my attempt to dissect the psychology of the phenomenon is – I'm not even close. How could I ever think of something I can't even begin understanding after I've already seen it?

The serious, or should I say sad, side of this is that to develop something valuable to users, you need empathy towards these users, and the way empathy works is through identifying with someone, and so the best chance by far to develop something valuable is to develop something for yourself. But someone like me, who bought his first PC in 2007 and uses a cell phone made in 2004, with no camera, no Internet connection, no nothing, simply can't develop something for himself because he clearly just doesn't need software.

The only chance for a programmer who uses little software except for development tools is to develop development tools. Take Spolsky, for instance – here is a programmer who is, by the programmers' standard, extremely user-aware and user-centric, as evident in his great UI book, among other things, and yet he eventually converged to developing software for programmers, having much less success with end-user products.

The trouble with development tools is that the market is of course saturated, with so many developers with otherwise entrepreneurial mindsets being limited by their lack of empathy to this narrow segment of software. This is why so many programs for programmers are priced at zero – not because copying software costs next to nothing (developing it costs far from nothing, which is why there aren't that many free face-punching apps), but because with software for developers there's unusually fierce empathy-driven competition.

So every time I see an inexplicably moronic hit app, I sink into sad reflections about my destiny of forever developing for developers, being unable to produce what I can't consume and watching those with deeper understanding of the human nature reaping all the benefits of modern distribution channels.

P.S. To the fellow programmer who, as it turns out, uses FarmVille – you broke my heart.

API users & API wrappers

Suppose you have a sparse RAM API, something along the lines of:

  • add_range(base, size)
  • write_ram(base, bytes)
  • read_ram(base, size)

People use this API for things like running a simulated CPU:

  1. define the accessible memory with add_range()
  2. pass the initial state to the simulator with write_ram()
  3. run the simulation, get the final state with read_ram()

Suppose this API becomes a runaway success, with a whopping 10 programmers using it (very little irony here, >95% of the APIs in this world are used exclusively by their designer). Then chances are that 9 of the 10 programmers are API users, and 1 of them is an API wrapper. Here's what they do.

API users

The first thing the first API user does is call you. "How do I use this sparse thing of yours?" You point him to the short tutorial with the sample code. He says "Uhmm. Errm…", which is userish for "Come on, I know you know that I'm lazy, and you know I know that docs lie. Come over here and type the code for me." And you insist that it's actually properly documented, but you will still come over, just because it's him, and you personally copy the sample code into a source file of his:

add_range(0x100000, 6) # input range
add_range(0x200000, 6) # output range
write_ram(0x100000, "abcdef")
# run a program converting the input to uppercase
print read_ram(0x200000, 6) # should print "ABCDEF"

It runs. You use the opportunity to point out how your documentation is better than what he's come to expect (though you totally understand his frustration with the state of documentation in this department, this company and this planet). Anyway, if he has any sort of problem or inconvenience with this thing, he can call you any time.

The next 8 API users copy your sample code themselves, some of them without you being aware that they use or even need this API. Congratulations! Your high personal quality standards and your user-centric approach have won you a near-monopoly position in the rapidly expanding local sparse RAM API market.

Then some time later you stumble upon the following code:

add_range(0x100000,256)
add_range(0x200000,1024)
add_range(0x300000,1024)
...
add_range(0xb00000,128)
...
add_range(0x2c00000,1024)
...

Waitaminnit.

You knew the API was a bit too low-level for the quite common case where you need to allocate a whole lot of objects, doesn't matter where. In that case, something like base=allocate_range(size) would be better than add_range(base,size) – that way users don't have to invent addresses they don't care about. But it wasn't immediately obvious how this should work (the Nth call to allocate_range() appends a range right after the previously allocated one, but where should the first call to allocate_range() put things? What about mixing add_range() and allocate_range()? etc.)

So you figured you'd have add_range(), and then whoever needed to allocate lots of objects, doesn't matter where, could just write a 5-line allocate_range() function good enough for him, though not good enough for a public API.
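
For illustration, here's roughly what such a 5-line function could look like, in the style of the sample code above. This is just a sketch, not part of the actual API; the starting address is an arbitrary choice a user would make for his own purposes:

next_base = 0x100000 # an arbitrary starting address

def allocate_range(size):
    global next_base
    base = next_base
    add_range(base, size)
    next_base += size
    return base

With something like this, the wall of add_range() calls above becomes a series of in_base = allocate_range(256), out_base = allocate_range(1024) and so on, without anyone having to invent the addresses.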

But none of them did. Why? Isn't it trivial to write such a function? Isn't it ugly to hard-code arbitrary addresses? Doesn't it feel silly to invent arbitrary addresses? Isn't it actually hard to invent constant addresses when you put variable-sized data there, having to think about possible overlaps between ranges? Perhaps they don't understand what a sparse RAM is? Very unlikely, that, considering their education and experience.

Somehow, something makes it very easy for them to copy sample code, but very hard to stray from that sample code in any syntactically substantial way. To them, it isn't a sparse RAM you add ranges to. Rather, they think of it as a bunch of add_range() calls with hexadecimal parameters.

And call add_range() with hex params they promptly will, just as it's done in the sample. And they'll complain about how this API is a bit awkward, with all these hex values and what-not.

API wrappers

If there's someone who can see right through syntax deep into semantics, it's the tenth user of your API, or more accurately, its first wrapper. The wrapper never actually uses an API directly in his "application code" as implied by the abbreviation, standing for "Application Programming Interface". Rather, he wraps it with another (massive) layer of code, and has his application code use that layer.

The wrapper first comes to talk to you, either being forced to use your API because everybody else already does, or because he doesn't like to touch something as low-level as "RAM" so if there's already some API above it he prefers to go through that.

In your conversation, or more accurately, his monologue, he points out some admittedly interesting, though hardly pressing issues:

  • It's important to be able to trick a program using the sparse RAM API into allocating its data in specific address ranges, so that the resulting memory map is usable on certain hardware configurations and not just in simulations.
  • In particular, it is important to be able to extract the memory map from the section headers of executables in the ELF and COFF format.
  • Since add_range() calls are costly, and memory map formats such as the S-Record effectively specify a lot of small, adjacent ranges, there is a need for a layer joining many such ranges.
  • An extensible API for the parsers of the various memory map formats is needed.

When you manage to terminate the monologuish conversation, he walks off to implement his sparse RAM API on top of yours. He calls it SParser (layer lovers, having to invent many names, frequently deteriorate into amateur copywriters).

When he's done (which is never; let's say "when he has something out there"), nobody uses SParser but him, though he markets it heavily. Users won't rely on the author who cares about The Right Thing but not about their problems. Other wrappers never use his extra layers because they write their own extra layers.

However, even with one person using it, SParser is your biggest headache in the sparse RAM department.

For example, your original implementation used a list of ranges you (slowly) scanned through to find the range containing a given address. Now you want to replace this with a page table, so that, given an address, you simply index into a page array with its high bits and either find a page with the data or report a bad address error.
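
Schematically, the difference between the two lookups is something like this – a Python-flavored sketch rather than the actual implementation, with the 4K page size and the helper names made up for the illustration:

PAGE_SIZE = 4096 # an arbitrary page size for the illustration

# old implementation: linearly scan a list of (base, data) ranges
def read_byte_slow(ranges, addr):
    for base, data in ranges:
        if base <= addr < base + len(data):
            return data[addr - base]
    raise IndexError("bad address: 0x%x" % addr)

# new implementation: index a page table with the high bits of the address
# (a dict here; in the real thing presumably an array indexed by addr >> 12)
def read_byte_fast(page_table, addr):
    page = page_table.get(addr // PAGE_SIZE)
    if page is None:
        raise IndexError("bad address: 0x%x" % addr)
    return page[addr % PAGE_SIZE]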

But this precludes "shadowing", where you have overlapping segments, one hiding the other's data. You thought of that as a bug in the user code your original implementation didn't detect. The wrapper thought it was a feature, and SParser uses it all over to have data used at some point and then "hidden" later in the program.

So you can't deploy your new implementation, speeding up the code of innocent users, without breaking the code of this wrapper.

What to do

Add an allocate_range() API ASAP, update the tutorial, walk over to your users to help replace their hex constants with allocate_range() calls. Deploy the implementation with the page table, and send the complaining wrapper to complain upwards along the chain of command.

Why

Your users will switch to allocate_range() and be happy, more so when they get a speed-up from the switch to page tables. The wrapper, constituting the unhappy 10% of the stakeholders, will have no choice but fix his code.

Ivan drank half a bottle of vodka and woke up with a headache. Boris drank a full bottle of vodka and woke up with a headache. Why drink less?

Users are many, they follow a predictable path (copy sample code) and are easily satisfied (just make it convenient for them to follow that path). Wrappers are few, they never fail to surprise (you wouldn't guess what and especially why their layers do), and always fail to be satisfied (they never use APIs and always wrap them). Why worry about the few?

The only reason this point is worth discussing at all is that users offend programmers while wrappers sweet-talk them, thus obscuring the obvious. It is natural to feel outrage when you give someone an add_range() function and a silly sample with hex in it, and not only do they mindlessly multiply hex numbers in their code, but they blame you for the inconvenience of "your API with all the hex in it". It is equally natural to be flattered when someone spends time to discuss your work with you, at a level of true understanding ("sparse RAM") rather than superficial syntactic pattern matching ("add_range(hex)").

He who sees through this optical illusion will focus on the satisfaction of the happy many who couldn't care less, securing the option to ignore the miserable few who think too much.

Digital asses in the computing industry

Ever noticed how academic asses are analog and industrial asses are digital? It's legitimate to not know whether P equals NP, or to not know what x is if x*2=y but we don't know y, for that matter. But it isn't legitimate to not know how many cycles, megabytes or – the king of them all – man-months it will take, so numbers have to be pulled out of one's ass.

The interesting thing is that the ass adapts, that the numbers pulled out of this unconventional digital device aren't pure noise. Is it because digital asses know to synchronize? Your off-by-2-months estimation is fine as long as other estimations are off by 5. But it's not just that, there must be something else, a mystery waiting to be discovered. We need a theory of computational proctology.

Ever noticed how painful the act of anal estimation is for the untrained, um, mind, but then eventually people actually get addicted to it? Much like managers who learn that problems can be made to go away by means such as saying a firm "No", without the much harder process of understanding the problem, not to mention solving it? Anal prophecy is to the technical "expert" the same raw enjoyment that the triumph of power over knowledge is to the manager. "Your powers are nothing compared to mine!"


There once was a company called ArsDigita (I warmly recommend the founder's blog and have his Tenth Rule tattooed all over my psyche), a name I tend to misread as "ArseDigital" – a tribute to an important method of numerical analysis and estimation in the computing industry.

The Virtue of a Manager

I never managed a group larger than 5 people, luckily for the people in the group (perhaps more so for those remaining outside). Good managers are hard to find, which is the basis of my self-motivating motto: "This job could have been done worse". Such is the background for the assortment of pearls of wisdom presented below. As to the title "The Virtue of a Manager", it's a ripoff of Paul Krugman's exquisite title "The Conscience of a Liberal". "The Private Part of a Self-Important Self-Description" is a great template.

***

A prime virtue of a manager is the ability to take pride in someone else's work.

No, seriously. We've recently deployed a debugger internally and an algorithm developer had a look at it. I knew it was good, but it's used to debug the sort of thing algo devs hate: code with an anal-retentive performance focus. So the last thing I expected was praise, but praise it the guy did.

Now, I had previously known proud moments from having done things myself, and here I had this proud moment with 90% of the work done by someone else. And I'm telling you, it was just like the real thing.

***

The defining trait of a manager is the distinctly wide gap between responsibility and understanding.

By far the funniest spot to have a gap at, hence the easiest target for a low blow: try to make jokes about a gap between one's teeth and you'll soon be exhausted, but this here is gold. This is mean-spirited though. Imagine living with a gap between your responsibility and your understanding and everybody laughing at you – how would that make you feel? Show compassion.

***

One can have the title of a manager or nominal reports for any of a number of reasons:

  • An HR system with per-title wage ceilings: can't give someone a raise without faking a title.
  • A diametrically opposed case: some forms of brain damage cause people to accept lower paychecks given more impressive titles, larger rooms, etc.
  • Someone is too senior to report to a team leader but doesn't want a team to report to him, either.

In a roomful of managers, how do you find the real ones among this variety – not "real" as opposed to incompetent or unimportant, but "real" as opposed to fake?

There are several cues, for example, only real managers can have other managers report to them. But the perfect, if-and-only-if discriminator is that real managers don't write code. (The precise rule is that they can spend up to 2% of their time on a favorite piece of code without getting disqualified.)

***

The principal function of a manager is being the responsible adult.

Some managers occasionally point this out in frustration, both because they mourn their technical skills, which dry up during their current gig where they only get to exercise adulthood, and because being the adult means getting tired of the annoying kids. A gal who both managed and met literally hundreds of managers during her career in some consulting agency said "Now I really understand management" when she got to babysit.

This is why I have a hard time believing management can be taught – you can't teach adulthood, it can only result from people growing up by themselves. I'm not sure if this feeling is fully aligned with reality, but quite a few very successful managers never went to a management school (at least one of those is somewhat critical of MBAs), and some of those who went say it was worthless in terms of useful things learned.

The opposite is also true: childishness is fitting for a programmer. We were two fake code-writing managers in a meeting with one real one, and at one point the real one said: "Let's not be childish about this". The technically correct reply to her would have been "I'M NOT CHILDISH ABOUT THIS, HE IS!", but I suppressed it for tactical reasons. Some time later I told her: "You don't want us to stop being childish about this, not as long as you're interested in our output as programmers. Recall: the reason you aren't still programming is that you weren't childish enough to truly enjoy this sort of game."

And in fact since she started managing 20 programmers, she's been talking about her work all the time, which she didn't when she was programming. Well, some people like to play and some prefer to babysit. (I'm not sure where this leaves the quasi-managers who write code; presumably some are the elder and most responsible kid while others are the most restless who invent games for the gang.)

***

I've recently got a driving license. One thing I learned was that someone pushing his (presumably broken) car along the road is a "driver" as far as the law is concerned. I find this counter-intuitive, probably because pushing a car is not categorized in my head as "driving experience", but, at least in Israel, that's the law.

Likewise, doing the work of three people is not what most of us associate with "managerial responsibility". However, if you're given two reports without a drive of their own to work, that's what your responsibility will be.

***

A manager will have favorite words. For example: acute (critical), priorities, agenda, rationale, integrity (shoot this manager first), responsibility (ownership), stakeholder.

Keep laughing at them. Once you become a manager, you'll have favorite words whether you want it or not – it is useless to resist the dynamics inherent to your situation. My favorite word is "dynamics". Its connotations are deep and its applicability wide – heartily recommended.

***

Managers get to do a lot of knowledge-free decision making, which necessarily drives them insane. Here's how the manager's bipolar disorder works.

During maniacal periods, the manager is the only one who can do anything around here. This frequently happens when the manager is under external pressure, and he feels that control is slipping out of his hands. He's trying to compensate for his lack of knowledge by immense concentration and willpower. (Managers always have ample emergency supplies of both.) "Concentration" translates to an ability to derive general and far-reaching conclusions from insignificant details, then "willpower" translates to aggression.

Then depression follows: "Don't bother me with details". This results partly from exhaustion quickly arrived at during the mania (especially if reports were wise enough to not argue with the manager, letting his efforts defeat their own purpose.) The manager has delivered his trademark concentration and willpower, so he no longer feels guilty on that front. However, he's overwhelmed by information and (rightly) feels that he doesn't know what's going on. He decides it is none of his business and concentrates on the Big Picture (does nothing). Usually, the cycle repeats upon a new wave of external pressure.

Awareness of the management cycle on the part of the manager himself can help soften the cycle but not eliminate it. It is up to reports to apply counter-cycle measures by scheduling most work into depression periods when it is least disrupted. Special attention must be given to long-term projects, frequently characterized by a prolonged depressive apathy period at the beginning followed by a period of maniacal frenzy lasting until the end.

***

There's a naive brain model in the spirit of "the brain has a reptilian part, a mammal part and a human part". For example, if a student fails to answer a question in an oral exam with his human brain, the mammal brain feels bad about it and complains to the reptilian brain. The reptilian brain then cheerfully replies, "Who's causing the trouble? Oh, that little guy behind the table? Not to worry – I'll kill him". The higher brains then supposedly suppress this – "What do you think this is, reptile – Jurassic Park?", and the tension is translated into sweating.

The manager is the team's reptilian brain; he doesn't know enough to do real thinking, but he's good at "taking responsibility", bargaining, fighting, socializing, etc. A manager doesn't know how to implement the feature, except for suspecting, based on experience, that it will conflict with a couple other features and it will take a week or three for the whole thing to stabilize (with him taking the heat when things break during those weeks). Therefore, instead of technical advice (which he might be otherwise qualified to give), he'll propose something which solves the problem at his favorite social plane:

  • Prioritize the feature away, delaying the implementation until forever
  • Negotiate the feature away, by talking to whoever wants it out of it for something in return
  • Redefine the feature away, by reducing the scope to the few scenarios which absolutely can't be ignored

Do not drag management into anything you actually want solved. Presented with a question, the manager will answer it by killing the little guy behind the table, so only go to him if you really want that. And once awakened, he might take a lot of sweat to suppress. (If he's really a programmer posing as a quasi-manager, the chances for an actual solution can be even worse: he's more likely to feel guilty about his managerial ability and use the opportunity to exercise and develop that ability, instead of using his technical ability to think about the issue.)

***

There's this quote from The Mythical Man-Month, supposedly by a pessimistic manager:

All programmers are optimists. Perhaps this modern sorcery especially attracts those who believe in happy endings and fairy god-mothers. Perhaps the hundreds of nitty frustrations drive away all but those who habitually focus on the end goal. Perhaps it is merely that computers are young, programmers are younger, and the young are always optimists. But however the selection process works, the result is indisputable: "This time it will surely run," or "I just found the last bug."

This is backwards. In reality, programmers are the more pessimistic people. Perhaps it's because experience teaches programmers that programs always have bugs while teaching managers that programs always ship. Perhaps it's because the programmer is the one with the actual knowledge, and the ignorant are always optimists. But however the selection process works, how many programmers have you seen saying "it will never work" and how many managers?

A programmer might be more optimistic locally, hoping in vain to have fixed this one piece of code where he has the illusion of complete understanding. However, it is invariably the manager who believes that everything will work out. A programmer can't really believe that because there are so many things nobody even understands that are yet to be faced.

But the manager is used to knowing little and understanding less, and thus has learned to translate uncertainty to optimism. In fact a programmer can learn it, too, in the areas which are of little interest to him. I know a programmer who doesn't care about optimization and who consequently describes others' efforts to fit a program into a given performance budget as doomed to success: "It runs at the word of command" – a programmer's expression of the managerial worldview worthy of a seasoned manager.

***

We don't know how to test for programming ability. The best tech companies spend 5 to 10 interviews to solidly confirm that the candidate knows what is taught during the first 1.5 years of an undergraduate CS curriculum. Other processes measure less accurately by asking less relevant questions; the inaccuracy is somewhat ameliorated by the lack of precision – the non-uniform quirks of interviewers and general randomness of the process eliminate biases, causing all kinds of good candidates to sneak through the gates.

It is well known that we can't find out during the interview what we inevitably find out once someone gets the job, but what are the corollaries? Here's one I've heard a lot: trust recommendations more than interviews. Here's another I haven't: let others interview and get the new hires, then steal the best.

(Objection: the first recommendation is good for the company while the other is only good for the manager following it. Well, "competition between managers over team members isn't a zero-sum game – it improves teamwork across the company", this one we weasel out of in a snap.)

***

We have a VIP club at work called Bottleneck, its principal activity being the collective purchasing and consumption of alcoholic beverages. The club operates during work hours (regular meetings held on Thursdays, emergency meetings scheduled upon arrival of packages from abroad). Our room being the headquarters, I'm naturally a member. By now the club has shifted to high-end liquors at prices causing the consumption to contract to a sip per cup of coffee, but originally it was affordable to actually drink.

I noticed that minor alcoholic intoxication has a notable impact on my programming ability. I can still lay my hands on the right variable, but by the time I do I forget what you do next with these things. There's that handy member somewhere in it, dot something, but dot what?

However, managerial ability is not affected. Things I can do just fine following a meeting of the Bottleneck club include progress monitoring, planning, risk assessment, general technical advice, and requirement negotiation. Now that I think of it, perhaps the managerial functions are affected for the better.

Duck (takeoff 2)

Another duck taking off from the water, much bigger than the previous one. There was also a sitting duck, but I didn't like its glazing.

Fish (front)

The front part of a fish jumping out of the floor or swimming out of a wall, depending.


Getting the call stack without a frame pointer

Everything I know about getting the current call stack of C or C++ programs, including ones compiled with -fomit-frame-pointer or an equivalent, with or without a debugger. Hardly entertaining except for those specifically wishing to lay their hands on call stacks.

We'll start with trivia people sharing my unhealthy interest in call stacks probably know. There are two contexts where you might want to get a call stack:

  1. Look at the call stack in a debugger to see what the program is doing.
  2. Get a representation of the call stack inside the program itself. For example, a memory profiler might want to attach the call stack identifying the context where the allocation is made to each memory block to see who allocates the most.

Sometimes (2) can be implemented using (1) by running a debugger programmatically and asking it for the current call stack, and sometimes it can't (too much communication overhead, or no debugger available – for example, a program stripped of debugging information and running on a standalone embedded board).

The straightforward way of getting the current call stack, used both by debuggers and by programs curious about their own stacks, relies on the frame pointer. The idea is that a machine register is reserved for keeping a pointer into the stack, called the frame pointer, and every function is compiled to do the following in its prologue:

  • Push the return address to the stack
  • Push the frame pointer to the stack
  • Save the address of the resulting two-pointer structure to the frame pointer register

This creates a linked list on the stack, with every node keeping a return address – this list is the call stack (and this is why debuggers show the points of return from function calls and not the points of call, a bit annoyingly for function calls spanning many source code lines – in fact we get the return stack and not the call stack). Here's how you get this list from within a program:

struct stack_frame {
  struct stack_frame* next;
  void* ret;
};
int get_call_stack(void** retaddrs, int max_size) {
  /* x86/gcc-specific: this tells gcc that the fp
     variable should be an alias to the %ebp register
     which keeps the frame pointer */
  register struct stack_frame* fp asm("ebp");
  /* the rest just walks through the linked list */
  struct stack_frame* frame = fp;
  int i = 0;
  while(frame) {
    if(i < max_size) {
      retaddrs[i++] = frame->ret;
    }
    frame = frame->next;
  }
  return i;
}

The code for getting the list head pointer depends on the platform; the list structure itself is common to many machines and compilers. The return addresses may be converted to function names and source line numbers with addr2line -f or similar. When the program can't access its own debug info during its execution, as in the case of embedded devices, the translation from addresses to names will be a separate offline step.

The whole frame pointer business is fairly widely documented, with a bunch of source code available centered around getting the call stack that way. I think the GNU backtrace function and the Windows StackWalk64 function, which these days are probably a better alternative than code snippets like the one above, also use this linked list when available.

Now, what happens if the compiler is told to avoid generating the code maintaining the frame pointer (-fomit-frame-pointer for gcc, /Oy for MSVC), or not told to override its default behavior of not generating such code (-ga for Green Hills C++)?

Admittedly it doesn't appear to be such a good idea to make debugging that much harder in order to save a few instructions. However, there are reasons to do so. For example, one common consequence of Software Design is lots of functions doing little or nothing except for delegating their work to another function. Without frame pointer maintenance, such a call is just a jump instruction – your callee function will return directly to the address saved by your caller. With frame pointers, you need a whole prologue and epilogue here. Anyway, we won't discuss the benefits of frame pointer omission, since measuring the overhead for your particular code will be more reliable than such a discussion.

Compiling without frame pointers hits code trying to obtain its own context harder than it hits debuggers, because debuggers don't really need frame pointers and only (sometimes) rely on them for simplicity of implementation. Given a return address, a debugger can tell:

  1. Which function it belongs to (unlike the program itself, a debugger is necessarily supposed to have access to the symbol table)
  2. Where that function keeps the return address (the compiler knows that, so it can tell the debugger)
  3. The amount by which the function decrements the stack pointer, assuming the stack grows downwards – again something the compiler knows. Now that the debugger knows the previous return address and the previous stack pointer, it can go back to step 1.

So a debugger can do just fine without frame pointers as long as the compiler gives it enough information about the layout of the stack. I've been debugging without frame pointers for a long time with the Green Hills MULTI debugger which uses a proprietary debug info format. More recently the DWARF format, gcc and gdb seem to have caught up and now programs compiled with -fasynchronous-unwind-tables -fomit-frame-pointer are debuggable with gdb. The information generated by -fasynchronous-unwind-tables seems to go to a separate ELF section called .eh_frame_hdr.

Not only will gdb use .eh_frame_hdr, but the GNU backtrace function appears to be using it as well (it doesn't work under -fomit-frame-pointer but apparently does work when you add -fasynchronous-unwind-tables – although the docs explicitly say: "frame pointer elimination will stop backtrace from interpreting the stack contents correctly"). Nor is this section stripped from the program – it's not implemented as a "normal" debug information section but as an allocated data section, so it's always available to a program (in particular, to the backtrace function).

So under gcc, all call stack problems seem to be solved – unless you trust the docs (!?), or unless some code isn't compiled with the right flags because of not being up to date or someone being too greedy to allocate space for a debug info section. Outside gcc, or more precisely DWARF, I don't think a stripped program can access such debug info.

Is there a way to get a call stack without a frame pointer, without a debugger and without debug info?

For years I was sure that the answer was "no", hence some things would only work under a separate build mode – just like the release build but with frame pointers. Then one time the Green Hills debugger failed to list the call stack for some reason, as it sometimes does, but this time we really wanted to decipher it. And we figured that we could in fact do the same thing the debugger does, except we'd understand from the assembly code what the debugger usually understands from debug information.

Specifically, to understand where the return address is kept and by what amount the stack pointer is decremented, you need to find the instructions doing (or undoing) these things in the prologue (or the epilogue) of the function. This worked, but due to either inertia or stupidity it took me months to realize that you can write code doing this. Anyway, here's how it works on a 32b MIPS processor under the Green Hills compiler. The prologue code of a function will contain instructions like these:

main+0: 27bdffe8 addiu sp, sp, -0x18
main+4: afbf0014 sw    r31, 0x14(sp)

The add immediate instruction decrements the stack pointer, and the store word instruction saves the return address from the register where it's placed by the caller to some place on the stack. The high 16 bits of these instructions don't depend on the function, encoding the "addiu sp, sp" and the "sw r31, …(sp)" parts. The low 16 bits encode a signed offset. So we can obtain the call stack from our code disassembling it thusly:

/* get previous stack pointer and return address
   given the current ones */
int get_prev_sp_ra(void** prev_sp, void** prev_ra,
                   void* sp, void* ra) {
  unsigned* wra = (unsigned*)ra;
  int spofft;
  /* scan towards the beginning of the function -
     addiu sp,sp,spofft should be the first command */
  while((*wra >> 16) != 0x27bd) {
    /* test for "scanned too much" elided */
    wra--;
  }
  spofft = ((int)*wra << 16) >> 16; /* sign-extend */
  *prev_sp = (char*)sp - spofft;
  /* now scan forward for sw r31,raofft(sp) */
  while(wra < (unsigned*)ra) {
    if((*wra >> 16) == 0xafbf) {
      int raofft = ((int)*wra << 16) >> 16; /* sign */
      *prev_ra = *(void**)((char*)sp + raofft);
      return 1;
    }
    wra++;
  }
  return 0; /* failed to find where ra is saved */
}

The call stack will then be produced by the following loop:

int get_call_stack_no_fp(void** retaddrs, int max_size) {
  void* sp = get_sp(); /* stack pointer register */
  void* ra = get_ra(); /* return address register */
  /* adjust sp by the offset by which this function
     has just decremented it */
  int* funcbase = (int*)(int)&get_call_stack_no_fp;
  /* funcbase points to an addiu sp,sp,spofft command */
  int spofft = (*funcbase << 16) >> 16; /* 16 LSBs */
  int i=0;
  sp = (char*)sp-spofft;
  do {
    if(i < max_size) {
      retaddrs[i++] = ra;
    }
  }
  while(get_prev_sp_ra(&sp,&ra,sp,ra));
  return i; /* stack size */
}

get_sp and get_ra access registers so they must be assembly macros, which in the case of Green Hills can be spelled like this:

asm void* get_ra() {
  move $v0, $ra
}
asm void* get_sp() {
  move $v0, $sp
}

Under MIPS32 and Green Hills, this code seems to be giving decent call stacks except for the inevitable omission of function calls done without saving the return address; the most common case of those – simple delegating functions – was already mentioned above. If f calls g which does (almost) nothing except for calling h, and g doesn't bother to save the return address to the stack, having h return directly to f, then the call stack will contain f and h but not g. Not much of a problem and sometimes even an advantage as far as I'm concerned, since g is rarely interesting. Also, you can get this problem regardless of the lack of a frame pointer – for example, gcc -O3 on x86 will not maintain the call stack accurately even without -fomit-frame-pointer, generating the following ridiculous code:

pushl   %ebp
movl    %esp, %ebp
popl    %ebp
; so what was the point of setting up and
; immediately destroying a stack frame?
jmp     print_trace

Now, although code like get_prev_sp_ra looking for 0x27bd admittedly is a heuristic relying on undocumented platform-specific behavior, it looks like a passable way of getting a call stack on a RISC machine like MIPS, ARM or PowerPC. What about the x86 though? Effectively we have our code partially disassembling itself here, which is not nearly as easy with the x86 (in particular, I don't think there's a way to scan backwards because of the variable encoding length; although we could look at epilogues instead of prologues just as well).

Instead of dragging in a disassembler, we can use an external program, such as, well, a debugger. This obviously defeats the purpose of being able to get the call stack without a debugger. But this purpose isn't very interesting on the x86 in the first place because there you're rarely stuck in situations where a program can't run a debugger.

The only point of the disassembling business on the x86 thus remains to deal with programs compiled without a frame pointer and without debug information making it possible to get the call stack nonetheless. I don't know if anybody has such a problem these days, now that gcc has -fasynchronous-unwind-tables – perhaps someone uses compilers which can't do this or binaries compiled without this, and perhaps the problem is extinct on the x86. For what it's worth, here's a script getting the call stack from a core file without relying on gdb's bt command but relying on its disassemble command. Usage: python bt <program> <core>. No warranty, …or FITNESS FOR A PARTICULAR PURPOSE.

And this is all I know about getting the call stack in C or C++, something users of other languages can do, in the (unlikely) absence of a library function doing just that, simply by throwing an exception, immediately catching it and using its getStackTrace method or some such.
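
To illustrate the point: in Python, for instance, you don't even need the exception trick – the standard traceback module hands you the current stack directly:

import traceback

def get_call_stack():
    # each entry describes an active frame: file, line, function, source text
    return traceback.extract_stack()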

What makes cover-up preferable to error handling

There was a Forth tutorial which I now fail to find that literally had a "crash course" right in the beginning, where you were shown how to crash a Forth interpreter. Not much of a challenge – `echo 0 @ | pforth` does the trick for me – but I liked the way of presentation: "now we've learned how to crash, no need to return to that in the future".

So, let's have a Python & Perl crash course – do something illegal and see what happens. We'll start with my favorite felony – out-of-bounds array access:

python -c 'a=(1,2,3); print "5th:",a[5]'
perl -e '@a=(1,2,3); print "5th: $a[5]\n"'

The output:

5th:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
IndexError: tuple index out of range
5th:

Python in fact crashed, telling us it didn't like the index.

Perl was more kind and rewarded our out-of-bounds index with what looks like the empty string. Being the kind of evildoer who's only further provoked by the gentle reactions of a do-gooder, what shall we do to further harass it? Well, it looks like anything makes a good index (and I mean anything: if @a=(11,22), then $a["1"]==22 and $a["xxx"]==11). But perhaps some things don't make good arrays.

python -c 'x=5; print "2nd:",x[2]'
perl -e '$x=5; print "2nd: $x[2]\n"'

Output:

2nd:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: 'int' object is unsubscriptable
2nd:

Python gives us its familiar elaborate complaints, while Perl gives us its familiar laconic empty string. Its kindness and flexibility are such that in a numeric context, it would helpfully give us the number 0 – the empty string is what we get in a string context.

Is there any way to exceed the limits of Perl's patience? What about nonsensical operations – I dunno, concatenating a hash/dictionary/map/whatever you call it and a string?

python -c 'map={1:2,3:4}; print "map+abc:",map+"abc"'
perl -e '%map=(1,2,3,4); print "map+abc: ",%map . "abc\n"'

Output:

map+abc:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict' and 'str'
map+abc: 2/8abc

Python doesn't like our operands but Perl retains its non-judgmental frame of mind. A Perl detractor could point out that silently converting %map to "2/8" (hash size/reserved space) is patently insane. A Perl aficionado could point out that Perl seems to be following Python's motto "Explicit is better than implicit" better than Python itself. In Python you can't tell the type of map at the point of usage. Perl code clearly states it's a hash with %, moreover . specifically means string concatenation (as opposed to +). So arguably you get what you asked for. Well, the one thing that is not debatable is that we still can't crash Perl.

OK, so Perl is happy with indexes which aren't and it is happy with arrays which aren't, and generally with variables of some type which aren't. What about variables that simply aren't?

python -c 'print "y:",y'
perl -e 'print "y: $y\n"'

Output:

y:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name 'y' is not defined
y:

NameError vs that hallmark of tolerance, the empty string, a helpful default value for a variable never defined.

By the way, this is how $x[5] evaluates to an empty string when x isn't an array, I think. $x[5] is unrelated to the scalar variable $x; it looks for the array variable @x in another namespace. There's no @x, so you get an empty array, which has no 5th element, so you get "". I think I understand it all, except for one thing: is there any way at all to disturb the divine serenity of this particular programming language?

python -c 'a=0; print 1/a'
perl -e '$a=0; print 1/$a'

This finally manages to produce:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero
Illegal division by zero at -e line 1.

The second message about "illegal" division by zero (is there any other kind?) comes from our no longer tolerant friend, making me wonder. What is so special about division by zero? Why not be consistent with one's generally calm demeanor and return something useful like "" or 0? It would be perfectly reasonable – I did it myself, or more accurately asked to have a hardware divider return zero in these cases, because there wasn't what hardware calls exception handling (having the processor jump to a handler in the middle of an instruction stream). We lived happily ever after, so what's wrong with 0?

But the real question is, what explains the stunning difference between Perl's and Python's character? Is it philosophical, "There's More Than One Way To Do It (TMTOWTDI)" vs "There should be one – and preferably only one – obvious way to do it; Although that way may not be obvious at first unless you're Dutch" (actual Perl and Python mottos, respectively)? The latter approach encourages classifying program behavior as "erroneous", where the former tends to instead assume you're knowingly doing something clever in Yet Another Possible Way, right?

Modernism vs Postmodernism, maybe, as outlined by Perl's author in "Perl, the first postmodern computer language"? "Perl is like the perfect butler. Whatever you ask Perl to do, it says `Very good, sir,' or `Very good, madam.' … Contrast that with the Modern idea of how a computer should behave. It's really rather patronizing: `I'm sorry Dave. I can't allow you to do that.'" The latter can be illustrated by Python's way of answering the wish of its many users to use braces rather than indentation for scoping:

>>> from __future__ import braces
SyntaxError: not a chance

So, "Very good, sir" vs "I can't allow you to do that". Makes sense with Python vs Perl, but what about, say, Lisp vs C++?

Lisp definitely has a "There's More Than One Way To Do It" motif in its culture. Look how many control flow facilities it has compared to Python – and on top of that people write flow macros, and generally "if you don't like the way built-in stuff works, you can fix it with macros", you know the drill. Guessing the user's intent in ambiguous situations? Perl says that it "uses the DWIM (that's "Do What I Mean") principle" for parsing, borrowing a term from Lisp environments. And yet:

(print (let ((x (make-array 3))) (aref x 5)))
*** - AREF: index 5 for #(NIL NIL NIL) is out of range
(print (let ((x 5)) (aref x 2)))
*** - AREF: argument 5 is not an array
(print (let ((m (make-hash-table))) (concatenate 'string m "def")))
*** - CONCATENATE: #S(HASH-TABLE :TEST FASTHASH-EQL) is not a SEQUENCE
(print y)
*** - EVAL: variable Y has no value
(print (/ 1 0))
*** - division by zero

5 out of 5, just like Python. Contrast that with C++ which definitely has a Bondage and Discipline culture, what with all the lengthy compiler error messages. Actually C++ would score 4 out of 5 on this test, but the test is a poor fit for statically typed languages. A more appropriate way to evaluate C++'s error handling approach would be to focus on errors only detectable at run time. The following message from The UNIX-HATERS Handbook records the reaction of a lisper upon his encounter with this approach:

Date: Mon, 8 Apr 91 11:29:56 PDT
From: Daniel Weise
To: UNIX-HATERS
Subject: From their cradle to our grave.

One reason why Unix programs are so fragile and unrobust is that C
coders are trained from infancy to make them that way. For example,
one of the first complete programs in Stroustrup’s C++ book (the
one after the “hello world” program, which, by the way, compiles
into a 300K image), is a program that performs inch-to-centimeter
and centimeter-to-inch conversion. The user indicates the unit of the
input by appending “i” for inches and “c” for centimeters. Here is
the outline of the program, written in true Unix and C style:

#include <stream.h>

main() {
  [declarations]
  cin >> x >> ch;
    ;; A design abortion.
    ;; This reads x, then reads ch.
  if (ch == 'i') [handle "i" case]
  else if (ch == 'c') [handle "c" case]
  else in = cm = 0;
    ;; That’s right, don’t report an error.
    ;; Just do something arbitrary.
[perform conversion] }

Thirteen pages later (page 31), an example is given that implements
arrays with indexes that range from n to m, instead of the usual 0 to
m. If the programmer gives an invalid index, the program just
blithely returns the first element of the array. Unix brain death forever!

You could say that the sarcasm in the Lisp-style comments proudly intermixed with C++ code is uncalled for since example programs are just that – example programs. As to the dreaded out-of-bound array access cited in the second example, well, C++ doesn't handle that to avoid run time overhead.

But the cited program didn't just ignore the range problem the way C would – it went to the trouble of checking the index and then helpfully returned the 0th element the way Perl would. Probably as part of its illustration of how in C++ you could have custom array types which aren't like C arrays. But why Postmodern Perl arrays, in a generally Disciplined language?

Well, it was 1991 and C++ exceptions were very young, likely younger than the cited example programs. (They've since aged but didn't improve to the point where most library users would be happy to have to catch them, hence many library writers aren't throwing them.)

Likewise, Perl had all the features used in the examples above before it had exceptions – or more accurately before it had them under the spotlight, if I understand this correctly. (In Perl you handle exceptions by putting code in an eval { … } block and then calls to the die() function jump to the end of that block, saving die's argument to $@ – instead of, well, dying. I think Perl had this relatively early; however it seems to only have become idiomatic after Perl 5's OO support and an option to send exception objects to $@, with people using die strictly for dying prior to that.) Perhaps Perl's helpful interpretations of obvious nonsense like $a["xxx"] aren't that helpful after all, but what would you rather have it do – die()?

AFAIK Python had exceptions under the spotlight from the beginning – although similarly to Perl it had exception strings before it had exception classes. And in fact it does its best to adhere to its "Errors should never pass silently" philosophy, the few deviations coming to mind having to do with Boolean contexts – the falsehood of empty strings, None and 0 together with None!=False/0==False/1==True/2!=True and similar gateways to pornographic programming.

Lisp has conditions and restarts which blow exceptions out of the water, hence its willingness to report errors isn't surprising. However, it gained these features in the 80s; what did previous dialects do? ERRORSET, which is similar to a try/catch block, appears to predate the current error handling system, but it doesn't seem to have been there from the very beginning, either. I'm not familiar with the Lisp fossil record, but there's a function for indexing lists called NTH which returns NIL given an out-of-bounds index. Lists definitely predate arrays, so I assume NTH likewise predates AREF which complains given a bad index. Perhaps NTH doesn't complain about bad indexes since it also predates ERRORSET and any other form of exception handling?

The pattern seems to be: if the language has exceptions, most of its features and libraries handle errors. If it doesn't, they don't; errors are covered up.

(Although I won't be surprised if I'm wrong about the Lisp history part because Lisp is generally much more thoughtful than I am  – just look at all the trouble Interlisp, by now an ancient dialect, went into in order to figure out whether a user wants to get an opportunity to fix an error manually or would rather have the program silently return to the topmost ERRORSET.)

awk and early PHP lack exceptions and are happy with out-of-bound array access. Java and Ruby have exceptions and you'll get one upon such access. It isn't just the culture. Or is it? Perl is PHP's father and awk is PHP's grandfather. *sh and make, which, like Perl, produce an empty string from $nosuchvar, aren't good examples, either – sh is Perl's mother and make is Perl's sister. Is it really the Unix lineage that is at fault as suggested by a dated message to a late mailing list?

Here's JavaScript, an offspring of Lisp:

>>> [1,2,3][5]+5
NaN
>>> [1,2,3][5]+"abc"
"undefinedabc"

I think this definitely rivals Perl. Apparently it's not the lineage that is the problem – and JavaScript didn't have exceptions during its first 2 years.

The thing is, errors are exceedingly gnarly to handle without exceptions. Unless you know what to do at the point where the error is detected, and you almost never know what to do at the point where the error is detected, you need to propagate a description of the error up the call chain. The higher a function sits up the call chain, the more kinds of errors it will have to propagate upwards.

(With a C++ background it can look like the big problem doesn't come from function calls but from operators and expressions which syntactically have nowhere to send their complaints. But a dynamic language could have those expressions evaluate to a special Error object just as easily as it can produce "" or "undefined". What this wouldn't solve is the need to clutter control flow somewhere down the road when deciding which control path to take or what should go to files or windows, now that every variable can have the value Error tainting all computations in its path.)

Different errors carry different meta-data with them – propagating error codes alone ought to be punishable by death. What good is a "no such file" error if I don't know which file it is? What good is a "network layer malfunction" error if I can't tell that in fact it's a "no such file" error at the network layer because a higher level lumps all error codes from a lower level into one code? (It always lumps them together since the layers have different lifecycles and the higher level can't be bothered to update its error code list every time the lower level does.)

Different meta-data means, importantly for static languages, different types of output objects from callee functions depending on the error, which can only be handled in an ugly fashion. But even if there's no such problem, you have to test every function call for errors. If a function didn't originally return error information but now does, you have to change all points of call.
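
To make the clutter concrete, here's a small made-up example in the error-code style (in Python only because it's compact, with the parsing functions invented for the illustration) – every function returns an (error, value) pair, and every caller checks and forwards:

# every function returns (error, value); None means "no error"
def parse_int(s):
    if s.isdigit():
        return None, int(s)
    return "not a number: %r" % s, None

def parse_point(s):
    parts = s.split(",")
    if len(parts) != 2:
        return "expected 'x,y', got: %r" % s, None
    err, x = parse_int(parts[0])
    if err:
        return err, None # check and forward...
    err, y = parse_int(parts[1])
    if err:
        return err, None # ...and again, at every single call
    return None, (x, y)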

Nobody ever does it.

To me, the fact that apparently all programming languages without exception handling cover up errors at the language level sets an upper bound on what to expect elsewhere.

A designer of a successful programming language, whatever you think of that language, is definitely an above average programmer, actually I think one that can safely be assumed to occupy the top tenth no matter how you rate programming ability (and really, how do you?). Moreover, the language designer is also in the top tenth if programmers are sorted by the extent of importance they assign to their work and motivation to get things right – because the problems are fundamental, because of the ego trip, because of everything.

And still, once decision or chance leads them into a situation where the language has no exceptions, they allow themselves to have errors covered up throughout their cherished system – despite the obvious pain it causes them, as evident from the need for rationalizations ranging from runtime overhead (as if you couldn't have an explicit unsafe subset for that – see C#) to the Postmodern Butler argument ("Fuck you plenty? Very good, sir!").

What does this leave us to expect from average programs? The ones written in the order of execution, top to bottom, with the same intellectual effort that goes into driving from point A to point B? The ones not exposed to the kind of massive testing/review that could exterminate bugs in a flophouse?

This is why nothing inspires my trust in a program like a tendency to occasionally wet its pants with an error call stack. Leaving some exceptions uncaught proves the programmers are somewhat sloppy but I wouldn't guess otherwise anyway. What's more important is that because of exceptions along the road, the program most likely won't make it to the code where shooting me in the foot is implemented. And if a program never spills its guts this way, especially if it's the kind of program with lots of I/O going on, I'm certain that under the hood, it's busy silencing desperate screams: where there are no exceptions, there must be cover-up.

The C++ Sucks Series: petrifying functions

Your IT department intends to upgrade your OS and gives a developer a new image to play with. The developer is generally satisfied, except there's this one program that mysteriously dumps core. Someone thoughtfully blames differences in system libraries.

Alternative prelude: you have this program and you're working on a new version. Being generally satisfied with the updates, you send the code overseas. They build it and it mysteriously dumps core. Someone thoughtfully blames differences in the compiler version.

Whatever the prelude, you open the core dump with `gdb app core` and gdb says:

#0  0x080484c9 in main (argc=Cannot access memory at address 0xbf3e7a8c) at main.cpp:4
4    int main(int argc, char** argv)
(gdb)

Check out the garbage near "argc=" – if it ain't printing garbage, it ain't a C++ debugger. Anyway, it looks like the program didn't even enter main. An alert C++ hater will immediately suspect that the flying circus happening in C++ before main could be at fault, but in this case it isn't. In fact, a program can be similarly petrified by the prospect of entering any function, not necessarily main. It's main where it crashes in our example because the example is small; here's the source code:

#include <stdio.h>
#include "app.h"

int main(int argc, char** argv)
{
  if(argc != 2) {
    printf("please specify a profilen");
    return 1;
  }
  const char* profile = argv[1];
  Application app(profile);
  app.mainLoop();
}

On your machine, you run the program without any arguments and sure enough, it says "please specify a profile"; on this other machine, it just dumps core. Hmmm.

Now, I won't argue that C++ isn't a high-level object-oriented programming language since every book on the subject is careful to point out the opposite. Instead I'll argue that you can't get a first-rate user experience with this high-level object-oriented programming language if you don't also know assembly. And with the first-rate experience being the living hell that it is, few would willingly opt for a second-rate option.

For example, nothing at the source code level can explain how a program is so shocked by the necessity of running main that it dumps a core in its pants. On the other hand, here's what we get at the assembly level:

(gdb) p $pc
$1 = (void (*)(void)) 0x80484c9 <main+20>
(gdb) disass $pc
Dump of assembler code for function main:
0x080484b5 <main+0>:    lea    0x4(%esp),%ecx
0x080484b9 <main+4>:    and    $0xfffffff0,%esp
0x080484bc <main+7>:    pushl  -0x4(%ecx)
0x080484bf <main+10>:    push   %ebp
0x080484c0 <main+11>:    mov    %esp,%ebp
0x080484c2 <main+13>:    push   %ecx
0x080484c3 <main+14>:    sub    $0xa00024,%esp
0x080484c9 <main+20>:    mov    %ecx,-0xa0001c(%ebp)
# we don't care about code past $pc -
# a screenful of assembly elided

What this says is that the offending instruction is at the address main+20. As you'd expect with a Segmentation fault or a Bus error core dump, this points to an instruction accessing memory, specifically, the stack.

BTW I don't really know x86 assembly, but I can still read it thusly: "mov" can't just mean the tame RISC "move between registers" thing because then we wouldn't crash, so one operand must spell a memory address. Without remembering the source/destination order of the GNU assembler (which AFAIK is the opposite of the usual), I can tell that it's the second operand that is the memory operand because there's an integer constant, which must mean an offset or something – and why would you need a constant to specify a register operand? Furthermore, I happen to remember that %ebp is the frame pointer register, which means that it points into the stack; however, I could figure it out from a previous instruction at main+11, which moves %esp [ought to be the stack pointer] to %ebp (or vice versa, as you could think without knowing the GNU operand ordering – but it would still mean that %ebp points into the stack).

Which goes to show that you can read assembly while operating from a knowledge base that is not very dense, a way of saying "without really knowing what you're doing" – try that with C++ library code; but I digress. Now, why would we fail to access the stack? Could it have to do with the fact that we apparently access it with the offset -0xa0001c, which ought to be unusually large? Let's have a look at the local variables, hoping that we can figure out the size of the stack main needs from their sizes. (Of course if the function used a Matrix class of the sort where the matrix is kept by value right there in a flat member array, looking at the named local variables mentioned in the program wouldn't be enough since the temporaries returned by overloaded operators would also have to be taken into account; luckily this isn't the case.)

(gdb) info locals
# if it ain't printing garbage, it ain't a C++ debugger:
profile = 0xb7fd9870 "U211?WVS??207"
app = Cannot access memory at address 0xbf3e7a98

We got two local variables; at least one must be huge then. (It can be worse in real life, main functions being perhaps the worst offenders, as many people are too arrogant to start with an Application class. Instead they have an InputParser and an OutputProducer and a Processor, which they proudly use in a neat 5-line main function – why wrap that in a class, 2 files in C++-land? Then they add an InputValidator, an OutputFormatConfigurator and a ProfileLoader, then less sophisticated people gradually add 20 to 100 locals for doing things right there in main, and then nobody wants to refactor the mess because of all the local variables you'd have to pass around; whereas an Application class with two hundred members, while disgusting, at least makes helper functions easy. But I digress again.)

(gdb) p sizeof profile
$2 = 4
(gdb) p sizeof app
$3 = 10485768

"10485768". The trouble with C++ debuggers is that they routinely print so much garbage due to memory corruption, debug information inadequacy and plain stupidity that their users are accustomed to automatically ignore most of their output without giving it much thought. In particular, large numbers with no apparent regularity in their digits are to a C++ programmer what "viagra" is to a spam filter: a sure clue that something was overwritten somewhere and the number shouldn't be trusted (I rarely do pair programming but I do lots of pair debugging and people explicitly shared this spam filtering heuristic with me).

However, in this case overwriting is unlikely since a sizeof is a compile time constant stored in the debug information and not in the program memory. We can see that the number will "make more sense" in hexadecimal (which is why hex is generally a good thing to look at before ignoring "garbage"):

(gdb) p /x sizeof app
$4 = 0xa00008

…Which is similar to our offset value, and confirms that we've been debugging a plain and simple stack overflow. Which would be easy to see in the case of a recursive function, or if the program crashed, say, in an attempt to access a large local array. However, in C++ it will crash near the beginning of a function, long before the offending local variable is even declared, in an attempt to push the frame pointer or some such; I think I also saw it crash in innocent-looking places further down the road, but I can't reproduce it.

Now we must find out which member of the Application class is the huge one, which is lots of fun when members are plentiful and deeply nested, which, with a typical Application class, they are. Some languages have reflection given which we could traverse the member tree automatically; incidentally, most of those languages don't dump core though. Anyway, in our case finding the problem is easy because I've made the example small.

(I also tried to make it ridiculous – do you tend to ridicule pedestrian code, including your own, sometimes as you type? Few do and the scarcity makes them very dear to me.)

class Application
{
 public:
  Application(const char* profile);
  void mainLoop();
 private:
  static const int MAX_BUF_SIZE = 1024;
  static const int MAX_PROF = 1024*10;
  const char* _profPath;
  char _parseBuf[MAX_BUF_SIZE][MAX_PROF];
  Profile* _profile;
};
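For the record, the sizeof arithmetic checks out, assuming a 32-bit build with 4-byte pointers (which is what the addresses above suggest): _parseBuf is 1024*10240 = 10,485,760 = 0xA00000 bytes, and _profPath plus _profile add 8 more, giving the 0xa00008 that gdb printed.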

This shows that it's _parseBuf that's causing the problem. This also answers the question of an alert C++ apologist regarding all of the above not being special to C++ but also relevant to C (when faced with a usability problem, C++ apologists like to ignore it and instead concentrate on assigning blame; if a problem reproduces in C, it's not C++'s fault according to their warped value systems.) Well, while one could write an equivalent C code causing a similar problem, one is unlikely to do so because C doesn't have a private keyword which to a first approximation does nothing but is advertised as an "encapsulation mechanism".

In other words, an average C programmer would have a createApplication function which would malloc an Application struct and all would be well since the huge _parseBuf wouldn't land on the stack. Of course an average C++ programmer, assuming he found someone to decipher the core dump for him as opposed to giving up on the OS upgrade or the overseas code upgrade, could also allocate the Application class dynamically, which would force him to change an unknown number of lines in the client code. Or he could change _parseBuf's type to std::vector, which would force him to change an unknown number of lines in the implementation code, depending on the nesting of function calls from Application. Alternatively the average C++ programmer could change _parseBuf to be a reference, new it in the constructor(s) and delete it in the destructor, assuming he can find someone who explains to him how to declare references to 2D arrays.
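For what it's worth, here are the two less intrusive fixes in sketch form. The class is the one from the example above; the parseBufAt helper is my invention, and whether the indexing change stays this contained depends entirely on how deeply _parseBuf's old 2D syntax is baked into the implementation:

#include <vector>

struct Profile;

// Option 1: leave the class alone and heap-allocate it at the point of call:
//   Application* app = new Application(profile);
//   app->mainLoop();
//   delete app;
//
// Option 2: leave the call sites alone and move the buffer to the heap inside
// the class, e.g. as one flat std::vector sized in the constructor:
class Application
{
 public:
  Application(const char* profile)
    : _profPath(profile),
      _parseBuf(MAX_BUF_SIZE*MAX_PROF), // one big heap block, not the stack
      _profile(0)
  {}
  void mainLoop();
 private:
  static const int MAX_BUF_SIZE = 1024;
  static const int MAX_PROF = 1024*10;
  // every _parseBuf[i][j] in the implementation becomes parseBufAt(i,j):
  char& parseBufAt(int i, int j) { return _parseBuf[i*MAX_PROF + j]; }
  const char* _profPath;
  std::vector<char> _parseBuf;
  Profile* _profile;
};

The new/delete bookkeeping of option 1 and the indexing helper of option 2 are exactly the "unknown number of lines" the previous paragraph is talking about.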

However, suppose you don't want to change code but instead would like to make old code run on the new machine – a perfectly legitimate desire independently of the quality of the code and its source language. The way to do it under Linux/tcsh is:

unlimit stacksize

Once this is done, the program should no longer dump core. `limit stacksize` would show you the original limit, which AFAIK will differ across Linux installations and sometimes will depend on the user (say, if you ssh to someone's desktop, you can get a smaller default stacksize limit and won't be able to run the wretched program). For example, on my wubi installation (Ubuntu for technophobes like myself who have a Windows machine, want a Linux, and hate the idea of fiddling with partitions), `limit stacksize` reports the value of 8M.

Which, as we've just seen, is tiny.
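And if you'd rather look at the limit from inside a program than from a tcsh prompt, the calls behind limit/unlimit are getrlimit and setrlimit from sys/resource.h. A minimal sketch, my addition rather than part of the original walkthrough (the practical fix is still the shell command above, typically run before starting the real program):

#include <sys/resource.h>
#include <cstdio>

int main()
{
  struct rlimit rl;
  if(getrlimit(RLIMIT_STACK, &rl) != 0) {
    perror("getrlimit");
    return 1;
  }
  printf("stack soft limit: %lu, hard limit: %lu\n",
         (unsigned long)rl.rlim_cur, (unsigned long)rl.rlim_max);
  rl.rlim_cur = rl.rlim_max; // ask for as much as the hard limit allows
  if(setrlimit(RLIMIT_STACK, &rl) != 0)
    perror("setrlimit");
  return 0;
}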