Yeah, naming conventions. Looks like my brain won't do any better today; those 5 drafts will have to wait. If you aren't in a mood for a trivial subject, skip this.

I think that the best naming convention out there is the Lisp one: case-insensitive-dash-separated. It just doesn't get any better:

  • You never have to hit Shift or Caps Lock in the middle of a name, which makes typing easier. This is especially important for names because you use auto-completion with names. Auto-completion requires you to press things like Ctrl-Space or Alt-Slash or Ctrl-P. Together with another Shift needed for the name to be completed correctly, auto-completion is much more likely to cause repetitive strain injury.
  • You never have to think about the case. Figuring out the case of a letter in a case-sensitive naming convention can be non-trivial; more on that later.
  • The dash-as-a-separator-convention is used in English, so your names look natural.

Unfortunately, most languages use C-style identifiers for names, the dreaded [A-Za-z_][A-Za-z_0-9]*, because their infix parsers can't tell a dash from a minus. So you can't use this convention.
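To make the dash-versus-minus problem concrete, here is a minimal sketch in Python. The pattern is the one quoted above; the tokenizer is a toy of my own making, not any real language's lexer, and the names is_c_identifier and tokenize are made up for the illustration.

```python
import re

# The C-style identifier pattern quoted in the text.
C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")

def is_c_identifier(name):
    """Return True if the whole string is a single C-style identifier."""
    return C_IDENTIFIER.fullmatch(name) is not None

def tokenize(expression):
    """Toy tokenizer: identifiers first, then any other non-space character."""
    return re.findall(r"[A-Za-z_][A-Za-z_0-9]*|\S", expression)

print(is_c_identifier("open_tcp_socket"))   # True
print(is_c_identifier("open-tcp-socket"))   # False
print(tokenize("open-tcp-socket"))          # ['open', '-', 'tcp', '-', 'socket']
```

An infix language sees the Lisp-style name as two subtractions, which is why the convention is off the table there.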

This leads to two problems:

  1. How do we separate between subsequent words in an identifier?
  2. When do we capitalize letters?

Of course, we could use a lowercase_underscore_separated convention. It would solve both problems in a simple way, having all the benefits of the Lisp convention except for the no-Shifts-and-Caps-Locks part. But (1) Caps Lock is available for capital letters, but not for underscore, and Shift is less healthy for your hands, and (2) if we have case sensitivity in our language, we'll of course use it, won't we? OK, let's kill those underscores.

There are two anti-underscore schools: alllowercase and CamelCase. alllowercase looks lame – it makes it easy to know when to capitalize letters (never), but chooses to ignore the word separation problem completely. I used to sneer at it. However, it has two huge benefits: it's very typing-friendly, and it discourages the use of long names. Long names, people, are a frigging nightmare.

HaveYouEverSeenANameTakingHalfAScreen? This is awful. Awful!! I can't lock my eyes on the damned thing. I can only focus on a tiny part of it. My eyes nervously jump around the line, which mentions the moronic mega-identifier twice (at both parts of an assignment). I'm looking for differences, small differences in the names… You know, it could be BlahBlahABlah on the left and BlahBlahBBlah on the right… AAAARGH!

Reading this kind of code is pure mental pain. I prefer mental pain to physical pain on any day, and that's why I'm in the software industry, but still, this sucks. The good news is that alllowercasenametakinghalfascreen is so ridiculous that even the most clueless pseudo-orderly person won't emit it.

Now, CamelCase, which is basically the winner, because it's used in all major languages and libraries, is probably the worst possible naming convention. It fails to solve both problems created by the lack of a good word separator in the A-Za-z0-9 languages:

  1. You don't really know when one word ends and the next word starts.
  2. You don't really know when a letter should be capitalized.

The problems of camel case come from using capital letters for word separation. This interferes with the other uses of case in natural language. The problems are amplified by the brilliant idea to assign even more semantic payload to case: functionsLookLikeThis, but ClassesLookLikeThis, etc. Let's look at some examples.

English has words like TCP, DNA and WTF. Should a TCP socket class be called TCPSocket or TCPsocket? What about a TCPIPSocket? What if we need a tcpOpen method – should we call it TCPOpen, like a class, to preserve the natural case of an acronym, or should it be TCPopen, so that the lowercase "o" conveys the fact that it's a function?

Oh, I know, it should be "openTCP"! No, no, why are you using "openTcp" – this is ugliness for its own sake! The only important thing is to get the first letter of a name right, and then you can use natural capitalization! Unless, of course, it's "openTCPIPSocket", and then we have a problem again. "openTcpIpSocket"?.. Some people just can't handle it and resort to underscores: openTCP_IPsocket, open_TCP_IP_socket… It's no use. It's ugly no matter what you do.
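Here's a rough way to see the ambiguity mechanically. The splitter below is a naive sketch of my own (split_camel_case is a made-up name, and the regular expression is just one plausible heuristic): it treats a run of capitals as an acronym unless the last capital starts a lowercase word.

```python
import re

def split_camel_case(name):
    """Naive word splitter for CamelCase names (a heuristic, not a standard)."""
    # 1. a run of capitals not followed by a lowercase letter (an acronym)
    # 2. an optionally capitalized lowercase word
    # 3. a run of digits
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", name)

print(split_camel_case("openTcpIpSocket"))  # ['open', 'Tcp', 'Ip', 'Socket']
print(split_camel_case("openTCPIPSocket"))  # ['open', 'TCPIP', 'Socket']
print(split_camel_case("TCPsocket"))        # ['TC', 'Psocket']
```

The ugly spelling is the only one that splits back into the words the author meant. With natural capitalization, TCP and IP fuse into one word, and TCPsocket comes out as nonsense. You would need a dictionary of acronyms to do better.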

Capital letters coming from the natural language, like those in acronyms and names of people, are the smaller part of the problem – Tcp looks ugly, but you know what it means. The other part of the problem is the capital letters coming from formal languages, such as mathematical notation.

For example, in computer vision it's common to denote 3D coordinates with uppercase X,Y,Z, and 2D coordinates with lowercase x,y. In a case-sensitive language, it's damn natural to follow this convention, and it works very well for a local variable X or x (including the case when you use both in the same function). It doesn't work so well when you try to name functions or classes after their arguments/coordinate systems.

Does xySomething start with a lowercase x because it's a function, or because it really accepts x values of 2D coordinates? What about xYSomething – is the Y capitalized because "y" is a word and we always capitalize the first letter of a word, or maybe the function expects Y values of 3D coordinates?

You can have a function working with 3D X coordinates and 2D y coordinates, you know. I think it's better to call it XySomething than xYSomething, because meaning is more important than convention. But did the author of the function think so, too? Of course, we can use an underscore to "clarify" the intent: something_Xy. The underscore clearly shows that the part after the underscore doesn't follow standard naming conventions, so it must be according to the computer-vision-specific convention.

So what happens is that CamelCase code deteriorates to the following state:

  1. You have ugly names like tcpIpOpen.
  2. Since you also have names like TCP_IP_Open, your real naming convention is "camel case with underscores". Which is equivalent to "any identifier that compiles".

Maybe there's a good way to augment CamelCase with rules that make it work well. I probably wouldn't know. I ought to say that I'm not that good at naming conventions in particular and in Best Practices in general. But I doubt there's a good case-sensitive naming convention out there.

Just look at the Python naming convention. You basically have everything. thingslikethis, ThingsLikeThis, things_like_this, thingsLikeThis, and they're all attached to different types of object (module, class, function, method). And every time your language entity convention disagrees with the common sense (class TCPIPSocket), you've got yourself an ugly name. And in a way, this is a good convention, because it at least tries to be consistent with the common conventions used in C, C++ and Java.

The annoying part of this is the slowdown. "Um, how should I spell this name?.." There are actual capitalization trade-offs here. Programming is almost exclusively about making decisions and choosing trade-offs. It's quite tiring, really. Nobody wants to be making some more pointless decisions on the way just for the fun of it. Maybe it's just me and the kind of people I've worked with, but I've always, always bumped into lots and lots of names which looked like a compromise. Somebody was thinking hard here. And it looks ugly anyway.


Code, data and interactive programming

"Are code and data the same thing?" I haven't conducted a poll, but I think the following answers are the most popular ones:

  • "What?" (I don't know about universal Turing machines)
  • "Sure!" (I know about universal Turing machines)

My answer is "No". I'll now try to explain briefly why my opinion makes sense in general. After that, I plan to get to the point, which is how the code/data distinction matters in interactive programming environments.

I think that, um, everything is data, at least everything I can think about. I mean, the only things that can technically enter my brain are either data or cigarette smoke, because I don't do drugs. And I hope that the effect of passive smoking is negligible, so it's just data.

In particular, code is data. But not all data is code. Code is a special kind of data, that looks like this:

  • There are lots of blocks.
  • Each block defines the value of something.
  • The blocks depend on each other, and the dependencies can be cyclic.

What this means, and of course everybody knows it, is that you can't make any sense of code in the general case. That is, the only way to compute the values defined by the blocks of code is to "run" the code – keep chasing the links between the blocks, computing the values they define as you go. You can't even prove that this process will terminate given an arbitrary bulk of code, not to mention proving its correctness.
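A toy model of this, as a sketch only: blocks are Python functions that define a value and may ask for the values of other blocks. The evaluate function and the block names are made up for the illustration. Note that in this toy the blocks take no arguments, so a cycle always means non-termination and is easy to detect; real code, where cycles are ordinary recursion with changing arguments, offers no such shortcut.

```python
def evaluate(blocks, name, in_progress=None):
    """Compute a block's value by running it and chasing its dependencies."""
    in_progress = set() if in_progress is None else in_progress
    if name in in_progress:
        raise RecursionError(f"cyclic dependency through {name!r}")
    in_progress.add(name)
    try:
        # Each block receives a 'get' function for looking up other blocks.
        return blocks[name](lambda dep: evaluate(blocks, dep, in_progress))
    finally:
        in_progress.discard(name)

blocks = {
    "width": lambda get: 4,
    "height": lambda get: 3,
    "area": lambda get: get("width") * get("height"),
}
cyclic = {
    "a": lambda get: get("b") + 1,
    "b": lambda get: get("a") + 1,
}

print(evaluate(blocks, "area"))  # 12
try:
    evaluate(cyclic, "a")
except RecursionError as error:
    print(error)
```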

Now, an image, for example, isn't code. Well, philosophically, it is, because if they show you an image and it's really ugly, you'll say "ewww". So the image was in fact a program giving instructions to your brain. The output of your brain's image compiler is the following code in the human body assembly language:

MOV R0, dev_mouth
MOV R1, disgust_string
JMP write
disgust_string: .asciz "ewww"

More interestingly, you can write a program that processes images, and this particular image may be the one that makes your program so confused that it never terminates. However, this doesn't mean that the image itself is "code". The image doesn't have interconnected blocks defining values. Even if the image is a screenshot of code.

An image is a two-dimensional array of pixels, a nice, regular data structure. You don't have to "run" it in order to do useful things with it, like computing its derivatives or e-mailing it to your friends so they'll go "ewww". And programs doing that can be proven to terminate, unless you have an infinitely slow connection to the outgoing mail server.

So what I'm saying is, code is a special kind of data, having blocks which define values and depend on each other. Does it really matter whether a particular piece of data is "code" according to this definition? I think it does. One reason is the above-mentioned fact that you can't really make sense of code. Many people realize the practical drawbacks of this, and so in many contexts, they use data-driven programming instead of the arguably more natural "code-driven" programming.

Everything you represent as "data" can be processed by many different programs, which is good. Everything you represent as "code" can only be processed by a kind of interpreter, which is bad. I'm not talking about the difficulty of parsing the syntax, which doesn't exist with Lisp or Forth, isn't a very big deal with C or Java and is a full-featured nightmare with C++ or Perl. I'm talking about the semantics – for many purposes, you can't really "understand" what you've parsed without running it, and this is common to all Turing-complete languages.

But this isn't going to be about the inability to analyze code. This is going to be about the somewhat more basic problem with code – that of blocks which point to each other. In order to explain what I mean, I'll use the example of 3 interactive programming environments – Matlab, Unix shells and Python, listed in decreasing order of quality (as interactive environments, not programming languages).

Interactive programming is the kind of programming where the stuff you define is kept around without much effort on your behalf. The other kind of programming is when you compile and run your code and it computes things and exits and they are gone. Clearly interactive programming is nicer, because it makes looking at data and trying out code on it easy.

Or so it should be; in practice, it looks like more people prefer "batch programming", so there might be some drawbacks in the actual interactive environments out there. What makes for a good interactive environment, and what spoils the fun? Let's look at some well-known gotchas with existing environments.

Some of the most upset people I've seen near computers were the ones that had a Matlab session running for months when their machine crashed. It turned out that they had a load of data there – measurements, results of heavy computations, symbolic equations and their solutions – and now it's all gone. GAAA!! This doesn't happen with batch programming that much, because you send the output of programs to persistent storage.

This problem, nasty as it may be, looks easy to fix – just have the system periodically save the workspace in the background. Perhaps Matlab already has this. I wouldn't know, because I tend to manually save things once every few minutes, since my childhood trauma of losing a file I loved. Anyway, this doesn't look like an inherent problem of interactive computing, just an awfully common implementation problem. For example, do Unix shells, by default, save the command history of each separate concurrent session you run? I think you know the answer.

Speaking of Unix shells. Ever had the pleasure of typing "rm -rf *" in the wrong directory because of command completion from history? GAAA!! OK. Ought to calm down. Let's do Fault Analysis. Why did this happen? The command string with "rm" in it is, basically, code; shell code invokes processes. This code depends on another piece of code, the one that determines the current directory. The command string doesn't have a fixed meaning – you must run getcwd in order to figure it out.
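The point that a command string has no fixed meaning can be sketched without deleting anything. The function below only resolves which path a relative argument would refer to; meaning_of and the directory names are hypothetical, chosen for the illustration.

```python
import posixpath

def meaning_of(path_argument, cwd):
    """Resolve what a path argument refers to when the command runs in cwd."""
    return posixpath.normpath(posixpath.join(cwd, path_argument))

# The same text recalled from history, run in two different directories.
print(meaning_of("build", "/home/me/scratch"))  # /home/me/scratch/build
print(meaning_of("build", "/home/me"))          # /home/me/build
# An absolute path is self-contained: the current directory doesn't matter.
print(meaning_of("/tmp/build", "/home/me"))     # /tmp/build
```

The string is the same in the first two calls; the thing it points to is not.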

The shell couldn't really warn us about the problem, either. That's because the meaning of "rm" is defined by the code at /bin/rm (or by some other program in the $PATH which happens to be called "rm"). Since the shell can't understand that code without running it, it doesn't have an estimation of the potential danger. And if the shell warned us about all commands completed from history that originally ran in a different directory than the current one, the completion would likely be more annoying than useful.

At some point I got fed up with Unix shells, and attempted to switch to a Python shell. I tried IPython and pysh, and I still use IDLE at home on my XP box. I ought to say that Python shells suck, and I don't just mean "suck as a replacement for a Unix shell", but also "suck as a way to develop Python code". The single biggest problem is that when you change your code, you must reload modules. It's unclear which modules should be reloaded, there's no way to just reload everything, and ultimately you end up with a mix of old code and new code, which does something, but you aren't quite sure what exactly.

Die-hard Pythonistas refuse to acknowledge there's a problem, though they do bend their programming style to work around it. What they do is they write all of their code in one file, and use execfile instead of import to make sure everything is indeed redefined, the Zen of Python with its love of namespaces be damned. Sure, an interesting project in Python can be just 5000 lines worth of code, but I don't like to navigate in a file that big. And sometimes you do need more lines, you know.

Another thing they do is implement __repr__ in their classes so that print displays their objects, and they'll invest a lot of effort into making eval(repr(obj)) work. The fact that eval'able strings aren't necessarily the most readable way to produce debug prints doesn't seem to bother them. Nor do the contortions they have to go through to solve the prosaic problem of making references to other objects display reasonably. One way to do it is to use dictionary keys instead of pointers, so that member object references aren't expanded into a full object description when they are printed. If you don't know why they're doing this, you'll find their code fairly puzzling.
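For readers who haven't met this style, here is a minimal sketch of both tricks. Point, Segment and the points registry are made-up names for the illustration, and the code is written for current Python 3 rather than the Python of the time.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Written so that eval(repr(p)) rebuilds an equal object.
        return f"Point({self.x!r}, {self.y!r})"

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

# Registry of named objects; others refer to entries by key, not by pointer.
points = {"origin": Point(0, 0), "corner": Point(3, 4)}

class Segment:
    def __init__(self, start_key, end_key):
        self.start_key = start_key
        self.end_key = end_key

    def __repr__(self):
        # Prints the keys only, so the referenced points are not expanded.
        return f"Segment({self.start_key!r}, {self.end_key!r})"

print(eval(repr(points["corner"])) == points["corner"])  # True
print(Segment("origin", "corner"))  # Segment('origin', 'corner')
```

The price is visible in Segment: to get at the actual points you go through the registry every time, which is the puzzling part if you don't know the motivation.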

I find the struggle to make interactive Python programming work very depressing. It reminds me of me, before the invincible idiocy of C++ crushed my spirit. People have a tendency to assume that things are well thought-out and hence should work.

We have this extremely widespread language called C++, and it's centered around type-based static binding. And it's easy to see how this could help a compiler spot errors and optimize the code. Therefore, this programming style can be a good way of writing software, if applied consistently. Ha!

We have this Python language, and several shells for it. Quite obviously, interactive programming is a good way to speed up the development cycle. Therefore, adapting our Python code for interactive programming will pay off, if we do it consistently. Ha!

But I digress. This isn't about the trusting nature of software developers, nor is it a comparison between C++ and Python, mind you. They're hard to compare, since they are very different beasts: Python is a programming language, and C++ is a karmic punishment. So I should get back to the topic of interactive programming.

Here's my opinion on the example programming environments I used in this entry.

Matlab is a great one, unless you lose your workspace. I used it for a while several times and it just never itched, and nothing went wrong.

Unix shells are good in terms of their ability to preserve your data (everything is a flat, self-contained string of bytes). I'd love them if they didn't suck so badly as programming languages. Since they do, I only use shell scripting for one-shot throwaway things, like debugging (fiddling with log files and core dumps).

Python is awful. So when I'm on Unix, I run Python processes from the shell, and sometimes use Python's reflection to make my batch programming just a bit more interactive. For example, if you have a Python function f(a,b,c), you can have your command line parser introspecting its arguments and adding the command line options -a, -b and -c.
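A minimal sketch of that trick, using the standard inspect and argparse modules. The helper names build_parser and run_from_command_line are mine, and f is a stand-in for whatever function you actually have. All option values arrive as strings here; a real tool would want type conversion, and a parameter named h would clash with argparse's built-in -h.

```python
import argparse
import inspect

def build_parser(func):
    """Create a parser with one -option per parameter of func."""
    parser = argparse.ArgumentParser(description=func.__doc__)
    for name, param in inspect.signature(func).parameters.items():
        required = param.default is inspect.Parameter.empty
        parser.add_argument(
            f"-{name}",
            required=required,
            default=None if required else param.default,
        )
    return parser

def run_from_command_line(func, argv):
    """Parse argv into keyword arguments and call func with them."""
    args = build_parser(func).parse_args(argv)
    return func(**vars(args))

def f(a, b, c="!"):
    """Hypothetical function standing in for f(a,b,c)."""
    return f"{a}-{b}{c}"

print(run_from_command_line(f, ["-a", "1", "-b", "2"]))             # 1-2!
print(run_from_command_line(f, ["-a", "1", "-b", "2", "-c", "?"]))  # 1-2?
```

In a script you would pass sys.argv[1:] as argv.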

So much for specific examples. What's the generic rule? I think it's this: pointer-happy systems can't be interactive. That's because interactive programming is about saving your data objects. And this is only useful when the current value of a preserved object is clear to you. Otherwise, you can't use the object, so what's the point?

When you have pointers in your objects, the objects aren't self-contained, and when the pointed objects are redefined, it isn't clear what should happen with the pointing objects. Should they point to the new thing or the old thing? Either answer can be counter-intuitive to you, and the whole point of interactive programming is to let you enter a state of flow, and if you scratch your head and can't easily guess what the old object means right now, you aren't in a state of flow.

In particular, pointers to code are the worst kind of pointers, because code is the most intertwined data of your program, and a pointer to a single block of code basically points to the entire code base. When an object points to an old function, and the function was redefined, and the system keeps the old definition around, you may easily get a call sequence with both the new function and the old function, which is probably ridiculous. And if you make the old object point to the new function, the function might simply fail to work with that object, and you just can't tell whether it will work or not without running it, remember?

For example, Python is a good interactive calculator, because arithmetic expressions are self-contained. Even if they do contain references to variables, it's fairly clear what happens when you change a variable – all expressions mentioning it will now recompute differently. Note that arithmetic expressions aren't Turing-complete and can't have cyclic references. Now, if you use Python's object-oriented features, then you have objects which point to their class definition which is a bunch of code pointers, and now when you reload the module defining the class, what good are your old objects?
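The old-objects problem can be reproduced in a few lines without any modules. This is a simulation: re-running a class statement stands in for reloading the module that defines it. Account is a made-up class for the illustration.

```python
class Account:
    def describe(self):
        return "old describe"

old_object = Account()

# Simulate editing the code and re-running the definition, as a reload would.
class Account:
    def describe(self):
        return "new describe"

new_object = Account()

print(old_object.describe())            # old describe
print(new_object.describe())            # new describe
print(isinstance(old_object, Account))  # False
print(type(old_object).__name__)        # Account
```

The old object still points to the old class, which has the same name as the new one but is a different thing. Python's documentation for importlib.reload describes the same behavior: existing instances keep using the old class definition.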

This is why I don't believe in Numeric Python. The whole point of using Python is to use its pointer-happy features, like classes and hairy data structures and dynamically defined functions and stuff. Numeric programming of the kind you do in Matlab tends to use flat, simple objects, which has the nice side-effect of making interactive programming work very well. If you use a numeric library inside a pointer-happy language like Python, quite soon the other libraries you use will make interactive programming annoying. So you'll either move to batch programming or suffer in denial like the die-hard Python weenie you are. Someone using Matlab will be better off, since interactive programming is more productive than batch programming, when it works well.

So at the bottom line, I think that interactive programming has limited applicability, since "general-purpose" programming environments pretty much have to be pointer-happy. That is, if a language doesn't make it very easy to create a huge intertwined mess of code and data pointers, I don't see how it can be usable outside of a fairly restricted domain. And even in the "flat" environments like Matlab or Unix, while old data objects can be useful, old commands are, and ought to be, a two-edged sword. Because they are code, and code is never self-contained and thus has a great potential to do the wrong thing when applied in a new context.

This whole claim is one of those things I'm not quite sure about. From my experience, it took me quite some time to realize which interactive features help me and which get in the way with each environment I tried. So I can't know what happens in Lisp or Smalltalk or Tcl or Excel or Emacs, in terms of (1) applicability to "general-purpose" tasks, (2) the amount of self-contained data compared to the kind with pointers, especially pointers to code and (3) the extent to which the thing is itchy and annoying at times. So comments are most welcome. In particular, if you know of an environment that, put simply, isn't more itchy than Matlab but isn't less general-purpose than Python, that would be very interesting.

The Algorithmic Virtual Machine

There's a very influential platform called the AVM, which stands for Algorithmic Virtual Machine. That's the imaginary device people use as their mental model of a computer. In particular, it's used by many people working on algorithms where performance matters. Performance matters in many different contexts, ranging from huge clusters processing astronomic amounts of data to modest applications running on pathetically weak hardware. However, I believe that the core architecture of the AVM is basically the same everywhere.

AVM application development is done using the ubiquitous AVM SDK – a whiteboard and a couple of hands for handwaving. An AVM application consists of a set of operations your algorithm needs executed. Each operation has a cost (typically one cycle, sometimes more). You can then estimate the run time of your algorithm by the clever technique of summing the cost of all operations.

These estimations are never close enough to the real run time. The definition of "close enough" varies; the quality of estimations, by and large, doesn't. That is, I claim that your handwavy AVM-derived estimation will fail to meet your precision requirements no matter what those requirements are. Apparently our tolerance for errors grows with the lack of understanding of the problem, but it never grows enough. But I'm not really sure about this theory; I'm only sure about the AVM-estimations-suck part. Here's why.

The AVM is basically this imaginary machine that runs "operations". Here are some things that real machines must do, but the AVM doesn't:

  • Fetching instructions
  • Fetching operands
  • Testing for conditions
  • Storing results

Basically, the Algorithmic Virtual Machine developers concentrate on "operations" and ignore addressing, branches, caches, buses, registers, pipelines, and all those other gadgets which are needed in order to dispatch the operation. In fact, that's how I currently distinguish between people who write software to get a job done and people who think of software as their job. "People who program" are into operations (algebra, networking, AI); "programmers" are into dispatching (programming languages, operating systems, OO). This is about mental focus rather than aptitude. I haven't noticed that people of either group are inherently less productive than the other kind.

When they're after performance, the "operations" people will naturally look for a way to reduce the number of operations. Sometimes, they'll find an algorithm with a better asymptotic complexity – O(N+M) instead of O(N*M). At other times, they'll come up with a way to perform 4*N*M operations instead of 16*N*M. Both results are very significant – if M and N are the only variables. The trouble is that you can't see all the variables if you just look at the math (as in "we want to multiply and sum all these and then compare to that"). That way, you assume that you run on the AVM and leave out all the dispatching-related variables and get the wrong answer.

Is there a way to take the cost of dispatching into account? Not really, not without implementing your algorithm and measuring its performance. However, families of machines do have related sets of heuristics that can be used to guess the cost of running on them. For example, here are a couple of heuristics that I use for SIMD machines (they are relevant elsewhere, but their relative importance may drop):

  1. Bandwidth is costly.
  2. Addressing is costly.

These heuristics are vague, and I don't see a very good way to make them formal. Perhaps there isn't any. To show that my points have any formal significance, I'd have to formally prove that there's unavoidable intrinsic cost to some things no matter how you build your hardware. And I don't know how to go about that. So what I'll do is I'll give some examples to show what I mean, and leave it there.


Consider two "algorithms" (probably too fancy a name in this context): computing dot product, and computing its partial sums (Matlab: sum(a .* b) and cumsum(a .* b)). Exactly the same amount of "operations" – N multiplications and N additions. Many people with BA, MSc and PhD degrees in CS assume that the run time is going to be the same, too. It won't, because sum only produces one output, and cumsum produces N outputs. Worse, if the input vector elements are 8-bit integers, we probably need at least 32 bits for each output element. So we generate N*4 bytes of output from N*2 input bytes.
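The same comparison in Python, as a sketch. The two functions do identical arithmetic; traffic_in_bytes is a made-up helper that counts bytes under the assumptions stated above (8-bit inputs, 32-bit outputs).

```python
from itertools import accumulate

def dot_product(a, b):
    """Matlab: sum(a .* b). One output."""
    return sum(x * y for x, y in zip(a, b))

def dot_product_partial_sums(a, b):
    """Matlab: cumsum(a .* b). N outputs."""
    return list(accumulate(x * y for x, y in zip(a, b)))

def traffic_in_bytes(n, n_outputs, input_size=1, output_size=4):
    """Return (bytes read, bytes written) for two n-element input vectors."""
    return 2 * n * input_size, n_outputs * output_size

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
print(dot_product(a, b))               # 70
print(dot_product_partial_sums(a, b))  # [5, 17, 38, 70]
print(traffic_in_bytes(4, 1))          # (8, 4)
print(traffic_in_bytes(4, 4))          # (8, 16)
```

The operation count is N multiplications and N additions either way; only the second element of the traffic pair changes, and that is the part the AVM doesn't see.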

At this point, some people will say "Yeah, memory. Processors are fast, memories are slow, sure, memory is a problem". But it isn't just about the memory; memory bandwidth is just one kind of bandwidth. Let's look at the non-memory problems of the partial-sums-of-dot-product algorithm. On the way, I'll try to show how the "bandwidth costs" heuristic can be used to guess what your hardware can do and what the performance will be.

Consider a machine with a SIMD instruction set. Most likely, the machine has registers of fixed width (say, 16 bytes), and each instruction gets 2 inputs and produces 1 output. Why? Well, the hardware ought to support 2 inputs and 1 output to do basic math. Now, if it also wants to have an instruction that produces, say, 4 outputs, then it needs to have 3 additional output buses from the data processing units to the register file. It also needs a multiplexer so that each of the 4 outputs can be routed to each of its N registers (N can be 16 or 32 or even 128). The cost of multiplexers is, roughly, O(M*N), where M is the number of inputs and N is the number of outputs. That's awfully costly. Bandwidth costs. So they probably use 2 inputs and 1 output everywhere.

Now, suppose the machine has 16 multipliers, which is quite likely – 1 multiplier for each register byte, so we can multiply 16 pairs of bytes simultaneously. Does this mean that we can then take those 16 products and compute 16 new partial sums, all in the same cycle? Nope, because, among other things, we'd need a command producing 16×4 bytes to do that, and that's too much bandwidth. Are we likely to have a command that updates fewer than 16 accumulators? Yes, because that would speed up dot products, and dot products are very important; let's look at the manual.

You're likely to find a command updating – guess how many? – 4 accumulators (32 bits times 4 equals 16 bytes, that's exactly one machine register). If the register size is 8 bytes, you'll probably get a command updating 2 accumulators, and so on. Sometimes the machine uses "register pairs" for output; that doubles the register size for output bandwidth calculation purposes. The bottom line is that instruction set extensions can speed up dot product to an extent impossible for its partial sums. You might have noticed another problem here, that of the dependency of a partial sum on the previous partial sum. Removing this dependency doesn't solve the bandwidth problem. For example, consider the vertical projection of point-wise multiplication of two 8-bit images, which has the same not-enough-accumulators problem.
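The guessing rule fits in a couple of lines. This is a sketch of the heuristic, not a description of any particular instruction set; the function names are made up, and the second function is only a lower bound that counts how many one-register-wide outputs are needed to write the results.

```python
def accumulators_per_instruction(register_bytes, accumulator_bytes=4, register_pair=False):
    """Guess how many accumulators one instruction can update (output bandwidth rule)."""
    output_bytes = register_bytes * (2 if register_pair else 1)
    return output_bytes // accumulator_bytes

def min_instructions_for_outputs(n_outputs, register_bytes, **kwargs):
    """Lower bound on instructions needed just to produce n_outputs accumulators."""
    per_instruction = accumulators_per_instruction(register_bytes, **kwargs)
    return -(-n_outputs // per_instruction)  # ceiling division

print(accumulators_per_instruction(16))                      # 4
print(accumulators_per_instruction(8))                       # 2
print(accumulators_per_instruction(16, register_pair=True))  # 8
print(min_instructions_for_outputs(16, 16))                  # 4
```

So with 16-byte registers, 16 products can be made in one instruction, but turning them into 16 partial sums takes at least 4.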

There is little you can do about the bandwidth problem in the partial sums case – the algorithm is I/O bound. Some algorithms aren't, so you can optimize them to minimize the cost of bandwidth. For example, matrix multiplication is essentially lots of dot products. If you do those dot products straightforwardly, you'll have a loop spending 2 commands for loading the matrix elements into registers, and one command for multiplying and accumulating (MAC). 2 loads per MAC means an overhead of 200%.

However, you can work on blocks – 4 rows of matrix A and 4 columns of matrix B, and compute the 4×4=16 dot products in your loop. That's 4+4=8 loads per 16 MACs; the overhead dropped to 50%. If you have enough registers to do this. And it's still quite impressive overhead, isn't it? Your typical AVM user would be very disappointed. (Yes, some machines can parallelize the loads and the MACs, but some can't, and it's a toy example, and stop nitpicking). BTW, blocking can be used to save loads from main memory to cache just like we've used it to save loads from cache to registers.
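The arithmetic behind those percentages, as a small sketch. blocked_loop_cost is a made-up name, and the register count is my own assumption (one register per accumulator and one per loaded operand); real machines may need more or fewer.

```python
def blocked_loop_cost(rows_of_a, cols_of_b):
    """Loads, MACs and load overhead for one step of a blocked inner loop."""
    loads = rows_of_a + cols_of_b
    macs = rows_of_a * cols_of_b
    return {
        "loads": loads,
        "macs": macs,
        "overhead": loads / macs,
        # Assumption: one register per accumulator plus one per loaded operand.
        "registers": macs + loads,
    }

print(blocked_loop_cost(1, 1))  # overhead 2.0, i.e. 200%
print(blocked_loop_cost(4, 4))  # overhead 0.5, i.e. 50%
```

Under that register assumption the 4×4 block wants 24 registers, which is where "if you have enough registers" starts to bite.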

OK. With partial sums of dot product, the bandwidth problem kills performance, and with matrix multiplication, it doesn't. What about convolution, which is about as basic as our previous examples? Gee, I really don't know. It's tricky, because with convolution, you need to store intermediate results somewhere, and it's unclear how many of them you're going to need. The optimal implementation depends on the quirks of the data processing units, the I/O, and the filter size. If you come across a benchmark showing the performance of convolution on some machine, you'll probably find interesting variations caused by the filter size.

So we have a bread-and-butter algorithm, and non-trivial & non-portable performance characteristics. I think it's one indication that your own less straightforward algorithm will also perform somewhat unpredictably. Unless you know an exact reason for the opposite.


Bandwidth is one problem with fetching operands and storing results. Another problem is figuring out where they go. In the case of registers, we have costly multiplexers for selecting the source and destination registers of instructions. In the case of memory, we have addresses. Computing addresses has a cost. Reading data from those addresses also has a cost. Some address sequences are costlier than others from one of these perspectives, or both.

The dumbest example is the misalignment problem. People who learned C on x86 are sometimes annoyed when they meet a PowerPC or an ARM or almost any other processor since it won't read a 32-bit integer from a misaligned address. So when you read a binary buffer from a file or a socket, you can't just cast the char* to an int* and expect it to work. Isn't it nice of x86 to properly handle these cases?

Maybe it's nice, maybe it isn't (at least if it failed, the code would be fixed to become legal C), but it sure is costly. The fact that it's "in the hardware" doesn't make it a single-cycle operation. If your address is misaligned, the 32 bits may reside in two different memory words (no matter what the word size is). The hardware will have to read the low word, and then read the high word, and then take the high bits of the low word and the low bits of the high word and make a single 32-bit value out of them. Because in one cycle, memories can only fetch one word from an aligned address.
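Here's a toy simulation of that two-fetch dance, as a sketch rather than a description of any particular memory system. The 4-byte word size, the little-endian byte order and the function names are all assumptions I made for the example.

```python
WORD_BYTES = 4  # assumed memory word size for this toy model


def read_aligned_word(memory, word_index):
    """One 'cycle': fetch a whole word from an aligned address."""
    start = word_index * WORD_BYTES
    return memory[start:start + WORD_BYTES]


def read_u32(memory, address):
    """Read a little-endian 32-bit value from any byte address.

    Returns (value, number_of_word_fetches).
    """
    word_index, offset = divmod(address, WORD_BYTES)
    low = read_aligned_word(memory, word_index)
    fetches = 1
    if offset == 0:
        data = low
    else:
        # Misaligned: the value straddles two words, so fetch the next one
        # and stitch together the upper bytes of the low word with the
        # lower bytes of the high word.
        high = read_aligned_word(memory, word_index + 1)
        fetches = 2
        data = low[offset:] + high[:offset]
    return int.from_bytes(data, "little"), fetches
```

With `memory = bytes(range(16))`, reading at address 4 takes one fetch, and reading at address 5 takes two fetches plus the stitching. Whoever does the stitching, hardware or software, the second fetch doesn't go away.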

Does it matter outside of I/O-related code using illegal pointer-casting? Consider the prosaic algorithm of computing the first derivative of a vector, spelled v(2:end)-v(1:end-1) in Matlab. If we run on a SIMD machine, we could execute several subtractions simultaneously. In order to do that, we need to fetch a word containing v[0]…v[15] and a word containing v[1]…v[16] (both zero-based). But the second word is misaligned. The handling of misalignment will have a cost, whether it's done in hardware or in software.
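A sketch of the derivative computed in SIMD-sized chunks shows where the misalignment comes from. The 16-element vector width and the function name are assumptions for illustration; the list slices stand in for vector loads.

```python
LANES = 16  # assumed SIMD width, in elements


def simd_style_diff(v):
    """Compute v[i+1] - v[i] in chunks of LANES, reporting operand offsets.

    Returns (differences, offsets), where each entry of `offsets` is the
    starting offset, modulo LANES, of the two operand windows of a chunk.
    """
    out = []
    offsets = []
    for base in range(0, len(v) - 1, LANES):
        n = min(LANES, len(v) - 1 - base)
        lo = v[base:base + n]          # starts at a multiple of LANES: aligned
        hi = v[base + 1:base + 1 + n]  # starts one element later: misaligned
        out.extend(h - l for h, l in zip(hi, lo))
        offsets.append((base % LANES, (base + 1) % LANES))
    return out, offsets
```

Every chunk reports offsets `(0, 1)`: the first window always sits on a vector boundary, and the second one never does.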

Well, at least the operands of subtraction live in subsequent addresses – 0,1,2…15 and 1,2,3…16. That's how data processing units like them: you read a pack of numbers from memory and feed them right into the array of adders, ready to crunch them. It's not always like that. Consider scaling: a(x) = b(s*x+t). This can be used to resize images (handy), or to play records at a different speed the way you'd do with a tape recorder (less handy, unless you like squeaky or growly voices).

Now, if s isn't integral (say, s=0.6), you'd have to fetch data from places such as s*x+t = 1.3, 1.9, 2.5, 3.1, 3.7... Suppose you want to use linear interpolation to approximate b(1.3) as b(1)*0.7+b(2)*0.3. So now we need to multiply the vector of "low" elements – b([1,1,2...]) – by the vector of weights – [0.7,0.1,0.5...] – and add the result to the similar product b([2,2,3...])*[0.3,0.9,0.5...]. The multiplications and the additions map nicely to SIMD instruction sets; the indexing doesn't, because you have those weird jumpy indexes. So this time, the addressing can become a real bottleneck because it can prevent you from using SIMD instructions altogether and serialize your entire computation.
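Here's the scaling example spelled out as a sketch, reading from the input vector b in a(x) = b(s*x+t). The names `gather_plan` and `scale` are made up for the illustration, and t=0.7 is my guess at an offset that reproduces the positions above for x = 1, 2, 3...

```python
import math


def gather_plan(s, t, xs):
    """For each output coordinate x, find the two input indexes and weights."""
    low_idx, high_idx, low_w, high_w = [], [], [], []
    for x in xs:
        pos = s * x + t
        i = math.floor(pos)
        frac = pos - i
        low_idx.append(i)
        high_idx.append(i + 1)
        low_w.append(1.0 - frac)
        high_w.append(frac)
    return low_idx, high_idx, low_w, high_w


def scale(b, s, t, xs):
    """Linear-interpolation resampling of the input vector b."""
    low_idx, high_idx, low_w, high_w = gather_plan(s, t, xs)
    # These two gathers are the part that doesn't map to plain SIMD loads:
    # the indexes repeat and jump instead of advancing one element at a time.
    low = [b[i] for i in low_idx]
    high = [b[i] for i in high_idx]
    # The multiply-adds, on the other hand, are perfectly regular.
    return [l * wl + h * wh for l, wl, h, wh in zip(low, low_w, high, high_w)]
```

For s=0.6, t=0.7 and x from 1 to 5, the low indexes come out as 1, 1, 2, 3, 3 and the low weights as roughly 0.7, 0.1, 0.5, 0.9, 0.3. The indexes are used as-is to index a Python list, so index 1 here is the second element, unlike in Matlab.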

Well, at least we access adjacent elements. This means that most memory accesses will hit the cache. When you bump into an element that isn't cached yet, the machine will bring a whole cache line (say, 32 bytes), and then you'll read the other elements in that cache line, so it will pay off. You can even issue cache prefetching instructions so that while you're working on the current cache line, the machine will read the next one in the background. That way, you'll hit the cache all the time, instead of having your processor repeatedly surprised (hey, I don't have a(32) in the cache!.. hey, I don't have a(64) in the cache!.. hey, I don't have…). Avoiding the regularly scheduled surprise can be really beneficial, although cache prefetching is truly disgusting (it's basically a very finicky kind of cooperative multi-tasking – you ought to stuff the prefetching commands into the exactly right spots in your code).

Now, consider a(x) = b(f(x)) – a generic transformation of an input vector given a function for computing the input coordinate from the output coordinate. We have no idea what the next address is going to be, do we? If the transformation is complicated enough, we're going to miss the cache a lot. By the way, if the transformation is in fact simple, and the compiler knows the transformation at compile time, the compiler is still very unlikely to generate optimal cache prefetching commands. Which is one of the gazillion differences between C++ templates and "machine-optimal" code.
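To see how much the address sequence matters, here's a toy cache simulation. It is a sketch under loud assumptions: a tiny fully associative LRU cache, a line of 32 elements, and a hypothetical f(x) = (97*x) mod 1024 standing in for a "complicated enough" transformation.

```python
LINE_ELEMS = 32  # assumed cache line size, in elements


def count_misses(indexes, num_lines=4):
    """Count misses in a tiny fully associative LRU cache of `num_lines` lines."""
    cached = []  # line numbers, least recently used first
    misses = 0
    for i in indexes:
        line = i // LINE_ELEMS
        if line in cached:
            cached.remove(line)      # hit: will be re-appended as most recent
        else:
            misses += 1
            if len(cached) == num_lines:
                cached.pop(0)        # evict the least recently used line
        cached.append(line)
    return misses


sequential = count_misses(range(1024))
scattered = count_misses((97 * x) % 1024 for x in range(1024))
```

The sequential walk misses once per cache line (32 misses for 1024 elements), while the jumpy sequence touches exactly the same 1024 elements and misses on every single access in this model.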

DVMs and TVMs

My bandwidth and addressing heuristics don't model a real machine; they only model an upgrade to the AVM for SIMD machines. Multi-box computing is one example of an entire universe of considerations they fail to model. So what we got is a DVM – Domain-specific Virtual Machine.

Now, in order to estimate performance without measuring (which is necessary when you choose your optimizations – you just can't try all the different options), I recommend a TVM (Target-specific Virtual Machine). You get one as follows. You start with the AVM. This gives overly optimistic performance estimations. You then add the features needed to get a DVM. This gives overly pessimistic estimations.

Then, you ask some low-level-loving person: "What are the coolest features of this machine that other machines don't have?" This will give you the capabilities that the real processor has but its DVM doesn't have. For example, PowerPC with AltiVec extensions is basically a standard SIMD DVM plus vec_perm. I won't talk about vec_perm very much, but if you ever need to optimize for AltiVec, this is the one instruction you want to remember. It solves the indexing problem in the scaling example above, among other things. Using a SIMD DVM and forgetting about vec_perm would make AltiVec look worse than it really is, and some algorithms much more costly than they really are.

And this is how you get a TVM for your platform. The resulting mental model gives you a fairly realistic picture, second only to reading the entire manual and understanding the interactions of all the features (not that easy). And it definitely beats the AVM by… how do you estimate the quality of handwaving? OK, it beats the AVM by a factor of 5, on average. What, you want a proof? Just watch the hands go.

"High-level CPU": follow-up

This is a follow-up on the previous entry, the "high-level CPU" challenge. I'll try to summarize the replies and my opinion on the various proposals. But first, a summary of my original points:

  1. "Very" high-level languages have a cost. Attributing this cost to the underlying hardware architecture is wrong. You could move the cost from software to hardware, but that wouldn't eliminate it. I primarily referred to languages characterized by indirection levels and late binding of user-defined operations, such as Lisp and Python, and to a lesser extent/confidence to side-effect-free languages like Haskell. I didn't mean to say that high-level languages should not be used, in fact I think that their cost is wildly overestimated by many. However, denying the existence of any intrinsic cost guarantees that people will keep overestimating it, because if it weren't that high a cost, why would you lie to them? I mean it very seriously; horrible tech marketing is responsible for the death (or coma) of many great things.
  2. Of all systems with similar cost and features, the one that has the least stuff implemented in hardware is the best, because you can change more things. The idea that moving things to hardware is a sure way to make them efficient is a misconception. Hardware can't do "anything in one cycle"; there are many constraints involved. Therefore, it's better to let the software explicitly control a set of low-level components than build hardware logic implementing high-level interfaces to them. For example, to add 2 numbers on a RISC machine, you load them to registers, then add. You could have a command adding operands from memory; it wouldn't run faster, because the hardware would have to spend cycles on loading operands to (implicit) registers. Hardware doesn't have to be a RISC machine, but it's always better to move as much control to software as possible under the given system cost constraints.

I basically asked people to refute point 1 ("HLLs are costly"). What follows describes the attempts people made at it.

Computers you can't program

Several readers managed to ignore my references to specific high-level languages and used the opportunity to pimp hardware architectures that can't run those languages. Or any other programming languages designed for human beings, for that matter. Example architectures:

It is my opinion that the fans of this family of hardware/vaporware, consistent advocates of The New Age of Computing, have serious AI problems. Here's a sample quote on cellular automata: "I guess they really are like us." Well, if you want to build a computing device in order to have a relationship with it, maybe a cellular automaton will do the trick. Although I'd recommend first checking the fine selection of Homo Sapiens we have here on Planet Earth. Because those come with lots of features you'd like in a friend, a foe, a spouse or an employee already built-in, while computer hardware has a certain gap to fill in this department.

Me, I want to build machines to do stuff that someone "like us" wouldn't want to do, for any of several reasons (the job is hard/boring/stinky/whatever). And once I've built them, I want people to be able to use them. Please note this last point. People and other "nature's computers", like animals and fungi, aren't supposed to be "used". In fact, all those systems spend a huge amount of resources to avoid being used. Machines aren't supposed to be like that. Machines are supposed to do what you want. Which means that both the designer and the user need to control them. Now, a computer that can't even be tricked into parsing HTML in a straightforward way doesn't look like it's built to be controlled, does it?

Let me supply you with an example: Prolog. Prolog is an order of magnitude more tame than a neural net (and two orders of magnitude compared to a cellular automaton) when it comes to "control" – you can implement HTML parsing with it. But Prolog does show alarming signs of independence – it spends most of its time in its inference engine, an elaborate mechanism running lengthy non-trivial loops, which sometimes turn out to be infinite. You aren't supposed to single-step those loops; you're supposed to specify truths about your world, and Prolog will derive more truths for you. Prolog was supposed to be the wave of the future about 25 years ago. I think it can be safely called dead by now, despite the fair amount of money poured into it. I think it died because it's extremely frustrating to use – you just can't tell why the hell it worked that way in each particular case. I've never seen anything remotely as annoying as Prolog, with the notable exception of Makefiles, running on top of a wonderful inference engine of their own.

My current opinion is that neural networks rarely deserve a special hardware implementation – if you need them, build a more traditional computer and run them on top of that; and cellular automata are just stillborn. I might be wrong in the sense that a hardware implementation of these models is the optimal solution for some problem, hence we'll see those beasts in some corner of a successful real-world system. But the vast majority of computing, including AI apps, will run on machines that support basic bread-and-butter programmer things simply and straightforwardly. Here's a Computing Technology Acceptance Lower Bound for ya: if you can't parse a frigging log file with it, you can't do anything with it.

Self-assembly computers

Our next contestant is a machine that you surely can program, once you've built it from the pieces which came in the box. Some people mentioned "FPGA", others failed to call it by its name (one comment mentioned a "giant hypercube of gates", for example). In this part, I'm talking about the suggestions to use an FPGA without further advice on exactly how it should be used; that is, FPGA as the architecture, not FPGA used to prototype an architecture.

Maybe people think that with an FPGA, "everything is possible", so in particular, you could easily build a processor efficiently implementing a HLL. Well, FPGA is just a way to implement hardware allowing you to trade NRE for unit cost. And with hardware, some things are possible and some aren't, or so I claim – for example, you can't magically make the cost of HLLs go away. If you can't think of a way to reduce the overhead HLLs impose on the system cost, citing FPGA doesn't make your argument look any better. On the contrary – you've saved NRE, but you've raised the cost of the hardware by a factor of 5.

Another angle: can you build a compiler? Probably so. Would you like to start your project with building a compiler? Probably not. Now, what makes people think that they want to build hardware themselves? I really don't know. Building hardware is gnarly, FPGA or not – there are lots of constraints you have to think about to make the thing efficient, and it's extremely easy to err on the side of not having enough flexibility. The latter typically happens because you try to implement overly high-level interfaces; it then turns out that you need the same low-level components to do something slightly different.

And changing hardware isn't quite as easy as changing software, even with FPGA, because hardware description code, with its massive parallelism and underlying synthesis constraints, is fairly tricky. FPGA is a perfectly legitimate platform for hardware vendors, but an awful interface for application programmers. If you deliver FPGAs, make it your implementation detail; giving it to application programmers isn't very likely to make them happy in the long run.

At the other end of the spectrum, there's the kind of "self-assembly computer" that reassembles itself automatically, "adapting to the user's needs". Even if it made any sense (and it doesn't), it still wouldn't answer the question: how should this magical hardware adapt to handle HLLs, for example, indirect memory access?

Actual computers designed to run HLLs

Some people mentioned actual hardware which was built to run HLLs, including Reduceron, Tcl on Board, Lisp Machines, Rekursiv, and ARM's Jazelle instruction set. For some reason, nobody mentioned Intel's 432, an object-oriented microprocessor which was supposed to replace x86, but was, among other things, too slow. This illustrates that the existence of a "high-level processor" doesn't mean that it was a good idea (of course it doesn't mean the opposite, either).

I'll now talk about these machines in increasing order of my confidence that the architecture doesn't remove the overhead posed by the HLL it's supposed to run.

  • Reduceron is designed to run Haskell, and focuses on an optimization problem I wasn't even aware of, that of graph reduction. One of the primary ideas seems to be that graph reduction doesn't suffer from dependency problems which could inhibit parallelization, but still can't be parallelized on stock CPUs. That's because a lot of memory access is involved, and there's typically little load/store bandwidth available to a CPU compared to its data processing capability. Well, I agree with this completely in the sense that memory access is the number one area where custom hardware design can help; more on that later. However, I'm not sure that the right way to go about it is to build a "Haskell Machine"; building a lower-level processor with lots of bandwidth available to it could be better. Then again, it could be worse, and my confidence level in this area is extremely low, which is why I list the Reduceron before the others: I think I'll look into this whole business some more. Pure functional languages are a weak spot of mine; for now, I can only say three things for sure: (1) side effects are a huge source of bugs, (2) although they get in the way of optimizers, side effects are a poor man's number one source of optimizations, so living without them isn't easy, and (3) the Reduceron is a pretty cool project.
  • Tcl on Board was built to run a Tcl dialect. Tcl doesn't pose optimization problems that languages like Lisp or Python do – it's largely a procedural language grinding flat objects. And there's another thing I ought to tell you: I don't like Tcl. However, I think that this Tcl chip is kind of insightful, because it's designed for low-end applications. And the single biggest win of having a "high-level" instruction set is to save space on program encoding. Several people mentioned it as a big deal; I don't think of it as a big deal, because instruction caches always worked great for me (~90% hits without any particular optimizations). However, for really small systems of the low-end embedded kind, program encoding is a real issue. I'm not saying that Tcl on Board is a good (or a bad) idea by itself; I know nothing about these things. I'm just saying that while I think high-level hardware will fail to deliver speed gains, it might give you space gains, so it may be the way to go for really small systems which aren't supposed to scale. Not that I know much about those systems, except that if I'd have to build one, I'd seriously consider Forth…
  • Lisp Machines ran Lisp, and Rekursiv ran LINGO, which apparently was somewhat similar to Smalltalk. This I know. What I don't know is how the hardware support for the high-level features would eliminate the cost overhead of the HLLs involved; that's because I don't know the architecture, and nobody gave much detail. I don't see a way to solve the fundamental problems. I mean, if I want to support arrays of bytes, then each byte must be tagged, mustn't it? And if I only support fixnums larger than bytes, then I'd waste space, right? And just what could the LispM do about the hairy binding done by CLOS behind the scenes? Again, this doesn't mean these machines weren't a good idea; in fact I wish my desktop hardware were more expensive and more secure, and tagged architectures could help. All I'm saying is that it would be more expensive. I think. I'd like to hear more about LispM, simply because most people who used it seem to be very fond of it – I know just one exception.
  • Jazelle is supposed to run Java. Java is significantly lower-level than Lisp or Smalltalk. It still is a beautiful example, because the hardware support in this case yields little performance benefit. In fact MIPS reported that a software implementation of JVM running on a MIPS core outperformed a JVM using Jazelle by a factor of about 2. I've never seen a refutation of that.

Stock computers with bells and whistles

Finally, there was a bunch of suggestions to add specific features to traditional processors.

  • Content-addressable memory is supposed to speed up associative array look-ups. There's a well-known aphorism by Alan Perlis – "A language that doesn't affect the way you think about programming is not worth knowing". Here's my attempt at an aphorism: "A processor that doesn't affect the way you access memory is not worth building". This makes the wide variety of tools designed to help you build a SIMD VLIW machine with your own data processing instructions uninteresting to me, and on the other hand, makes CAM quite appealing. I came to believe that your biggest problem isn't processing the data, it's fetching the data. I might talk about it some time; the Reduceron, essentially designed to solve a memory access problem preventing the optimization of a "perfectly parallelizable" algorithm, is one example of this. However, CAM goes way beyond providing more bandwidth or helping with the addressing – it adds comparison logic to each memory word. While it sounds impractical to replace all of your RAM with CAM, stashing a CAM array somewhere inside your system could help with some problems. Then again, it won't necessarily pay off – it depends on the exact details of what you're doing. All I can say at this point is that it's a Worthy Idea, which, for some reason, I keep forgetting about, and I shouldn't.
  • GC/reference counting optimizations. Maybe I'm wildly wrong, but I don't think the garbage is a big deal, 'cause how much time do you spend on garbage collection compared to plain malloc/free? The way I see it, the problem isn't so much with the overhead of garbage collection as it is with the amount of small objects allocated by the system and, most importantly, the amount of indirect memory accesses. I learned that some Lisp compilers can do object inlining with varying amounts of user intervention; well, when it works out, it removes the need for special hardware support. The thing is, I think the main battle here is to flatten objects, not to efficiently get rid of them. And I think that it's quite clearly software that should fight that battle.
  • Regular expression and string functions in hardware: I don't think it's worth the trouble, because how much time do you spend in regex matching anyway? Maybe it's because I don't process massive volumes of text, but when I do process the moderate amounts of text I bump into, there's the part where you store your findings in data structures, and I think it might be the bottleneck. And then a huge amount of data comes from places like RDBMSes where you don't have to parse much. You'd end up with idle silicon, quietly leaking power.
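Going back to the first item on this list, here's a toy model of what a content-addressable memory does, as far as I understand the idea. The class and method names are invented for the sketch, and the Python loop only simulates what the hardware would do with one comparator per word, all working at once.

```python
class ToyCAM:
    """Toy content-addressable memory: every word has its own comparator."""

    def __init__(self, size):
        self.words = [None] * size

    def write(self, address, value):
        """Ordinary RAM-style write: store `value` at `address`."""
        self.words[address] = value

    def match(self, key):
        """Return all addresses whose content equals `key`.

        In hardware the comparisons happen in parallel, one comparator per
        word; this loop just simulates them one after another.
        """
        return [addr for addr, word in enumerate(self.words) if word == key]
```

You write by address as usual, but you look things up by content: after writing "foo" to addresses 3 and 6, `match("foo")` returns both addresses without any hashing or searching in software. The comparison logic attached to each word is where the silicon goes.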

The good stuff

At the bottom line, there were two hardware-related things which captured my intoxicated imagination: the Reduceron and content-addressable memories. If anything ever materializes around this, I'll send out some samples. In the meantime – thanks!

The "high-level CPU" challenge

Do you love ("very") high-level languages? Like Lisp, Smalltalk, Python, Ruby? Or maybe Haskell, ML? I love high-level languages.

Do you think high-level languages would run fast if the stock hardware weren't "brain-damaged"/"built to run C"/"a von Neumann machine (instead of some other wonderful thing)"? You do think so? I have a challenge for you. I bet you'll be interested.


  • I work on the definition of custom instruction set processors (just finished one).
  • It's fairly high-end stuff (MHz/transistor count in the hundreds of millions).
  • I also work on the related programming languages (compilers, etc.).
  • Whenever application programmers have to deal with low-level issues of the machine I'm (partly) responsible for, I feel genuine shame. They should be doing their job; the machine details are my job. Feels like failure (even if "the state of the art" isn't any better).
  • …But, I'm also obsessed with performance. Because the apps which run on top of my stuff are ever-hungry, number-crunching real time monsters. Online computer vision. Loads of fun, and loads of processing that would make a "classic" DSP hacker's eyeballs pop out of his skull.

My challenge is this. If you think that you know how hardware and/or compilers should be designed to support HLLs, why don't you actually tell us about it, instead of briefly mentioning it? Requirement: your architecture should allow running HLL code much faster than a compiler emitting something like RISC instructions, without significant physical size penalties. In other words, if I have so many square millimeters of silicon, and I pad it with your cores instead of, say, MIPS cores, I'll be able to implement my apps in a much more high-level fashion without losing much performance (25% sounds like a reasonable upper bound). Bonus points for intrinsic support for vectorized low-precision computations.

If your architecture meets these requirements, I'll consider a physical implementation very seriously (because we could use that kind of thing), and if it works out, you'll get a chip so you can show people what your ideas look like. I can't promise anything, because, as usual, there are more forces at play than the theoretical technical virtue of an idea. I can only promise to publicly announce that your idea was awesome and I'd love to implement it; not much, but it's the best I can deliver.

If you can't think of anything, then your consistent assertions about "stupid hardware" are a stupid bluff. Do us a favor and shut up. WARNING: I can't do hardware myself, but there are lots of brilliant hardware hackers around me, and I've seen how actual chips are made and what your constraints are. Don't bullshit me, buddy.

Seriously, I'm sick and tired of HLL weenie trash talk. Especially when it comes from apparently credible and exceedingly competent people.

Alan Kay, the inventor of Smalltalk: "Just as an aside, to give you an interesting benchmark—on roughly the same system, roughly optimized the same way, a benchmark from 1979 at Xerox PARC runs only 50 times faster today. Moore’s law has given us somewhere between 40,000 and 60,000 times improvement in that time. So there’s approximately a factor of 1,000 in efficiency that has been lost by bad CPU architectures." … "We’re not going to worry about whether we can compile it into a von Neumann computer or not, and we will make the microcode do whatever we need to get around these inefficiencies because a lot of the inefficiencies are just putting stuff on obsolete hardware architectures."

Jamie Zawinski, an author of Mozilla: "In a large application, a good garbage collector is more efficient than malloc/free." … "Don't blame the concept of GC just because you've never seen a good GC that interfaces well with your favorite language." Elsewhere: "it's a misconception that lisp is, by its nature, slow, or even slower than C" … "if you're doing a *big* project in C or C++, well, you're going to end up reinventing most of the lisp runtime anyway"

Steve Yegge, a great tech blogger: "The von Neumann machine is a convenient, cost-effective, 1950s realization of a Turing Machine, which is a famous abstract model for performing computations." … "There are various other kinds of computers, such as convenient realizations of neural networks or cellular automata, but they're nowhere as popular either, at least not yet". And… "The Von Neumann architecture is not the only one out there, nor is it going to last much longer (in the grand 400-year scheme of things.)"

Wow. Sounds dazzling and mind-opening, doesn't it? Except there isn't any technical detail whatsoever. I mean, it's important to be open-minded and stuff. It really is. The fact that something doesn't seem "practical" doesn't mean you shouldn't think or talk about it. But if something isn't even a something, just a vague idea about Awesome Coolness, it poisons the readers' minds, people. It's like talking about Spirituality of the kind that lets you jump over cliffs at your mighty will or something (I'm not that good at New Age, but I think they have things like these in stock). This can only lead to three results:

  1. Your reader ignores you.
  2. Your reader sits on a couch and waits to gain enough Spirituality to jump around cliffs. Congratulations! Your writing has got you one fat fanboy.
  3. Your reader assumes he's Spiritual enough already and jumps off a cliff, so you've got a slim fanboy corpse.

It's the same with this Great High-Level Hardware talk. I can ignore it, or I can wait forever until it emerges, or I can miserably fail trying to do it myself. Seriously, let's look at these claims a little closer.

Alan Kay mentions a benchmark showing how lame our CPUs are. I'd really like to see that benchmark. Because I've checked out the B5000 which he praised in that article. And I don't think a modern implementation of that architecture would beat a modern CPU in terms of raw efficiency. You see, RISC happened for a reason. Very roughly, it's like this:

  • You can access memories at single cycle throughput.
  • You can process operands in registers at single cycle throughput.
  • And that's pretty much what you can do.

Suppose you want to support strings and have a string comparison instruction. You might think that "it's done in the hardware", so it's blindingly fast. It isn't, because the hardware still has to access memory, one word per cycle. A superscalar/VLIW assembly loop would run just as quickly; the only thing you'd save is a few bytes for instruction encoding. On the other hand, your string comparison thingie has got you into several sorts of trouble:

  • Your machine is larger, with little gain – you don't compare strings most of the time.
  • Your machine is complicated, so optimizing the hardware is trickier.
  • Compilers have trouble actually utilizing your instructions.
  • Especially as the underlying hardware implementation grows more complicated and the performance of assembly code gets harder to model.
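The "one word per cycle" point can be sketched as a cycle count. This is a toy model with assumed numbers (4-byte words, one cycle per word fetched, comparison itself free), and the function name is made up. The thing to notice is that nothing in it depends on whether the loop lives in microcode or in a compiled assembly loop.

```python
WORD_BYTES = 4  # assumed memory word size


def compare_cycles(x, y):
    """Compare two byte strings word by word.

    Returns (equal, memory_cycles), charging one cycle per word fetched.
    """
    cycles = 0
    for off in range(0, max(len(x), len(y)), WORD_BYTES):
        cycles += 2  # one word fetched from each string
        if x[off:off + WORD_BYTES] != y[off:off + WORD_BYTES]:
            return False, cycles
    return True, cycles
```

Two equal 8-byte strings cost 4 memory cycles here no matter who issues the fetches. A dedicated instruction saves the bytes needed to encode the loop, and that's about it.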

When people were supposed to write assembly programs, the inclusion of complicated high-level instructions was somewhat natural. When it became clear that compilers write most of the programs (because compilation became cheap enough), processors became less high-level; the points above hopefully explain why.

And don't get me started about the tagging of data words. B5000 had polymorphic microcode – it would load two words, look at their type bits and add them according to the run time types. Well, B5000 didn't support things like unsigned 8-bit integers, which happen to be something I need, because that's how you store images, for example. Am I supposed to carry tag bits in each frigging byte? Let me point out that it has its cost. And I don't think this sort of low-level polymorphism dwarfs the cost of Lisp or Smalltalk-style dynamic binding, either (B5000 was designed to run Algol; what would you do to run Smalltalk?)
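A bit of arithmetic on the tagging cost, as a sketch. The tag width is a parameter I made up (I'm not claiming any particular number for the B5000); the point is only how the relative overhead depends on the size of the thing being tagged.

```python
def tag_overhead(payload_bits, tag_bits):
    """Extra storage per value, as a fraction of the payload size."""
    return tag_bits / payload_bits


def tagged_size_bits(num_values, payload_bits, tag_bits):
    """Total storage for `num_values` values, each carrying its own tag."""
    return num_values * (payload_bits + tag_bits)
```

With a hypothetical 2-bit tag, a 64-bit word pays about 3% extra, while an 8-bit pixel pays 25%, so a 640×480 grayscale image grows from 2,457,600 bits to 3,072,000 bits. Small values are exactly where per-value tags hurt.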

There's another angle to it: Alan Kay mentions that you almost couldn't crash the B5000, which suited the business apps it was supposed to run quite well. I think that's just awesome, I really do (I shoveled through lots of core dumps). In fact, I think people who implemented modern desktop operating systems and web browsers in unsafe languages on top of unsafe hardware are directly responsible for the vast majority of actual security problems out there. But (1) in many systems, the performance is really really important and (2) I think that security in software, the way it's done in JVM or .NET, still has lower overall cost than tagging every byte (I'm less sure about part 2 because I don't really know the guts of those VMs). Anyway, I think that hardware-enforced safety is costly, and you ought to acknowledge it (or really show why this fairly intuitive assumption is wrong, that is, delve into the details).

JWZ's Lisp-can-be-efficient-on-stock-hardware claim isn't much better than Smalltalk-can-be-efficient-on-custom-hardware, I find. Just how can it be? If you use Lisp's static annotation system, your code becomes uglier than Java, and much less safe (I don't think Lisp does static checking of parameter types, it just goes ahead and passes you an object and lets you think it's an integer). If you use Lisp in the Lispy way that makes it so attractive in the first place, how on Earth can you optimize out the dynamic type checks and binding? You'd have to solve undecidable problems to make sense of the data flow. "A large project in C would implement the Lisp run time?" Oh really? You mean each variable will have the type LispObject (or PyObject or whatever)? Never happens, unless the C code is written by a deeply disturbed Lisp weenie (gcc and especially BetaPlayer, I'm talking about you). The fact that some people write C code as if they were a Lisp back-end is their personal problem, nothing more, nothing less.

The dynamic memory allocation business is no picnic, either. I won't argue that garbage collection is significantly less efficient than manual malloc/free calls, because I'm not so sure about it. What I will argue is that a good Lisp program will use much more dynamic allocation and indirection levels than a good C program (again, I ignore the case of emulating C in Lisp, or Lisp in C, because I think it's a waste of time anyway). And if you want to make your objects flat, I think you need a static type system, so you won't be much higher-level than Java in terms of dynamic flexibility. And levels of indirection are extremely costly because every data-dependent memory access is awfully likely to introduce pipeline stalls.
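Here's a sketch of what a level of indirection does to the access pattern. Memory is modeled as a plain Python list indexed by address; the layout and the function names are assumptions for the example, and the "dependent load" counter marks loads whose address isn't known until a previous load completes.

```python
def sum_flat(memory, base, n):
    """Sum n values stored contiguously starting at address `base`."""
    total = loads = 0
    for i in range(n):
        total += memory[base + i]  # address computable ahead of time
        loads += 1
    return total, loads


def sum_boxed(memory, base, n):
    """Sum n values reached through an array of pointers starting at `base`."""
    total = loads = dependent_loads = 0
    for i in range(n):
        pointer = memory[base + i]  # address computable ahead of time
        total += memory[pointer]    # address known only after the load above
        loads += 2
        dependent_loads += 1
    return total, loads, dependent_loads
```

The flat version issues n loads at addresses known in advance; the boxed version issues 2n loads, and half of them have to wait for the pointer to arrive first. Those waits are where the pipeline stalls come from.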

Pure functional languages with static typing have their own problem – they lack side effects and make lots of copies at the interface level; eliminating those copies is left as an exercise to the compiler writer. I've never worked through a significant array of such exercises, so I won't argue about the problems of that. I'll just mention that static typing (regardless of the type inference technique) characterizes lower-level languages, because now I have to think about types, just the way imperative programming is lower-level than functional programming, because now I have to think about the order of side effects. You can tell me that I don't know what "high-level" means; I won't care.

Now, the von Neumann machine business. Do you realize the extent to which memory arrays are optimized and standardized today? It's nowhere near what happens with CPUs. There are lots of CPU families running lots of different instruction sets. All memories just load and store. Both static RAM (the expensive and fast kind) and dynamic RAM (the cheap and slower kind) are optimized to death, from raw performance to factory testing needed to detect manufacturing defects. You don't think about memories when you design hardware, just the way you don't think about the kind of floating point you want to use in your numeric app – you go for IEEE because so much intellectual effort was invested in it on all levels to make it work well.

But let's go with the New Age flow of "von Neumann machine is a relic from the fifties". What kinds of other architectures are there, and how do you program them, may I ask? "C is for von Neumann machines". Well, so is Java and so is Lisp; all have contiguous arrays. Linked lists and dictionaries aren't designed for any other kind of machine, either; in fact lots of standard big O complexity analysis assumes a von Neumann machine – O(1) random access.

And suppose you're willing to drop standard memories and standard programming languages and standard complexity analysis. I don't think you're a crackpot, I really don't; I think you're bluffing, most probably, but you could be a brilliant and creative individual. I sincerely don't think that anything practiced by millions can automatically be considered "true" or "right"; I was born in the Soviet Union, so I know all about it. Anyway, I want to hear your ideas. I have images. I must process those images and find stuff in them. I need to write a program and control its behavior. You know, the usual edit-run-debug-swear cycle. What model do you propose to use? Don't just say "neural nets". Let's hear some details about hardware architecture.

I really want to know. I assume that an opinion held by quite some celebrities is shared by lots and lots of people out there. Many of you are competent programmers, some stronger than myself. Tell me why I'm wrong. I'll send you a sample chip. I'll publicly admit I was a smug misinformed dumbass. Whatever you like. I want to close this part of "efficient/high-level isn't a trade-off" nonsense, so that I can go back to my scheduled ranting about the other part. You know, when I poke fun at C++ programmers who think STL is "high-level" (ha!). But as long as this "Lisp is efficient" (ha!) issue lingers, I just can't go on ranting with a clear conscience. Unbalanced ranting is evil, don't you think?

Everybody agrees with yosefk

Alan Kay on programming languages as user interfaces: "Even if you’re designing for professional programmers, in the end your programming language is basically a user-interface design. You will get much better results regardless of what you’re trying to do if you think of it as a user-interface design."

Steve Yegge on the relative difficulty of high-level and low-level programming: "Observation: systems programmers look down on application programmers. Pop culture is that systems programming (kernels, drivers, real-time OSes, etc.) is harder, possibly since they're equating app programming with just laying out the UI. I've done both — or all three, counting doing a bunch of tedious UI layouts in various frameworks. App-level programming is harder than systems programming."

m4 manual page on Turing tar pit addictiveness: "Some people found m4 to be fairly addictive. They first use m4 for simple problems, then take bigger and bigger challenges, learning how to write complex m4 sets of macros along the way. Once really addicted, users pursue writing of sophisticated m4 applications even to solve simple problems, devoting more time debugging their m4 scripts than doing real work. Beware that m4 may be dangerous for the health of compulsive programmers."

Fun at the Turing tar pit

I love to babble about "social issues of software development". I try to avoid mentioning them in technical discussions (also known as "flame wars") because it makes a particularly lousy argument. Other than that, I just love talking about it. Face it: software is written by people for people. Most of it smells like a hairless pink ape. Nothing formal or provable/refutable or scientific or universally correct about it. People stuff. Like furniture, or jewelry, or porn, or screwdrivers. The machine merely follows orders, the humans are the serious players. Social Issues are what makes those bits move.

So, Social Issues. Today's Social Issue is this: programmers adore Turing tar pits. Turing tar pits are addictive. You can render programmers completely paralyzed from any practical standpoint by showing them a tar pit and convincing them to jump into it.

A Turing tar pit is a Turing-complete virtual machine that doesn't support straightforward implementation of bread-and-butter programmer stuff. Bread-and-butter programmer stuff includes arithmetic, variables, loops, data structures, functions and modules and other such trivial and awfully handy goodies. A canonical example of a Turing tar pit is the Brainfuck programming language. Implementing a decimal calculator or a sokoban game in sed is also a consensus example of Turing tar pit swimming as far as I know. By "consensus", I mean that nobody writes software that people are actually supposed to use on top of those VMs.

Some Turing tar pits are used as vehicles for production code.

C++ templates. A crippled everything-is-a-type VM. Try writing a compile-time loop using them; then, try to implement a compile-time hash table. I find the implementation of a decimal calculator in sed more concise in certain aspects. But people do awfully hairy things with C++ templates, like boost. The fun of implementing such things can only be compared to the fun of using them (error messages which would take a whole tree to print out on paper and such).
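For those who haven't had the pleasure, here is roughly what a "compile-time loop" looks like. This is a sketch, not anybody's production code: `SumTo`, `sum_to` and `check_sums` are made-up names, and the `constexpr` function is a later addition to the language (it needs C++14), shown only for contrast with the template version.

```cpp
#include <cassert>

// The "loop body" is recursion over template instantiations:
// SumTo<N>::value == 1 + 2 + ... + N, computed by the compiler.
template <unsigned N>
struct SumTo {
    static constexpr unsigned value = N + SumTo<N - 1>::value;
};

// The "loop termination condition" is an explicit specialization.
template <>
struct SumTo<0> {
    static constexpr unsigned value = 0;
};

// The same computation written as an ordinary loop.
constexpr unsigned sum_to(unsigned n) {
    unsigned total = 0;
    for (unsigned i = 1; i <= n; ++i) total += i;
    return total;
}

static_assert(SumTo<10>::value == 55, "evaluated during compilation");
static_assert(sum_to(10) == 55, "also evaluated during compilation");

// Runtime spot check that the two formulations agree.
void check_sums() {
    assert(SumTo<100>::value == sum_to(100));
}
```

Note that the loop counter is a type parameter and the exit condition is a separate struct. Now imagine the hash table.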

TeX. A crippled everything-is-a-text-transformation VM. A great back-end with a nightmarish front-end, as far as I can tell. You get neither WYSIWYG nor the ability to mechanically analyze the content (the document is basically code which transforms itself, and transforms itself, and transforms itself, and you can't make any sense of the document without, um, running it). But people do unimaginably hairy things on top of TeX, like LaTeX. You actually have to debug LaTeX documents, and I bet that the best TeX debugger is somewhat worse than the worst C debugger out there.

Shell scripts. A crippled everything-is-a-process-unless-it's-a-string VM. Debugging clearly wasn't on the list of Useful Activities which the shell was meant to support. "rm: Is a directory". Gotta love that. Of course people write huge shell scripts. One excuse is that they want to modify env vars in the user's shell, and "only a shell script can do that" (wrong – you can use eval `sane-prog` instead of source insane-shell-script). A particularly awesome kind of shell script is the multi-kilobyte one-liner (find | grep | awk | grep -v | fuck-off-and-die).
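Here is a sketch of what the "sane-prog" half of the eval trick could look like: a program in any language you like (C++ below) prints shell assignments, and the parent shell evals them. `shell_quote`, `export_line` and the variable name `BOARD` are hypothetical; the quoting covers POSIX-style shells only.

```cpp
#include <cassert>
#include <string>

// Wraps a value in single quotes for a POSIX shell;
// an embedded ' becomes '\'' (close quote, escaped quote, reopen quote).
std::string shell_quote(const std::string& value) {
    std::string out = "'";
    for (char c : value) {
        if (c == '\'') out += "'\\''";
        else out += c;
    }
    out += "'";
    return out;
}

// One line of shell text for the parent shell to eval.
std::string export_line(const std::string& name, const std::string& value) {
    return "export " + name + "=" + shell_quote(value) + "\n";
}

void check_export_line() {
    assert(export_line("BOARD", "rev b") == "export BOARD='rev b'\n");
    assert(shell_quote("it's") == "'it'\\''s'");
}
```

A `main` that writes the `export_line` results to standard output is all that's left to add; the user then runs eval `sane-prog`, and the only shell code in the whole affair is that one line.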

I'll tell you why it happens. When you write code in a full-featured programming language, clearly you can do a lot of useful things. Because, like, everybody can. So the job has to be quite special to give you satisfaction; if the task is prosaic, and most of them are, there's little pride you're going to feel. But if the language is crippled, it's a whole different matter. "Look, a loop using templates!" Trivial stuff becomes an achievement, which feels good. I like feeling good. Templates are powerful! What do you mean by "solving a non-problem?" Of course I'm solving real problems! How else are you going to do compile-time loops in C++? How else are you going to modify env vars of the parent shell?! I'm using my proficiency with my tools and knowledge of their advanced features to solve problems! Damn, I'm good!

By the way, here's another entry in The Modern Software Industry Dictionary:

powerful adj.

An attribute of programming environments or their individual features, most often Turing tar pits. A feature is "powerful" when at least one of the following holds:

  1. It can be used to implement something trivial in a pointlessly complicated way.
  2. It can cause a lot of damage.

Seriously, it seems that in 85% of the contexts where something is called "powerful", it really means "useless and dangerous". Unlike most entries in the Modern Software Industry Dictionary, I don't consider this word a meaningless cheerleader noise. I think it actually carries semantics, making it a pretty good warning sign.

Back to our subject. A Turing tar pit deployed on a massive scale has two effects on programmers who got addicted to it:

  1. They waste an awful lot of time (ab)using it. Pretty obvious? Wait, there's more!
  2. They come to think that the sort of work they're used to accomplishing with a Turing tar pit should always be done using a Turing tar pit. And this is just depressing. It really is.

Lisp-style macros, D-style mixins and code generation using DSL compilers are some ways to do compile-time things in a not entirely brain-crippled fashion. You know, you can actually concatenate two strings, which is unimaginable luxury in the world of C++ templates. People are actually afraid of these things. These people aren't cowards, mind you. These are the same people who'd fearlessly delve into the darkest caves of template metaprogramming. Sure, it's "non-trivial", but sometimes you need the Powerful Features to get your job done! But a full-blown programming language for compile-time work? Are you kidding? That is just too dangerous and unmaintainable! And anyway, you can do everything with templates – they're Turing-complete! Neat, ain't it?!

Alternatives to TeX… I'm bad at this, I really am, so you won't get a detailed rant on this subject. Let's just say that WYSIWYG is underestimated by many programmers, and extending a wiki engine, even one written in PHP, beats extending TeX hands down. This isn't first-hand evidence, just generic Turing-tar-pit intuition. This is the part where you can tell me that I'm a moron and get away without a symmetrical remark. In particular, TeX seems to have awfully good back-ends; I don't think it can possibly justify heavy usage of its front-end macro facilities, but, um, everything is possible.

Scripting. Back to yosefk's-perceived-competence-land. When people hear about a new language, and they're told that it's a scripting language, not only don't they treat it seriously, they insist on emulating a Turing tar pit in it. They call system() and pass it incomprehensible 300-character pipelines. They write everything in one big bulk, without using functions or classes. They ignore data structures and regular expressions and call awk when they need them the way they'd do in a shell script. From Python or Ruby or Perl. This is how you write scripts, you know.

Interesting, isn't it? I find it fascinating. I think that software is to wetware what leaves are to trunks; to figure out the leaves, you have to saw through the trunk and see what's inside, or something. I also think that crappy software isn't necessarily written by dumb people, although that helps. I think that the attitude and the state of mind are at least as important as raw aptitude, much like the direction in which you aim a cannon isn't less important than its range. I'll definitely share those thoughts with you, as well as many others, but that will have to wait for another time. It's been a pleasure having you with us. Good night.

Interrupt? Let the Bastard handle it!

I've promised to consider publishing an objective and comprehensive coverage of the adventures of The Bastard. The public interest generated by this promise, combined with an optimistic estimation of the legal risks involved, led me to the decision to Yes, publish it.

Myself, I always kinda missed this sort of thing. We've got Bastard Operator From Hell, but what about a Bastard Programmer? Sure, we have worsethanfailure.com, but it's primarily about incompetence. What about The Pure Evil? Of course, compared to a programmer, an evil sysadm has much more interesting opportunities to practice his evil craft, for he's got root access, and what have you got, my poor helpless protagonist? You ain't got nothing! With a programmer, you can at least read and change the code. Well, actually you can't, but you can try. The point is that someone unfamiliar with the subject won't see that the protagonist was doomed all along. But this excuse doesn't excuse anything. Isn't the struggle between Good and Evil the whole point of drama, and isn't it much more interesting when Good is given a chance to rise from the ashes and kick Evil's butt?

Enough introductions. Ladies and gentlemen, Proper Fixation is proud to present you The Adventures Of The Bastard, Part I!

Disclaimer: The story is written in first person; think of The Real Me as a second person. Similarly, the Bastard is a fictional character, though he is based on several different non-fictional, full-fledged real-life bastards. Function names and hexadecimal addresses have been changed to protect the guilty.

We've got this new board, which apparently works (no signs of smoke). Now I'm supposed to run a little app on this board to try it out. Not the full-blown thing, just a small part of it. Um, did I just say "small"? Let me see… 4 megabytes. What the hell is going on? Let's look at the symbol table, shall we? __i5_stdZQostream_put. Mmmm, crunchy. Wait a minute… Pipe through awk, get the symbol names, pipe through sort. Well, what do you know. Apparently one __i5_stdZQostream_put is not enough. No, sir, we need about… I could pipe it through awk again and make a histogram… Oh, screw that. We need about 5, 10, 15… about 50 of those. Groo-vy. C++. Surpassing the safety of C and the memory efficiency of Java. What's the word for "surpass from below"? I don't quite remember. Maybe "suck"? Oh, Useful Application, why are you not written in some other language? I bet you don't know, either. Oh well, enough talking. I should run along now… More accurately, you should run, right here on this board. Where's my JTAG probe?

JTAG probes. Awesome gadgets. You just grab this thing by the pins and shove it right into your board. And then you tell your chip, "HALT!!". And it halts. Well, actually, only the processor halts, and the other pieces of hardware keep happily crunching away. But at least the processor halts. For real. No "background processes", no nothing. Stops right there on the spot. None of this Can-I-pleeease-ptrace-that-process and No-you-have-no-permissions kind of nonsense. When I say HALT, everything HALTS! And then I can put my dirty little hands on each and every frigging bit of this massive, steaming pile of electronic devices. If the bit in question is memory-mapped, of course. Which it isn't, not when you most need it… But at least there's none of the so-called "memory protection" bullshit. Well, actually, we do have some of that. The debuggers are so dumb that if your processor has some kind of a retarded memory management unit and you have the code memory marked as read-only, they can't place breakpoints there. They can't even load code, for that matter. But we can manually overwrite the permission bits, can't we? Yes, we can!

We. Me and my probe. For the life of me, I couldn't do that alone. Oh, I miss my probe so much. Nothing personal, I just need to run the 4 megs on the board, you know. How am I supposed to load it, bit-by-bit with my bare hands, I ask thee? Where's my frigging probe, dammit? Oh, there it is. OK, go ahead and grab $10K worth of hardware from my desk, right by the pins, if you really have to. But while you're at it, why not leave a note where you say goodbye and wish me the best of luck on behalf of the probe, if you know what I mean? All right. Careful next time.

telnet darth. Shut up. I didn't register these stupid probes in the DNS. The Bastard did. We also have a luke, but luke doesn't work with multiple JTAG targets in the same chip, do you, luke? Luke, when gone am I, the last of the Jedi will you be. No answer. Probably because it's unplugged. Then again, there are other reasons. I need a vacation. What does darth have to tell us? Looks like someone has played with the frequency. Let's raise it back. We have a whole lot of __i5_stdZQostream_put to upload. Must be careful though. What was the magic maximal JTAG frequency, 20% of the target clock speed? Not much, considering the blazing 100% speed of this embedded microcrud and its ilk. And JTAG, my friends, transfers bit-by-bit. OK, time to quit whining. We're up and running, and pumping those bits like crazy. Yawn. This is slow. 87%-93%-99%… Finally.

Run, run, run, CRASH. Awesome. It's a good thing we've got ourselves a Graphical Debugger. It's also a good thing that I can't buy weapons without a license where I live. Or else you'd witness Graphical Violence right now. Because the stack is quite expectedly smashed to little bits, and the global variables, my only hope to make any sense of this, are defined in C++ namespaces. And the namespace support in our Graphical Debugger is really the way to go. If you want to go postal, that is. Basically, it boils down to three options:

1. view Stupid::g_moronic doesn't work, but double-clicking at the point of definition does.
2. view Stupid::g_moronic does work, but double-clicking at the point of definition doesn't.
3. view Stupid::g_moronic doesn't work, and neither does the double-clicking.

If you wonder how the particular sort of wrong behavior is selected by the bleeding-edge tool, the only thing I can tell you is that you've got company. I wonder how it chooses the way to screw me, too. Well, actually, I stopped wondering long ago, and what I do is I try to click, no it's not option 1, I try to view, no it's not option 2, I say "FUCK!!" because it's option 3, the one where nothing works, so I filter nm through grep g_moronic, find the address and do a view *(MoronicClass*)0x70066584, et voila! The pile'o'crap is visualized in all its glory right in front of our aching, grepping eyes. You know what I like about global variables? Even C++ can't completely fuck them up; you still find them. And you know what I don't like? I don't like the value of _area here, 'cause that's an impossible value. Rectangles don't have negative areas, unless they are imaginary rectangles. Why do you imagine rectangles, you fatty 4-meg piece of garbage? Why not run properly instead? Maybe some running could help you get back into shape, lose a couple of those kilobytes or something.

Who crapped all over my _area? Let's browse the lvalue references, which takes a right click and a menu item selection. This debugger isn't at all bad, I'm telling ya. If it wasn't for those cplusplusisms in the code… Here's my lvalue ref list. Click, click, click. All places look sensible, as far as these things go. Unless of course when width is multiplied by height over here, one of them is negative. I could keep backtracking, but I've clicked enough. Let's add some good old-fashioned printf statements.

Compiiiiile. Liiiiink. Ruuuuun. Yawwwwwn. Nothing like the C++, NFS and JTAG trinity to put me into this caaaaalm mood. Wait, what's that?.. SHIT! All that forced meditation for nothing. It crashes differently this time. OK, pull yourself together. It's no big deal. Been there many times. Shit happens. You know, uninitialized values which are uninitialized differently every time, race conditions and who knows what else. Let's print some more stuff.

Edit-compile-link-run-yawn-SHIT. Edit-compile-link-run-yawn-SHIT. Edit-compile-link-run-yawn-SHIT. Crash, crash, crash. It prints nonsense alongside fairly reasonable stuff, and crashes. Wait a second. Crashes? At an instruction comparing two registers? You can't crash just by crunching registers. You crash on memory access. And on divide by zero, if your processor can divide, which mine can't. If you want to crash, you better get yourself some bad pointers, or, if you can't find any, you'll need outright illegal instructions. Illegal instructions?! What's that smell in the air? Fire a memory view. View address: $pc. Compare the bits to the hexadecimal from the interlaced assembly listing. Guess what. They're different. And the stuff in the memory view doesn't disassemble very well, either. Somebody crapped over my instructions. Congratulations, somebody! I wonder who you are. Can't wait to find out.

Add a hardware watchpoint and run again. I sure hope it's the CPU and not a DMA controller or something. I sure hope it will be at the same address it was the last time. Bingo! And where is $pc? 0x97000176. No symbols. Of course – it points right into the flash. Bastardland!

What was that address where the Bastard's code defecates? Right, 0x76ffff56. Must be the location of one of his precious variables. You see, the Bastard wrote the boot loader for this board. A boot loader is what you burn to the flash, and what it does is it loads applications burnt somewhere else in the flash, and it handles interrupts. Because when you interrupt this processor, it jumps somewhere near the address where it starts running on power up. So the basic interrupt handling code naturally lands in the boot loader. And the boot loader happened to equally naturally land in the Bastard's lap. It is thus the Bastard's job to set up the stack pointer for the interrupt handling functions. And if there's one thing you can count on the Bastard to do, it's setting things up. He'll set them up, all right.

Keep shoveling through the assembly code. The Bastard clearly likes NOPs. Like, say, 20 in a row. Here and there and over there. Macros, he does not like. Repetition is at the core of learning, among other things. If we need NOPs, well, we'll use NOPs. NOP NOP NOP. And then hex hex hex. Oh, the Bastard loves hexadecimal. 0x77000000. And where does this go? Right here into $sp. The stack pointer for the interrupt mode. The stack then grows downwards, crapping over the instructions at 0x76fff… Bastard.
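The arithmetic behind that last observation, as a sketch. The two addresses are the ones quoted in the story (which, per the disclaimer, have been changed anyway); the stack depths of 128 and 256 bytes are assumptions picked purely for illustration, as are the names.

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in the story.
constexpr std::uint32_t kInterruptStackTop = 0x77000000u;  // loaded into $sp
constexpr std::uint32_t kClobberedAddress  = 0x76ffff56u;  // watchpoint hit

// A descending stack of the given depth occupies [top - depth, top).
constexpr bool stack_touches(std::uint32_t top, std::uint32_t depth,
                             std::uint32_t addr) {
    return addr < top && addr >= top - depth;
}

void check_black_hole() {
    // The clobbered instruction sits 0xAA = 170 bytes below the stack top.
    assert(kInterruptStackTop - kClobberedAddress == 170);
    // A handler that uses 128 bytes of stack misses it...
    assert(!stack_touches(kInterruptStackTop, 128, kClobberedAddress));
    // ...and one that uses 256 bytes lands right on it.
    assert(stack_touches(kInterruptStackTop, 256, kClobberedAddress));
}
```

In other words, the code survives exactly as long as the interrupt handlers stay shallow, which is the kind of guarantee you get from the Bastard.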

When you decide to carve out a chunk of memory to play with, it would at least be nice to put it in the linker script, so that code isn't allocated to the spot where interrupts are serviced. Better yet, why not have the app tell the boot loader where it likes its interrupt stack? It does provide an interrupt handler, by a clever method devised by the very same Bastard. Why is the stack pointer treated differently from the function pointer? Simple: the Bastard actually needed to have the boot loader call his particular function. Convincing the boot loader to jump to wherever the linker had put that function was easier than convincing the linker to put it where the boot loader jumps. The stack location? The Bastard couldn't care less where that was; he just stashed it somewhere past the end of the image of his tiny test program.

See, the Bastard is very goal-oriented and only implements the critical features. Translation: the Bastard only implements what the Bastard needs to report Success to whoever signs his paycheck. The relativity of Success is another matter, and one that the Bastard doesn't give a flying fuck about. The Bastard is like a QA guy, reversed. I once heard a QA guy say to a developer, "Where you people fail, I can make a living". And with the Bastard, it's "Where I can make a living, you people will fail".

And a quiet Bastard he is. Has anybody ever heard of 0x77000000? No, sir. The Bastard doesn't make much fuss about his mission-critical work. Well, better stick to that spirit, and write an e-mail to someone else. "We need the boot loader to…" Yeah, right. Maybe it will happen in a month or six. The thing is already burnt to quite some flash cards. Firmware. Firmware is not software. You don't just fix it. You beg the Bastard to fix it.

And in the meanwhile, let's make a nice big hole in the linker script, right there at the interrupt stack location. This section goes below the hole, this section goes above. Oops, that doesn't link; the variables won't fit between the end of the hole and the end of the physical memory. OK, then this section goes below, and that section goes above. All right! The pseudo-small 4M app runs just like a tiger with his rear end on fire. Until someone adds one function too much. Or one variable too much. Must not break the delicate balance between the above-hole sections and the below-hole sections. Watch out for The Bastard's Black Hole, folks!

Why don't we have a word for it?

I'm going to talk about low-level vs. high-level. Again. It's complicated. Lots of angles. I need to warm my brain up. For a warm-up, let's start with an important theorem I've recently discovered.

yosefk's Semantical Decay Theorem: all useful terms which are not completely neutral become meaningless. Think about the definition of anything political, like "nationalism" or "democracy". For most people, it's little more than a synonym for "good" or "bad". Just an emotion; the meaning is long gone. Clarification: by "most people", I mean literally "the majority of the population". You know, the guys who decide who runs your government. I'm afraid "the majority of the people you hang out with" is a somewhat skewed sample.

And it's not just politics. Let's take a minute to browse through The Modern Software Industry Dictionary.

Object-Oriented

1. Good. Examples: "But is it an Object-Oriented language?", "Ewww, what an ugly, non-Object-Oriented interface!" Translation: "I don't know why I want something to be OO, but I want it, and I want it bad, 'cause everybody does. All the pros. I'm a clueless programmer thinking of myself as a pro. Maybe I'll get a clue some day, but don't put your money on it".

2. Bad. Example: "Just wonderful! What a huge pile of crud. 'Object-oriented' and all. These people have to make everything complicated." Translation: "Unlike my 'we-are-pros-we-like-OO' colleagues, I can't even understand what happens when you override a method. Where's my spaghetti dish?"

Real Time Embedded Software

1. Goodness. Examples: "In embedded applications, we don't need all those helpers and layers and interfaces." "In Real Time Software, you never have such features." Translation: "Yes, I like to hard-code M copies of N hexadecimal values into a K-screen do-everything function, for very large values of M, N and K. And there's nothing you can do about it, 'cause without me, you can't even boot the board. I'll do as I damn please."

2. Badness. Example: "Oh, right, that would be too sloooow for a so-called embedded device. Damn, why can't I use a normal CPU?" Translation: "My code is slow, because I'm too lazy to find out why and speed it up. My code also happens to be quite useless, because I'm too lazy to figure out how to make it useful. But you won't see how useless it is, because your hardware is only as fast as 2 or 3 Cray-1s, so it can't run my code."

Design

1. Goodness. Example: "We'll rewrite this code, and this time, it will be Designed properly." Translation: "I'm a Software Architect. I know Design Patterns. Just look at this code. No Structure to it. It's time for someone like me to take over this project. Look, a loop! Calls for the Iterator Pattern. Ummm… Wanky wanky! Could you please close the door? It kinda feels awkward in public. Please don't interrupt my work."

2. Badness. Example: "Yeah, they want to 'design' it. Talking and writing and writing and talking. Instead of getting work done." Translation: "I write crap. We ship crap here, in case you didn't notice. Crap! Ever saw crap? Now get out of my way or I'll crap all over you!"

I don't know what's attached to "OO" or "RT/Embedded" or "Design" in your brain. Maybe it's something sensible. Maybe you know that there are different object systems and why they are useful or useless in your context and how to get by without an object system at all. Maybe you think that "embedded" means that a digital processor is designed to be a part of a specific system and run a specific application as opposed to being a general-purpose delivery platform. Or maybe you think that "embedded" means "low-end" or "small" the way "embedded" cross-compilers targeted at Windows CE seem to interpret the term. Maybe these words mean something to you.

I'm not going to discuss whether you attach the right or the wrong meaning to the words. I'm saying that attaching any meaning to them at all makes you a part of a minority. Whenever I hear "OO" or "RT/Embedded" or "Design", I know there can be only 3 options, listed in the decreasing order of probability:

1. The speaker is clueless or dumb or evil or all three combined.
2. The speaker managed to not notice that most frequently, the words are used by the people from option 1.
3. The speaker noticed it, but decided to heroically ignore it, because the word has a meaning, damn it, and I'll still use it, risking being classified as an option 1 guy, because I'm a Linguistic Hero.

All 3 options are suboptimal. Which is why I try to avoid these words altogether. It isn't very hard, because the concepts are fairly "thin" – you can simply fall back to first principles below them. Instead of "that's not Object-Oriented", you can say "What if we need another type of query? Why not have a class Query with a handle() method instead of the switch?" Instead of "that's not suitable for Real Time Software", you can say "a dictionary lookup would be slow; why not keep a pointer to the value inside the key object?". Instead of "that code has poor Design", you can say "I'm going to rewrite that code, because it's one big mess. What's that? 'It works?' Well, if you're under that impression, I'll be happy to transfer my responsibility for its maintenance to you. Oh, suddenly it doesn't look like 'working code' anymore, does it?"
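For the record, the first of those first-principles suggestions could look something like the sketch below. `Query` and `handle()` come from the sentence above; `CountQuery`, `ListQuery`, `run_all` and the strings they return are made up to have something to run.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// One class per kind of query, instead of one switch over a type tag.
class Query {
public:
    virtual ~Query() = default;
    virtual std::string handle() const = 0;
};

class CountQuery : public Query {
public:
    std::string handle() const override { return "count"; }
};

class ListQuery : public Query {
public:
    std::string handle() const override { return "list"; }
};

// The caller never asks what kind of query it has; a new query type
// is a new class, and this function stays as it is.
std::vector<std::string> run_all(const std::vector<std::unique_ptr<Query>>& qs) {
    std::vector<std::string> results;
    for (const auto& q : qs) results.push_back(q->handle());
    return results;
}

void check_queries() {
    std::vector<std::unique_ptr<Query>> qs;
    qs.push_back(std::make_unique<CountQuery>());
    qs.push_back(std::make_unique<ListQuery>());
    const std::vector<std::string> results = run_all(qs);
    assert(results.size() == 2);
    assert(results[0] == "count" && results[1] == "list");
}
```

Not once did anyone have to say "Object-Oriented".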

And so on. I don't find myself needing a special word most of the time. Except for one case: "high-level" and "low-level". Maybe it's because I don't quite understand the underlying first principles, but for whatever reason, I really need a shortcut for that. Sadly but expectedly, "high-level" made it into The Modern Software Industry Dictionary long ago. I planned to look it up and share the definition with you, but it turned out to be really long, with several different ways of equating "high-level" with "good" or "bad". So I decided to spare it and go straight to the definition found in yosefk's Small Dictionary of Special Words He Still Can't Avoid:

special adj.

An attribute of formal interfaces, such as programming languages, operating systems, computer programs and libraries, and mathematical notation. An interface A has a higher level of abstraction than an interface B with respect to a concept C if the following holds: C can be expressed in terms of both A and B, but A comes with a special word for C.

Why do I need a special word for the property of having special words so badly? I'm not sure yet. Lots of angles. Complicated. Perhaps examples will help.

Example 1: C and C++ don't have reflection. This means that you can't take a pointer to an object and serialize it, because you don't know the size and the layout of the object pointed to by the pointer.

Example 2: C and C++ don't have arrays. When you see a pointer, you don't know how many adjacent objects are pointed to by it. This means that if you parse your compiler-specific debug information format, creating non-portable reflection, you still can't serialize arbitrary objects. Because if an object has a member pointer, it can really be a member array of unknown size. Oh, and it can point to a dead object. It can also point to a Segmentation fault or an Access violation, depending on your target OS. All these are non-problems in Java, which has reflection, arrays and garbage collection.
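A small sketch of the "you see a pointer, you don't know how many objects" problem. The function names are made up. The compiler does know the size of a locally declared array, as the first assertion shows; the information is dropped the moment the array is passed around as a pointer, which is how it gets passed around.

```cpp
#include <cassert>
#include <cstddef>

// Inside this function `values` is just an address: sizeof reports the size
// of the pointer, and nothing says how many ints follow it.
std::size_t size_seen_by_callee(const int* values) {
    return sizeof(values);
}

// The usual workaround: pass the element count alongside the pointer
// and hope the caller got it right.
long sum(const int* values, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) total += values[i];
    return total;
}

void check_pointer_vs_array() {
    int data[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    // At the point of definition the array's size is still known...
    assert(sizeof(data) == 8 * sizeof(int));
    // ...but it is gone as soon as the array decays to a pointer.
    assert(size_seen_by_callee(data) == sizeof(int*));
    assert(sum(data, sizeof(data) / sizeof(data[0])) == 36);
}
```

A serializer handed `values` has nothing but the address to go on, debug information or not.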

Example 3: MS Windows has buttons and menus and scroll bars and stuff. The X Window System doesn't have any of that. Instead, it comes equipped with design principles, such as "If a problem is not completely understood, it is probably best to provide no solution at all" and "If you can get 90 percent of the desired effect for 10 percent of the work, use the simpler solution." This means that on Windows, you can change the way widgets look throughout the system, and with X11, you can't.

Example 4: MS Windows defines a format for keeping the text & layout of dialogs, menus and the like. This crud is called "resources" and comes with a "resource compiler", translating an ad-hoc human-editable format into the binary machine-friendly format, and a set of clunky APIs for manipulating the binary format. This means that you can create an editor for translating the text on the buttons into a different language (and replace icons having text on them and flip dialog layouts for right-to-left locales). You can't do this to a program based on the knowledge that it uses X11 for the GUI. Of course you can only do this to a Win32 program if it was designed by drones who stuck to the ad hoc format by Micro$oft. If the program was written by Real People who threw together their own ad hoc format, you're helpless.

Example 5: C doesn't have garbage collection, but at least it comes with malloc and free (unlike COBOL, for example). You can and will forget to call free on a malloced chunk, but it is possible to build a platform-specific tool for tracing calls to malloc and free, and listing the call stacks at which leaked blocks were allocated. Of course nothing prevents people from reserving a large chunk of bytes and rolling their own memory management in this chunk. Using a plethora of custom "memory pools" is a C++ weenie's favorite way to defeat the tools for memory leak detection.
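A toy model of such a tracing tool, sketched in Python under loud assumptions: TracingAllocator, its malloc and free methods, and the two caller functions are all hypothetical stand-ins, and real tools hook the C allocator itself rather than a Python class.

```python
import traceback


class TracingAllocator:
    """Toy stand-in for a malloc/free tracer; all names here are hypothetical."""

    def __init__(self):
        self._next_handle = 1
        self._live = {}  # handle -> (size, call stack at allocation time)

    def malloc(self, size):
        handle = self._next_handle
        self._next_handle += 1
        # Drop the last frame (this method) so the stack ends at the caller.
        stack = traceback.extract_stack()[:-1]
        self._live[handle] = (size, stack)
        return handle

    def free(self, handle):
        del self._live[handle]

    def leaks(self):
        """Return (size, name of the allocating function) for each unfreed block."""
        return [(size, stack[-1].name) for size, stack in self._live.values()]


heap = TracingAllocator()


def load_config():
    return heap.malloc(64)  # never freed below


def scratch_work():
    block = heap.malloc(1024)
    heap.free(block)


load_config()
scratch_work()
leaked = heap.leaks()  # one entry, pointing at load_config
```

A custom memory pool that grabs one big block up front would show up here as a single allocation, which is exactly how it hides its internal leaks from this sort of tool.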

Example 6: Did I mention that C doesn't have arrays? Well, it doesn't. Suppose you have a function working with int* a and int* b, supposedly pointing to two arrays of length n. Of course they can really point to the same array, and the compiler has to make sure the code works in this case. And that would be the only problem if a and b were arrays. But they aren't. They are pointers. So a can point to b+1 or b-2 or wherever it damn pleases. This gets in the way when compilers optimize code as simple as for(i=0;i<n;++i) a[i]=b[i]*2; because, say, loading 4 values from b before storing values into a changes the semantics. This is known as the pointer aliasing problem and is a major pain in the ass for optimizing compilers; years and years passed before restrict was standardized, few people use it, and almost nobody knows what it really means.
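The semantic change is easy to reproduce in a small simulation. The sketch below is Python, not C: a flat list plays the role of memory, the integers a and b play the role of pointers, and both function names are made up for the illustration.

```python
def scale_one_at_a_time(mem, a, b, n):
    """Mimics the C loop: a[i] = b[i] * 2, one element per iteration."""
    for i in range(n):
        mem[a + i] = mem[b + i] * 2


def scale_in_chunks(mem, a, b, n, width=4):
    """Mimics a vectorized loop: load `width` values from b, then store them."""
    for start in range(0, n, width):
        end = min(start + width, n)
        loaded = [mem[b + i] for i in range(start, end)]  # all loads first
        for k, value in enumerate(loaded):
            mem[a + start + k] = value * 2


def run(fn, a, b, n=4):
    mem = [1] * 8
    fn(mem, a, b, n)
    return mem


# Disjoint ranges: both versions agree.
disjoint_same = run(scale_one_at_a_time, 4, 0) == run(scale_in_chunks, 4, 0)

# a == b + 1: each store feeds the next load in the scalar loop only.
overlap_scalar = run(scale_one_at_a_time, 1, 0)
overlap_chunked = run(scale_in_chunks, 1, 0)
```

With a pointing at b+1, the one-at-a-time loop keeps doubling the value it has just written, while the chunked loop doubles the original values. A compiler that can't rule out the overlap has to keep the slow version.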

Example 7: HTML has a concept of "document content". That's why if you have a WordPress blog, you can change your broken "theme" or "skin" or whatever they're called to some other broken theme or skin, and all your precious old content will get a spiffy new look. I think that MS Word doesn't have a user-visible high-level concept of "document content". The closest thing to it is probably the support for clipboard formats such as plain text or HTML. The latter isn't very appetizing. Much as I hate the WordPress entry composition window, I dropped the idea of copying from Word and pasting to WordPress, because there's too much crud to clean up.

My way to summarize this is as follows. In natural language, I find myself trying to avoid "special words" because they tend to be used for so many different things, so nobody can tell which one of those things you meant. And using "lower-level" sentences instead of "higher-level" words isn't much of a problem. People are awfully good at taking my low-level text and "raising" it back to the original high-level concept. But with formal machine-readable languages, it's the exact opposite, because machines are awesome at lowering and they suck at raising. Machines don't have a problem figuring out what a Java array or a Win32 resource or a C call to malloc means. You never get to a point when a machine tells its fellow machine, "I've heard programs say 'malloc' so many times in so many different contexts that I just don't know what it means anymore". On the other hand, try building a program reading your code and spotting all places where custom memory management occurs and instrumenting these spots to monitor leaks at run time. Machines can't do that, but people can ("oh, look, char g_pool[maxturd]!").

People are stunningly good at raising things to a higher level. I think this is one reason people believe in X11-style design principles ("provide low-level mechanisms, leave high-level policies to users"). There's another reason to favor this approach – it results in less work. This reason is perfectly legitimate – you only have that much time. After all, if you have infinite resources, everything becomes a no-brainer (you can compute the optimal chess move in every position, for example, so the game becomes rather boring). Basically, the only thing "smart" means is "capable of sensibly utilizing finite resources". So if I lack resources for defining and implementing a higher-level interface, and I cop out, don't call me names until you can squeeze that work into a budget like mine.

But going low-level kills metaprogramming, so treating it as a value instead of a necessary compromise is, I think, wildly wrong. Metaprogramming can be considered worthless from the user perspective. There are actual pseudo-business-oriented people out there who label things as economically worthless because "users don't see them". These people deserve a separate discussion (and a spanking, but we don't do physical violence here; stay tuned for some R-rated virtual violence on the subject). The whole point of the examples above was to show how metaprogramming, made possible by high-level interfaces, enables implementation of user-visible features or detection of user-visible bugs. The funniest thing on the list of features is performance (definitely user-visible in a variety of interesting cases). Everybody knows that "higher-level is slower", for many values of "everybody". However, program optimization is just one kind of metaprogramming; what makes metaprogramming hard, makes optimization hard. The lack of arrays in C and C++ gets in the way when you serialize just like it gets in the way when compilers optimize. If you're going to deal with arrays, you better have a special word for it in your language.

Does this mean that "high-level" is, at the bottom line, a synonym for "good", and should rightfully be listed in The Modern Software Industry Dictionary as such? I wish I could say yes. I think that high-level/low-level is an extremely complicated question. Angles and all. I think I'll spill out some more half-baked thoughts on the subject. I also think that choosing the right level of abstraction is The Grand Challenge when you design something. Oops. Um, forget that I used that word. I meant "when you define something". This word seems to still have a meaning. I wonder how long it will last.

Teeth marks at the rear end

Today we're going to discuss Good Things. Good Things, says I, are the ones that don't bite your butt. A Goodness benchmark can thus be devised as follows: if the user of a Thing has few teeth marks at his rear end, the thing is Good. Now relax and bear with me while I approach the topic sloooowly, but suuuurely.

Like most programmers, I suffer, by nature, from Environmental Laziness. I don't like to switch environments and then spend time learning new ways to do old things. This means that my natural tendency is to stick to one OS, one programming language, one text editor, etc. I realized at some point that this approach sucks, and consciously try to fight it. But I know that if I just let myself be, I'm as Environmentally Lazy as it gets, and no matter what I do, it shows through.

There's a rather small, but colorful group of Environmentalist programmers who love new environments. This group is colorful because each of them runs a different, unimaginably obscure combination of software. For example, I sit next to a world-class Environmentalist at work, and I asked him once, noticing the mosaic of windows on his screen, "Why are you simultaneously launching 12 Eclipses?" (He used Eclipse as a front-end to some weird version of gdb talking to some weird board, but that's another matter). He slowly turned to me, his expression conveying his trademark fake hostility towards the disturbingly stupid mortals, and explained: "Because I'm simultaneously running 1 Ion". Ion is a window manager that will carefully replicate the splash screen of Eclipse, and most other programs. Its author will then tell you that it's Eclipse's, and everybody else's, fault, because they are braindead applications that won't follow the ICCCM. Incidentally, the author is also an "environmentalist" in the common sense of the word: he despises cars and uses a bicycle. At -30C, when bicycles break from the cold. Explains a whole lot to me. I come from Russia, and at -30C, they don't send kids to school to not have them freeze their faces off or something. Urine doesn't freeze in the air at -30C, but it gets way too close to that point. Anyway, if a guy runs Ion, you can count on him being a die-hard Environmentalist, and on his display containing colorful content at all times.

I always look for Environmentalists around me because they can give priceless advice when you need to choose what software to use between several alternatives. And then when you need to install and configure that program. And then when you upgrade it and your configuration no longer works. And then when you decide to switch to some other program. Environmentalists are really good at this, and I really suck at this, so together, we make a great team as far as I'm concerned. However, I don't think that Environmentalism correlates with other sorts of programming aptitude. I know an Environmentalist, with Emacs customizations and Linux at home and all, that actually said the following sentence and meant every word: "The program entered an infinite loop and then exited it". On the other hand, Probably The Strongest Hacker I've Seen lives in a gvim with appalling, white-on-yellow string literal highlighting. Probably copied the wrong combination of files from other people. And he lives with that, because he just can't be bothered to fix it, although he's awfully sharp and obviously would be done with it in no time. But then the guy with what looks like 12 Eclipses but is in fact 1 Ion is also awfully sharp. Environmentalism doesn't correlate with anything. To me, an Environmentalist is someone who can help if you need to change your environment, but quite likely will have trouble helping you if you don't need to change it. Because some of them just can't work anywhere outside of their customized-to-death workstations. And that's all I know about an Environmentalist a priori; could be any sort of programmer, really.

I think that Environmental Laziness is a manifestation of a more generic personality trait, which I call Neophobia, The Fear of The New. Today, we know which mushrooms are edible and which aren't, thanks to N+1 heroes of the ancient times: N neophiliacs and 1 neophobiac. The saga of their adventures goes like this. The Gang of N+1 traversed the virgin woods of the planet. Each time they stumbled upon a new sort of mushroom, a neophiliac would happily grab a sample, taste it, swallow and say "ummm, interesting!", genuinely amused. In a couple of hours, he would drop dead, which was methodically recorded by the neophobiac on the team: "poisonous". As you can easily prove by induction, at the end of the journey our gang had the size of N+1-K, where K is the number of poisonous kinds of mushroom on Planet Earth. K turned out to be rather large, which explains the relatively rare occurrence of the Neophilia gene in the modern population. But we should obviously cherish and encourage those scarce leftovers, because who else will we use to tell a poisonous window manager from an edible one?

Anyway, the purpose of this whole introduction was to establish the simple fact: I'm a hopeless, Environmentally Lazy Neophobiac. When I sense unknown taste, my reflexes actually want me to throw up ("hey, it's not my job to taste new kinds of mushrooms, I'm the guy with the notebook!", says the reflex). In the civilized age, this body design has its drawbacks since the new tastes normally introduce themselves at social gatherings, where vomit is most often unwelcome. I'm generally in control of my reflexes, so if you're having a party and you wanted to invite me until you read this, that issue shouldn't bother you (a couple of other issues probably should, but I digress). But at the core, below the level where I look like a toilet-trained civilized primate, I hate new things. Probably genetic. And now that we're through with the introduction, we finally reach the second introduction, the one about programming languages.

Back at the university, I learned C++, Java, Matlab, Scheme and ML. Not entirely unexpectedly, Scheme and ML were taught by arrogant weenies who never bothered to check what happens outside of the Pure Functional world. Their not-entirely-correct statements of the form "you can't do that in C++ or Java" earned them the solid credibility value of 0. As to programming assignments, they of course concentrated on the most practically useful ones. Like implementing a Scheme interpreter in Scheme, or doing type checking according to the rules of the ML type system in ML. So the primary thing that I learned about Scheme and ML was that some dead languages and programming styles are apparently particularly interesting to a particularly uninteresting kind of people. Can't really count it as learning Scheme or ML, even though I successfully finished implementing my useless interpreter and type checker. Took me lots of time to understand what's the deal with all those apparently reasonable people loving Lisp and OCaml.

And Matlab is really a DSL. A great DSL, but you won't use it for fiddling with text and files, for example. So my only readily available options for fiddling with text and files were C++ and Java. Tough choice, because both awfully suck at this sort of thing, but somewhat differently. C++ is verbose, has disgusting string manipulation facilities, compiles slowly and dumps core, while Java is even more verbose; verbosity is a big problem, perhaps I'll (verbosely) talk about that some time. I settled for C++ because that's what everybody else was using around me.

It didn't feel very productive; things which seemed like they should be trivial weren't. So I've been laaaazily looking around for some other way to fiddle with text and files. Shell scripts. Crippled variables, ugly control flow, non-existent or exceptionally painful error handling, and program-specific command line syntax used for most of the "argument passing". Screaming fucking bloody mess, to quote John Lydon. sed. Just a little bit too specialized for my taste. awk. Good when a program fits into a command line. Programs that don't fit into one screen reportedly shocked the authors of the language themselves, and that was at the time when screens were much smaller. Perl. A superset of C, shell, sed and awk. The man pages actually address "cerebral C programmers", "sharp shell programmers", "accustomed awk users" and "seasoned sed programmers", to help them migrate to this new C++, sh++, awk++ or sed++, depending on their background. And my background makes it Cshawksed+=4.

This, folks, is insanity. But wait, we have a local ASIC wizard that happens to be an Environmentalist and a Perl user. He'll help me if I get stuck. And Perl does seem to make normal bread-and-butter stuff easy and cheap enough. Variables, arrays, conditions, loops, functions, arithmetic, that kind of thing. There are actual languages out there that have the guts to make this stuff less than straightforward. Like Tcl. Or Prolog. Try to implement a goddamn word count program in Prolog. Fuck Prolog! Seriously, I read that Erlang was first implemented in Prolog and what this means is that I don't understand anything about anything, because in Prolog, I feel that my hands are tied behind my back, and someone keeps spitting at my face. So anyway, Perl. We have an Environmentalist and we have the bread-and-butter things and we have modules to stuff those things into. All righty then.

So I occasionally used Perl for fiddling with text and files, but all this code was single file programs. I never got to a point where I actually evolve and reuse libraries in Perl. I think I'm not a Perl guy. There are two camps of programmers, the programming-as-definitions camp ("this function returns the number of frames in a movie clip") and the programming-as-actions camp ("this function opens an MPEG file and returns the number of frames listed in its header"). The Action guys tend to like Perl, because it lets them do things easily. The Definition guys tend to sneer at Perl, because they don't understand what all those things mean, and why; too many arbitrary things, if you ask them. Me, I'm a Definition guy, struggling to get in touch with his Action side. Deep, isn't it? Anyway, Perl clearly wasn't for me. I used it, and kept thinking about this strange situation where you have Real Languages, like C++, which make it hard to do things, and then there are Toy Languages, like Perl, which make things easy but of course you can't do Serious Programming in them. You are right. It does sound moronic. I felt the paradox. I think of myself as a person who can't say "the program exited an infinite loop". So I couldn't help seeing the paradox. But I was lazy, so I didn't act on it. Programming languages. C'mon. They're all the same. Loops, arrays, functions. The last thing I want is more new ways of doing the same old things. I know more than enough kinds of edible mushrooms already. Leave me alone.

Then, several things happened. First, I found a critical mass of Python speakers near me. With a couple Environmentalists among them, which made Python an option for me, too, because now I had them to get me out of jams, which was comforting. Second, the "Real Languages are harder to use" paradox became increasingly itchy, as I began realizing that the more you try to plug the holes in C++, the more problems you end up with. Third, a guy who joined me on some project disliked the textual interface to a bunch of C++ code I wrote. Everybody disliked that textual interface; it sucked. Why not wrap your code in Python bindings instead, he asked. Could be a good idea, I said, never tried it though, would you do it? Having the Environmental gene, he agreed. And then it turned out that I need to work a lot against this interface, because I need it for testing hardware, and when you make hardware tests, you make lots of them. The stuff going through this interface grew hairy, so I added a wrapper layer, 2-3 thousand lines of Python (for those who live in the C++ land – that's 10-15K LOC for you). Then I wanted to hook it to some other old C++ code of mine. More bindings… Meanwhile, people around were busy churning out Python code. I found myself looking for a way to produce debug information for a new debugger front-end, written in Python. And do it fast. Think, think, think. OK, I'll simply emit Python code for it to eval… More Python. Python is all over the place. I guess that makes me a Python programmer.

I never learned Python. All these introductions were introducing this single little factoid, which I find totally shocking. I NEVER LEARNED IT. How can it be?! It's a programming language, damn it. Let's take an example of a programming language. Um, I dunno, C++? OK, our example is C++. Can you imagine someone trying to make sense of C++ without learning it, just by reading code, writing code and looking things up on demand? Ha! No, make that "Ha" squared and raised to the power of "get out of here". OK, C++ sucks, we've already got enough of that at yosefk.com. Perl. You can't mentally parse Perl without knowing quite some of it. And even then, only perl can parse Perl. See? You ought to learn programming languages. Read a book or a tutorial or something. Not a very big deal, by the way, except most people are so damn Environmentally Lazy that if I weren't like that myself, it would really bug the hell out of me. You ought to actually learn, and people don't want to learn, and that's why they prefer to use fewer languages and ignore the itching. OK, so the strings are white-on-yellow. So the error messages occupy the better part of the screen. To quote John Lydon again, "I'm a Lazy Sod! I'm so Lazy! Can't even be BOTHERED!!"

But you see, we aren't necessarily talking about Bad Laziness here, and no, programming languages don't have to require that much learning. At this point, I am supposed to write lots of explanations, which I will optimize out by linking to the book User Interface Design For Programmers by Joel Spolsky. I mention it because of two of its virtues: (1) It perfectly makes and explains my next point and (2) Quite likely you've already read it, because everybody reads Joel On Software, so you know what I'm talking about. What I'm talking about is what this book calls the cardinal axiom of user interface design: "A user interface is well-designed when the program behaves exactly how the user thought it would." I won't argue why this makes sense; this is discussed in the beginning of the book, and I couldn't say it better. What I want to argue is a complementary claim: "Every interface is a user interface." The kind of user we're dealing with varies, but whichever kind it is, the last thing we want to do to that user is to surprise him or her or it (some interfaces are for machines; these are normally easier to get right).

In particular, a programmer is a kind of user, and a programming language is a kind of user interface. So is a function or a class or a library. If the programming language or a library doesn't do what actual programmers out there think it would do, then it sucks. If it goes as far as silently doing the wrong thing instead of telling the user about the mistake in a clear way pointing to the exact source of the problem (I'm talking to you, C++ template weenies), then, my friends, it totally and uncompromisingly sucks. Period.

Some people never think about it this way. Others fail to get it. Most of them are selfish, no-good egomaniacs who always prefer to inflict arbitrary amounts of pain on their users in order to save themselves a tiny bit of work and/or implement their religious beliefs about the essence of engineering aesthetics, the money of their employer be damned. In the truly advanced cases, their aesthetics actually boils down to "it's beautiful if I don't have to do anything". Once their instincts and their religion converge to this single point, these people enter a state of perfect internal harmony which can only be disturbed by a loaded shotgun. Well, if you are one of those people, stop reading now and go away. See? We don't welcome selfish, no-good egomaniacs here. Now that we've only got the caring-about-their-users kind of people with us, the situation is intimate enough for me to show you the teeth marks which Python has left on my butt.

You see, nothing is perfect. And Python certainly is no exception. I'm no big fan of Python. In particular, like all other "modern" "dynamic" languages, Python runs awfully slowly, and for no good reason, most of the time, and it won't be fixed unless they change the language spec. I can say a lot more. But it bites your butt quite rarely. Isn't that nice? I came to appreciate it. I think that when people say that X is designed in a "tasteful" way, the major thing they are telling you is simply "X isn't into butt biting". But you can't always win; either you're going to invest time into learning and memorizing the rules, or you will guess them wrong, earning yourself a teeth mark.

For example, it took me a while until I realized that Python was, in fact, evaluating my definitions, one after another, like there's no tomorrow. Without any kind of lookahead. Evaluation is neither compilation nor execution. You see, in C, you can't use a function before declaring it, because it compiles everything in one pass. In Java, you can use a function anywhere in the code, because first of all, it figures out which functions are defined, and only then issues errors about undefined this and unknown that (C++ does it, too, in some places, but not others). In Python, you can call a function you haven't declared yet, so I figured it was pretty much like Java. I didn't really think about it; let's just go with the flow, I said to myself. Lazy of me, isn't it? I actually wanted that teeth mark; it was an experiment: how lazy can you get, and how much will it hurt?

Well, what Python really does is it doesn't care if something is defined in a namespace until it absolutely has to know; but it doesn't bother to look ahead. When you define a function f that calls a function g, Python only needs to compile code, not run it. OK, let's compile a call to some "g" thing, maybe it's defined somewhere already, maybe not yet, what do I care? But if you call g before it's defined instead of defining a function calling g at that same point, Python will fail, because now you're really asking it to find g, right there on the spot, and it didn't see anything like that yet, and it can't proceed, because you ordered it to do something now. Makes perfect sense for a dynamic language, makes no sense for a static language (people use a bogus distinction "interpreted"/"compiled" in this context, but they shouldn't, because it's bogus; it refers to the implementation when you're really talking about binding rules). I came mostly from static languages; I didn't expect this kind of thing. Mac looks clunky to a novice user coming from a Windows background, as the example from Joel Spolsky's UI book shows. Bite! Ouch!
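Here's the difference in a few lines, as a minimal sketch; f and g are just the placeholder names from the paragraph above, and the early and late variables exist only to record what happened.

```python
def f():
    # Compiling this body doesn't require g to exist;
    # the name is looked up only when f actually runs.
    return g() + 1


try:
    f()  # g isn't defined yet, and now Python really has to find it
    early = "worked"
except NameError:
    early = "NameError"


def g():
    return 41


late = f()  # same f, but by now g exists
```

Defining f before g is fine; calling f before g exists is what fails.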

Wait, where's the actual bite? He promised to show us some ass, with teeth marks and all! Where's our metaphorical software weenie porn? OK, here it goes. Consider the definition def f(arg={}): ... This defines a function f with an argument arg, and its default value is an empty dictionary. Now, suppose you call f with the default argument, and then f puts some values into the dictionary. Next time you call it without explicitly passing an argument, the dictionary will no longer be empty. What?! I wanted arg's default argument to be an empty dictionary! But it's not what Python did; arg's default argument is in fact the empty dictionary, the particular object created when Python evaluated the expression "{}". No, it doesn't reevaluate it each time when you call f, and this is consistent with the way Python normally behaves: reads and evaluates your code right when it bumps into it. REPL. Read-eval-print loop. Totally obvious, unless you come from static languages. Well, I got used to this The Empty Dictionary business, and even used it as a "feature" in throwaway contexts (you can fake C-like static variables by adding an undocumented argument with a default value to your function; I was then shocked to see Real Pythonistas use this Industrial Strength Technique, too).
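A minimal sketch of the bite, next to the usual workaround. The function names remember and remember_fresh are invented for the example; the None-as-default idiom is a common convention, not something the paragraph above prescribes.

```python
def remember(item, seen={}):
    # The {} above was evaluated once, when the def statement ran,
    # so every call without `seen` shares that one dictionary.
    seen[item] = True
    return len(seen)


def remember_fresh(item, seen=None):
    # Workaround: create the dictionary inside the call instead.
    if seen is None:
        seen = {}
    seen[item] = True
    return len(seen)


shared_counts = [remember("a"), remember("b"), remember("c")]
fresh_counts = [remember_fresh("a"), remember_fresh("b"), remember_fresh("c")]
```

The first list keeps growing because it's the same dictionary every time; the second stays at one element per call. The growing version is also the fake "static variable" trick.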

Right near the scar from the {}-shaped bite, there's the []-shaped teeth mark. This one's very famous: a=[[]]*5. Looks like a list of 5 empty lists, but it isn't. [] is evaluated once to become an empty list, and then you get a list with 5 pointers to that empty list. Now, when you insert an element to a[0], you'll find out that it was also added to a[1]…a[4]; that will be surprising. REPL! Bite! Ouch! I really wanted [[] for x in range(5)]; in list comprehensions, the expression is reevaluated many times, because otherwise you wouldn't be able to create different elements, which is the whole point of list comprehensions. And in fact I wanted different elements, and not the same empty list pointed to from 5 places, so list comprehensions are the way to go here. My fault. I've been lazy. See, I can admit it when it's my fault. Sometimes. Not if your fucking thing keeps biting me over and over and over again though. Then it's your fault. Stop arguing, you clueless weenie. I'm The User! "User". Ever heard of users? These are the guys who use your stuff. It's like "customers". Probably heard about those ones, didn't ya? Shut up and give me some respect!
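Both spellings side by side, as a small sketch; the variable names are arbitrary.

```python
shared = [[]] * 5            # one empty list, referenced 5 times
shared[0].append("x")

distinct = [[] for _ in range(5)]  # [] is evaluated anew for each element
distinct[0].append("x")

all_same_object = all(row is shared[0] for row in shared)
```

After the append, every slot of shared shows the "x", because there is only one inner list to show; distinct has it in the first slot only.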

OK, let's look at some more ass. Let's see, where else did the deadly snake bite me? Imports. I so don't understand what import actually does that I'm seriously considering looking it up. I don't remember where the actual bite marks are, but it got tricky a couple of times. Name errors 5 minutes after a program started running. I misspell locals and the frigging thing compiles a reference to a global, hoping that it will be defined somewhere, later. Pure stupidity, if you ask me. REPL or no REPL, but anybody who tells me that this behavior is The Right Thing could just as well have said, "Don't bother talking to me about Python – I'm a Python weenie! Weeeeenie! Isn't that cute?!" "But why didn't you test that function before running it in your program that takes a whole 5 minutes to run?" Because writing that test would itself take 5 minutes if I were lucky, Mr. Weenie. And in fact, reasonable Pythonistas acknowledge that yes, it bites them, too, and it would be great if it didn't happen all the time. Oh, and lexical scoping is a great gadget, you know. When I have a variable x, I sure as hell don't want it to get overwritten by the x in [x**2 for x in range(5)]. Is this obvious, or are you Miss Weenie, the sister of our friend the test-driven Python apologist?
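The misspelled-local bite, reduced to a sketch. The function describe and the typo totl are made up; the point is only that the bad name goes unnoticed until the branch containing it actually runs.

```python
def describe(total):
    if total < 0:
        # Typo: "totl" is compiled as a global lookup, so nothing complains yet.
        return "negative: " + str(totl)
    return "fine: " + str(total)


happy_path = describe(5)  # the broken branch never runs, so no error

try:
    describe(-1)
    outcome = "no error"
except NameError as error:
    outcome = "NameError: " + str(error)
```

If the negative case only shows up 5 minutes into a run, that's when you hear about the typo.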

More ass. We want ass! We want ass! Sorry, we're out of ass. I don't remember any other class of ass bite by Python. And I have a memory for that thing; soft skin on my buttocks or something. And I didn't even learn Python; I'm still running the experiment and insist on Not Reading The Fucking Manual, although it's well past time when I should. Compare my results with Python to C++, which I did learn (read two books and a half, took a mandatory course in the university, read stuff on the net and so on). Net result? Bite! Bite! Bite! Bite!! Bite!!! Yeah, yeah, I know that you "don't have any problems with C++". Believe me, I know what you're talking about; I didn't have any problems with C++, either, and you can ask just about everyone who's ever worked with me whether I had problems with C++, and I can guarantee you that each and every person will laugh at that question. I know why you don't have problems. You can no longer see the problem when you have one. Allocate a vector of vectors of bytes. Start pushing elements into it. Core dump. Oops. I kept a reference to the bytes in a previously pushed byte vector and used that reference after pushing another byte vector. Um, I didn't think I'd need that. OK, make it a list of vectors of bytes. Wait, do I need the random access I've just given up? No, I guess I don't. Maybe I'll need it later. Well, then I'll change the element type to some smart pointer other than vector… No problem, I'm in Full and Total Control. Yes, it bites me, but I keep moving. Sloooowly. But Suuuurely. Problems? What problems? Sure, C++ is harder to use than those other languages, but that's because they aren't Real Languages! Makes perfect sense, doesn't it?

I'm telling you, C++ is such an easy target that once you start bashing it, it's hard to drop the habit. But I have to. I have to go back to our subject. Why is Python so easy to not learn, and what should I copy from it if I want my software to be usable?

I don't really have an answer, not an orderly one I can serialize into text anyhow. I'm working on it; I find it fascinating, in particular, because I work on programming languages. They are DSLs, essentially, but relatively fat ones, Turing-complete, with arrays, functions and all. I defined and wrote the interpreted implementation of one small one, and "managed" (talked to and tried to not get in the way of) people who made one big one. I have practical interest in programming language usability, but I'm still thinking about it. I can only say vague stuff right now. For example, it looks like you can take many of the things in the UI-for-programmers book and translate them to the vocabulary of programming languages or libraries and they'll still work. Good book.

One thing I can tell you for sure is this. Making the program act the way someone expected it to is Hard, because a Someone is harder to model than, say, a Something. But making the program complain when it's unhappy is way easier. It's like bathing: if you do it regularly, you'll stay clean without much effort. If you stop bathing for a while, you'll probably run into problems that a single bathing session won't quite cure. But most people don't have first-hand experience with that one, and I doubt that they miss anything worthwhile. Error handling is similar: every time you write a line, you can think about stuff that could go wrong, and insert detailed whining into the program. Not some stupid fucking numeric error code, detailed whining. A stupid fucking numeric code will lead to "rm: Is a directory". I despise software that tells me that something, somewhere, is a directory, and finds that insightful.
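
To make "detailed whining" concrete, here's a minimal Python sketch. Everything in it is made up for illustration: `WhiningError` and `remove_file` are hypothetical names, not anybody's real API, and the wording of the complaints is just my guess at what a user would want to read. The point is only that the message says which path, what was wrong with it, and what you might have meant instead.

```python
import os


class WhiningError(Exception):
    """Hypothetical exception type whose message is written for a human."""


def remove_file(path):
    """Delete a single file; complain in detail instead of returning a code."""
    if not os.path.lexists(path):
        # Say which path, and where we were looking from.
        raise WhiningError(
            f"can't remove {path!r}: no such file "
            f"(current directory is {os.getcwd()!r})"
        )
    if os.path.isdir(path) and not os.path.islink(path):
        # The "Is a directory" case, except it names the directory
        # and suggests what the caller may have meant.
        raise WhiningError(
            f"can't remove {path!r}: it is a directory, and remove_file() "
            f"only removes files; use shutil.rmtree() if you really mean "
            f"to delete the whole tree"
        )
    os.remove(path)
```

Calling `remove_file` on a directory raises a `WhiningError` that names the directory and mentions `shutil.rmtree()`, which is about three lines more typing than returning an error number.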

With Perl, every time you make an error, it tries to do something sensible (convert the string to a number or vice versa, etc.). C++ dumps core or corrupts data. Python throws an exception with a call stack. I wish it wouldn't destroy the original context with all the variable values and stuff, but it's pretty good already. I've tried to take a fairly anal-retentive approach to error handling in my "programming environment" sort of work, with quite elaborate code to make user-defined tests easy and all that, and I think it pays off. It helps people lose fear, and "fearless" is "happy". For example, with Python, I know that if I foul up, most likely it will throw an exception rather than go ahead and commit a horrible atrocity. Helps you loosen up and try things out, and then sleep quietly once you're done. But failing early and informatively is only the next best thing after not failing, and to not fail, you need to make your system behave as expected. Which is somewhat harder than regular bathing. Rest assured that when I know enough about it to convert it to text, you'll find that text here, and one thing it won't be is particularly short.
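
Here's a small Python 3 sketch of the "exception with a call stack" behavior. The functions `add_tax`, `checkout` and `try_checkout` are hypothetical, invented for this example. The bug is a string where a number should be: Python refuses to guess and raises `TypeError`. As a side note, as far as I can tell the traceback object hangs on to the failing frame while you're still holding the exception, so some of the context (the locals) can be fished out if you bother to look; the sketch does that too.

```python
import traceback


def add_tax(price, rate):
    # If price is a str, price * rate raises TypeError; no silent conversion.
    return price + price * rate


def checkout(raw_price):
    return add_tax(raw_price, 0.5)


def try_checkout(raw_price):
    """Return (result, report); report is None when nothing went wrong."""
    try:
        return checkout(raw_price), None
    except TypeError as err:
        # Walk to the innermost traceback entry: the frame that blew up.
        tb = err.__traceback__
        while tb.tb_next is not None:
            tb = tb.tb_next
        report = {
            "trace": traceback.format_exc(),          # the call stack, as text
            "function": tb.tb_frame.f_code.co_name,   # where it failed
            "locals": dict(tb.tb_frame.f_locals),     # variable values there
        }
        return None, report
```

With `try_checkout("10")` the report names `add_tax` as the failing function and shows that `price` was the string `"10"`, which is usually all you need to find the foul-up.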

P.S. I've mentioned the camp of Sadists or plain selfish people who inflict suffering on their users. There is, of course, a corresponding camp of Masochists who like to suffer. You know, people who don't have problems with C++ and the like. The psychology of technical masochism is beyond the scope of this entry; it's sufficient to note that if someone claims to be using tools which are good for pros but bad for newbies or even generally "harder to use" than lesser tools, it's quite likely a symptom of this dangerous disorder. Technical masochists looking for a cure are encouraged to read Chapter 6 of the UI book, "Designing for People Who Have Better Things To Do With Their Lives". Then you can think about things to do with your lives that you find better than licking the scars your favorite tools leave on your rear end. And at least stop calling people who have better things to do with their lives "lazy", without mentioning that this is really a good kind of lazy. Thank you.