AI problems

Animals have a uniform, closed architecture. The human brain is an open platform; people get by using a wide variety of techniques called "professions". This flexibility has its drawbacks. We aren't tuned for any particular profession, and apparently that's why everybody develops some sort of profession-related problem. Proctologists reportedly turn rude as time goes by. Rock stars live fast, but die young. Hunters in the African deserts get to chase antelopes for a couple of days. Antelopes run much faster, so you would never try to chase one; but the hunter knows better – compared to him, the animal lacks stamina, and will get tired, and then he can kill it. But sometimes the hunter has the misfortune of chasing a particularly strong antelope, in which case he still won't be able to get close enough by the end of the second day. But having wasted all that energy, he now certainly has to refuel, so he settles for a nearby half-rotten corpse. The effect of that sort of meal on his digestive system is one problem that comes with his profession.

Programmers develop their own problems. Today, we'll talk about AI problems some of us are having. As you probably already know, but my trademark thoroughness still obliges me to say, AI stands for "Artificial Intelligence" and comes in two flavors, "deterministic" (like minimax) and "statistical" (like SVM). The combined efforts of various researchers led to an important breakthrough in this field, known to meteorologists as "the AI winter". This is the season when you can't get any VC money if you mention AI anywhere in your business plan. During this season, an alternate term was invented for AI, "Machine Learning". I think that the money/no money distinction between "ML" and "AI" isn't the only one, and that in other contexts, AI=deterministic and ML=statistical, but I don't care. In real systems, you do both. Lots of things labeled as "AI" work and are useful in practical contexts. Others are crap. It's always like that, but this isn't what I came to talk about today. By "AI problems", I didn't mean the problems that people face which require the application of methods associated with the term "AI". I meant "problems" in the psychiatric sense.

A certain kind of reader will wonder whether I have the necessary qualifications to deal with a psychiatric issue so advanced. My credentials are humble, but I do work on hairy computer vision applications. The general problem computer vision deals with (identify, classify and track "objects" in real-world scenes) is considered "AI complete" by some, and I tend to agree. I don't actually work on the AI bits – the algorithms are born a level and a half above me; I'm working on the hardware & software that's supposed to run them fast. I did get to see how fairly successful AI stacks up, with different people approaching it differently. Some readers of the credential-sensitive kind will conclude that I still have no right to tackle the deep philosophical bullshit underlying Artificial Intelligence, and others will decide otherwise. Anyway, off we go.

AI problems cover a vast area; we'll only talk about a few major ones. First of all, we'll deal with my favorite issue, which is of course The Psychophysical Problem. There are folks out there who actually think they believe that their mind is software, and that consciousness can be defined as a certain structural property of information processing machines. They don't really believe it, as the ground-breaking yosefk's Mind Expansion Experiment can easily demonstrate. I'll introduce that simple yet powerful experiment in a moment, but first, I want to pay tribute to the best movie of the previous century, which, among other notable achievements, provided the most comprehensive treatment of the psychophysical problem in popular culture. That motion picture is of course The Terminator, part I and, to an extent, part II. World-class drama. Remarkable acting (especially in part I – there are a couple of facial expressions conveying aggressive, hopeless, cowardly and impatient stupidity previously unheard of). Loads of fun.

Back to our topic, the movie features a digital computer with an impressive set of peripheral devices, capable of passing the Turing test. The system is based on Atari hardware, as this guy has figured out from the assembly listings cleverly edited into the sequences depicting the black-and-red "perspective" of the machine. According to the mind-is-software AI weenies, the device from the movie has Real Consciousness. The fascinating question whether this is in fact the case is extensively discussed in the witty dialogs throughout the film. "I sense injuries", says the Atari-powered gadget. "This information could be called pain". Pain. The key to our elusive subject. I'm telling you, these people know their stuff.

The mind-is-software approach is based on two assumptions: the Church-Turing thesis and the feelings-are-information axiom. In my trademark orderly fashion, I'll treat the first assumption second and the second assumption first. To show the invalidity of the feelings-are-information assumption, we'll use yosefk's Mind Expansion Experiment. It has two versions: the right-handed and the left-handed, and it goes like this. If you're right-handed, put a needle in your right hand and start pushing it into your left arm. If you're left-handed, put a needle in your left hand and start pushing it into your right arm. While you're engaged in this entertaining activity, consider the question: "Is this information? How many bits would it take to represent?" Most people will reach enlightenment long before they cause themselves irreversible damage. Critics have pointed out that the method can cause die-hard AI weenies to actually injure themselves; the question whether this is a bug or a feature is still the subject of hot debate in the scientific community. Anyway, we do process something that isn't exactly information, because it fucking hurts; I hope we're done with this issue now.

Some people don't believe the first of the two above-mentioned assumptions, namely, the Church-Turing thesis. Most of these people aren't programmers; they simply lack the experience needed to equate "thinking" and "doing". But once you actually try to implement decision-making as opposed to making the decision yourself, your perspective changes. You usually come to think that in order to decide, you need to move stuff around according to some procedure, which isn't very different from the method of people doing manual labor at low-tech construction sites. Thinking is working; that's why "computational power" is called "power". I've only heard one programmer go "…but maybe there's a different way of thinking from the one based on logic". I couldn't think of any, except for the way based on psychoactive chemicals, maybe. "A different way of thinking". To me, it's like arguing that you can eat without food or kick ass without an ass, and I bet you feel the same way, so let's not waste time on that.

Next problem: some people actually think that a machine will pass the Turing test sooner or later. I wouldn't count on that one. Physicists claim that a bullet can fly out of one's body with the wound closing and healing in the process, because observations indicate that you can get shot and wounded, and if a process is physically possible, that same process reversed in time is also physically possible. It's just that the probability of the reverse process is low. Very low. Not messing with the kind of people who can shoot you is a safer bet than counting on this reversibility business. Similarly, the Church-Turing thesis claims that if a person can do it, a universal computing device can emulate it. It's just the feasibility of this simulation that's the problem. One good way to go about it would be to simulate a human brain in a chip hooked to enough peripherals to walk and talk and then let it develop in the normal human environment (breastfeeding, playing with other kids, love & marriage, that kind of thing). The brain simulation should of course be precise enough, and the other kids should be good kids and not behave like dirty racists when our Turing machine drives into their sand pit. If the experiment is conducted in this clean and unbiased way, we have a good chance of having our pet machine pass the Turing test by the time the other kids are struggling with their IQ tests and other human-oriented benchmarks.

Seriously, human language is so damn human that it hardly means anything to you if you are a Turing-complete alien. To truly understand even the simplest concepts, such as "eat shit" or "fuck off and die", you need to have first-hand experience of operating a human body with all of its elaborate hardware. This doesn't invalidate the Church-Turing thesis in the slightest, but it does mean that automatic translation between languages will always look like automatic translation. Because a human who'd interpret the original that way clearly lives inside a box with flashing lights, a reset button and a ventilator. For similar reasons, a translation by a poorly educated person will always look like a translation by a poorly educated person. I know all about it, because in Israel, there's a million ex-Russians, so they hire people to put Russian subtitles into movies on some channels. Unfortunately, they don't seem to have any prerequisites for the job, which means that I get to read a lot of Russian translations by morons. Loads of fun. These people, equipped with their natural intelligence, barely pass the Turing test, if you ask me, so I keep my hopes low on Turing-test-passing AI.

Moving on to our next problem, we meet the people who think that we actually need AI. We don't. Not if it means "a system that is supposed to scale so that it could pass the Turing test". And this is the only thing AI means as far as I'm concerned here. We already have "artificial intelligence" that isn't at all like our natural intelligence, but still beats our best representatives in chess, finds web pages, navigates by GPS and maps and so on. Computers already work. So the only thing we don't have is artificial intelligence that simulates our own. And this is as tremendously useless as it is infeasible. Natural intelligence as we know it is a property of a person. Who needs an artificial person? If you want to have a relationship, there's 6G of featherless two-legged Turing machines to pick from. If you want a kid to raise, you can make one in a fairly reliable and well-known way. We don't build machines in order to raise them and love them; we build them to get work done.

If the thing is even remotely close to "intelligent", you can no longer issue commands; you must explain yourself and ask for something and then it will misunderstand you. Normal for a person, pretty shitty for a machine. Humans have the sacred right to make mistakes. Machines should be working as designed. And animals are free to mark their territory using their old-fashioned defecation-oriented methodology. That's the way I want my world to look. Maybe you think that we'll be able to give precise commands to intelligent machines. Your typical AI weenie will disagree; I'll mention just one high-profile AI weenie, Douglas Hofstadter of Gödel, Escher, Bach. Real-life attempts at "smart" systems also indicate that with intelligence, commands aren't. The reported atrocities of the DWIM rival those of so precise a command as "rm .* -rf", which is supposed to remove the dot files in the current directory, but really removes more than that, because ".*" also matches "..", so it happily recurses into the parent directory.

Finally, many people think that AIish work is Scientific and Glamorous. They feel that working on AI will get them closer to The Essence of The Mind. I think that 40 years ago, parsing had that vibe. Regular, Context-Free, automatic parser generation, neat stuff, look, we actually know how language works! Yeah, right.

You can build a beautiful AI app, and take your experience with you to the next AI app, but you won't build a Mind that you can then run on the new problem and have it solved. If you succeed, you will have built a software system solving your particular problem. Software is always like that. A customer database front-end isn't a geographical database front-end. Similarly, face recognition software isn't vehicle detection software. Some people feel that mere mortal programmers are biting bits, some obscure boring bits on their way to obsolescence, while AI hackers are hacking the Universe itself. The truth is that AI work is specialized to the obscure constraints of each project to a greater extent than work in most other areas of programming. If you won't take my word for it, listen to David Chapman from the MIT AI Lab. "Unlike most other programmers, AI programmers rarely can borrow code from each other." By the way, he mentions my example, machine vision, as an exception, but most likely, he refers to lower-level code. And why can't we borrow code? "This is partly because AI programs rarely really work." The page is a great read; I recommend pointing and clicking.

As I've promised, this wasn't about AI; it was about AI-related bullshit. And as I've already mentioned, lots of working stuff is spelled with "AI" in it. I've even been thinking about reading an AI book lately to refresh some things and learn some new ones. And then lots of AI-related work is in Lisp. They have taste, you can't take that away.

A writer of the lame kind

I've started to type this biggish draft, and got stuck. I've been writing too much lately, that's why it happened. A tech writer has just sent back the polished version of a spec I wrote in a wiki, so I didn't know how many pages it was. Now I know. 221 pages. "This page was intentionally left blank 'cause we want a cool page count". How did I write all this?

A friend of mine once advised a friend of his thusly: "You ought to study CS basics. Go programming without it, and instead of a programmer, you'll become a lame sort of writer". He referred to the would-be coding as "writing"; he's all like that. No boundary between brains and machines whatsoever in his mind. Coding, writing, same shit. His basic measurement unit, applied to all natural phenomena, is stupidity. Any behavior of a system is a special sort of stupidity. "No, that function doesn't find the right data; it finds its own stupidity". People are stupid. Programs are stupid. Stupidities interact. The world is one big collision of stupidities. Wits are a special quality of stupidity (probably when it's isomorphic to some part of external reality). And so on.

At least in his example, the "writer" would be writing code at work. I haven't done anything interesting in the last, let me see… 2 years?! Shit! I've done coding, but always some kind of no-brainer. Lots of debugging, that's pretty creative when the bug itself is creative enough, but it's not the same. Getting from -N to 0 isn't nearly as interesting as getting from 0 to N, Net Company Productivity aside. What did I code? Glue code, basically. Test code. Mildly interesting smaller parts of things. Mildly interesting small programs for shoveling through data. And what else? Talking, writing, "managing", "advising". What about The Real Thing? When you ride this wave and a Something emerges? The last time that happened was, ahem, 2005 and part of 2006. And this is 2008. Not good.

You see, I had several ideas during that period, but other people got to implement them or those ideas died or went comatose. Because I was doing all this important crap. I was working on something too large for just myself, and other people joined, and I figured I'd do all the no-brainer crap, because who wants to join if they get the crap? The other option is to find victims to do the crappy parts and go for the cool parts. This way, you spend your time doing a better kind of work, but the net result sucks, because the victim will feel like a victim and it always shows. I can sense pain in code. A certain kind of copy-and-paste syndrome is the direct result of the feeling that you don't own something, but are responsible for its problems, and you hate it and are awfully afraid to touch it. "God, why am I doing this?", scream the shadows of the many authors of a huge function with layers piling up for years. So I went right for the crap, sacrificing myself for the sake of the inanimate Big Thing we were all making.

But it's over! That Big Thing is over. Almost. Time to actually do something.

At home, I'd rather code than blog, too. It's just tricky. I hate software. To be passable, software is polished to death. I like code, but doing something releasable at home… And doing something throwaway sounds like a waste of time. I'm looking for something to do, but it's tricky. I recently decided to port valgrind to Windows. The people on the developers mailing list patiently explained that nobody wanted that except me (if you want to instrument Windows binaries, you run wine under valgrind on Linux), and that I was in general out of my mind, 'cause that port would be damn hard. I admire people who released notable home-made software. Maybe they're just way faster than me, maybe more determined, but it's something to admire either way. I don't know what to do in that department at the moment.

So in the meanwhile, I'll be switching from Generalized Writing to coding at work, and at home, I'll keep blogging. Wank wank wank!

Low-level is easy

My previous entry has the amazingly interesting title "Blogging is hard". Gee, what a wuss, says the occasional passer-by. Gotta fix my image, fast. Think, think, think! OK, here's what I'm going to tell you: low-level programming is easy. Compared to higher-level programming, that is. I'm serious.

For starters, here's a story about an amazing developer, who shall remain nameless for the moment. See, I know this amazing developer. Works for Google now. Has about 5 years of client-side web programming under his belt. And the word "nevermore" tattooed all over his body (in the metaphorical sense; I think; I haven't really checked). One time, I decided that I had to understand this nevermore business. "Amazing Developer", I said, "why have you quit the exciting world of web front-ends?" "I don't like it", says the Amazing Developer. "But, but! What about The Second Dot Com Bubble? VC funds will beg you to take their $1M to develop the next Arsebook or what-not. Don't you wanna be rich?" "I really don't like web front-ends very much", he kept insisting. Isn't that strange? How do you explain it? I just kept asking.

Now that I think of it, he probably was a little bit irritated at this point. "Look, pal", he said (in the metaphorical sense; he didn't actually say it like that; he's very polite). "I have a license to drive a 5-ton truck. But I don't want a career in truck driving. Hope this clarifies things". He also said something about your average truck driver being more something than your typical web developer, but I don't remember the exact insult.

Now, I've been working with bare metal hardware for the last 4 years. No OS, no drivers, nothing. And hardware-wise, the environment is harsher than what you get on a desktop. For example, in multi-core desktops, when you modify a cached variable, the cache of the other processor sees the update. In our chips? Forget it. Automatic hardware-level maintenance of memory coherency is pretty fresh news in these markets. And it sucks when you change a variable and then the other processor decides to write back its outdated cache line, overwriting your update. It totally sucks.
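
To make the failure mode concrete, here's a minimal C sketch of sharing a flag between two non-coherent cores; dcache_clean and dcache_invalidate are made-up names standing in for whatever platform-specific cache maintenance calls you actually have, not anything from our code base.

    /* Hypothetical cache-maintenance hooks; every non-coherent platform has
       its own flavors of "clean" (write back) and "invalidate" (discard). */
    extern void dcache_clean(volatile void *addr, unsigned size);
    extern void dcache_invalidate(volatile void *addr, unsigned size);

    volatile int g_ready; /* a flag sitting in shared, cacheable memory */

    void producer(void) /* runs on core A */
    {
        g_ready = 1;
        dcache_clean(&g_ready, sizeof g_ready);      /* push our line out to RAM */
    }

    int consumer(void) /* runs on core B */
    {
        dcache_invalidate(&g_ready, sizeof g_ready); /* drop our possibly stale copy */
        return g_ready;                              /* reload the line from RAM */
    }

    /* The failure mode from the paragraph above: if core B writes back its
       stale line instead of invalidating it, the old 0 in its cache
       overwrites the 1 that core A just wrote. */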

A related data point: I'm a whiner, by nature. Whine whine whine. You've probably found this blog through the C++ FQA, so you already know all about my whining. And it's not like I haven't been burnt by low-level bugs. Oh, I had that. Right before deadlines. Weekends of fun. So how come I don't run away from this stuff screaming and shouting? Heck, I don't mind dealing with bare metal machines for the rest of my career. Well, trying out other stuff instead can be somewhat more interesting, but bare metal beats truck driving, I can tell you that. To be fair, I can't really be sure about that last point – I can't drive. At all. Maybe that explains everything?

I'll tell you how I really explain all of this. No, not right in this next sentence; there's a truckload of reasons (about 5 tons), so it might take some paragraphs. Fasten your seatbelts.

What does "high level" basically mean? The higher your level, the more levels you have below you. This isn't supposed to matter in the slightest: at your level, you are given a set of ways to manipulate the existing lower-level environment, and you build your stuff on top of that. Who cares about the number of levels below? The important thing is, how easily can I build my new stuff? If I mess with volatile pointers and hardware registers and overwrite half the universe upon the slightest error, it sucks. If I can pop up a window using a single function, that's the way I like it. Right? Well, it is roughly so, but there are problems.

Problem 1: the stuff below you is huge at higher levels. In my humble opinion, HTML, CSS, JavaScript, XML and DOM are a lot of things. Lots of spec pages. A CPU is somewhat smaller. You have the assembly commands needed to run C (add/multiply, load/store, branch). You have the assembly commands needed to run some sort of OS (move to/from system co-processor register; I dunno, tens of flavors per mid-range RISC core these days?). And you have the interrupt handling rules (put the reset handler here, put the data abort handler here, the address which caused the exception can be obtained thusly). That's all.
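
To put a size on the C part, here's a toy function and, in the comments, the generic pseudo-assembly it boils down to, one instruction from each of those categories (not any particular core's syntax, just the idea).

    /* A toy illustration of the three instruction categories C needs
       (the comments are generic pseudo-assembly, not a real core's syntax,
       and assume 4-byte ints): */
    int element_plus_one(const int *p, int n)
    {
        return p[n] + 1;
        /*   load   r0, [p + n*4]      ; load/store       */
        /*   add    r0, r0, #1         ; add/multiply     */
        /*   branch back to the caller ; branch           */
    }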

I keep what feels like most of ARM946E-S in my brain; stuff that's still outside of my brain probably never got in my way. It's not a particularly impressive result; for example, the fellow sitting next to me can beat me ("physically and intellectually", as the quote by Muhammad Ali goes; in his case, the latter was a serious exaggeration – he was found too dumb for the US army, but I digress). Anyway, this guy next to me has a MIPS 34Kf under his skull, and he looks like he's having fun. That's somewhat more complicated than ARM9. And I worked on quite a few MFC-based GUIs (ewww) back in the Rich Client days; at no point did I feel like keeping most of MFC or COM or Win32 in my head. I doubt it would fit.

Problem 2: the stuff below you is broken. I've seen hardware bugs; about 1 per 100 low-level software bugs and 1 per 10000 high-level software bugs. I think. I feel. I didn't count. But you probably trust me on this one, anyway. How many problems did you have with hardware compared to OS compared to end-user apps? According to most evidence I got, JavaScript does whatever the hell it wants in each browser. Hardware is not like that. CPUs from the same breed will run the user-level instructions identically or get off the market. Memory-mapped devices following a hardware protocol for talking to the bus will actually follow it, damn it, or get off the market.

Low-level things are likely to work correctly since there's tremendous pressure for them to do so. Because otherwise, all the higher-level stuff will collapse, and everybody will go "AAAAAAAAAA!!" Higher-level things carry less weight. OK, so five web apps are broken by this browser update (or an update to a system library used by the browser or any other part of the pyramid). If your web app broke, your best bet is to fix it, not to wait until the problem is resolved at the level below. The higher your level, the lonelier you become. Not only do you depend on more stuff that can break, there are fewer people who care in each particular case.

Problem 3: you can rarely fix the broken things below your level. Frequently you don't have the source. Oh, the browser is open source? Happy, happy, joy, joy! You can actually dive into the huge code base (working below your normal level of abstraction where you at least know the vocabulary), fix the bug and… And hope everyone upgrades soon enough. Is this always a smart bet? You can have open source hardware, you know. Hardware is written in symbolic languages, with structured flow and everything. The only trouble is, people at home can't recompile their CPUs. Life cycle. It's all about life cycle. Your higher-level thingie wants to be out there in the wild now, and all those other bits and pieces live according to their own schedules. You end up working around the bug at your end. Sometimes preventing the lower-level component author from fixing the bug, since that would break yours and everybody else's workaround. Whoopsie.

How complicated is your workaround going to be? The complexity rises together with your level of abstraction, too. That's because higher-level components process more complicated inputs. Good workarounds are essentially ways to avoid inputs which break the buggy implementation. Bad workarounds are ways to feed inputs which shouldn't result in the right behavior, but do lead to it with the buggy implementation. Good workarounds are better than bad workarounds because bad workarounds break when the implementation is fixed. But either way, you have to constrain or transform the inputs. Primitive sets of inputs are easier to constrain or transform than complicated sets of inputs. Therefore, low-level bugs are easier to work around. QED.

Low-level: "Don't write that register twice in a row; issue a read between the writes". *Grump* stupid hardware. OK, done. Next.

High-level: "Don't do something, um, shit, I don't know what exactly, well, something to interactive OLE rectangle trackers; they will repaint funny". I once worked on an app for editing forms, much like the Visual Studio 6 resource editor. In my final version, the RectTracker would repaint funny, exactly the way it would in Visual Studio 6 in similar cases. I think I understood the exact circumstances back then, but haven't figured out a reasonable way to avoid them. Apparently the people working at that abstraction level at Microsoft couldn't figure it out, either. What's that? Microsoft software is always crap? You're a moron who thinks everything is absolutely trivial to get right because you've never done anything worthwhile in your entire life. Next.

Problem 4: at higher levels, you can't even understand what's going on. With bare metal machines, you just stop the processor, physically (nothing runs), and then read every bit of memory you want. All the data is managed by a single program, so you can display every variable and see the source code of every function. The ultimate example of the fun that is higher-level debugging is a big, slow, hairy shell script. "rm: No match." Who the hell said that, and how am I supposed to find out? It could be ten sub-shells below. Does it even matter? So what if I couldn't remove some files? Wait, but why were they missing – someone thought they should be there? Probably based on the assumption that a program should have generated them previously, so that program is broken. Which program? AAARGH!!

OK, so shell scripts aren't the best example of high-level languages. Or maybe you think they are; have fun. I don't care. I had my share of language wars. This isn't about languages. I want to move on to the next example. No shell scripts. You have JavaScript (language 1) running inside HTML/CSS (languages 2 & 3) under Firefox (written in language 4) under Windows (no source code), talking to a server written in PHP (language 5, one good one) under Linux (you have the source code, but still no way to do symbolic debugging of the kernel). I think it somewhat complicates the debugging process; surely no single debugger will ever be able to make sense of that.

Problem 5: as you climb higher, the number of options grows exponentially. A tree has one root, a few thick branches above it, and thousands of leaves at the top. Bend the root and the tree falls. But leaves, those can grow in whichever direction they like.

Linkers are so low-level that they're practically portable, and they're all alike. What can you come up with when you swim that low? Your output is a memory map. A bunch of segments. Base, size, bits, base, size, bits. Kinda limits your creativity. GUI toolkits? The next one is of course easier to master than the first one, but they are really different. What format do you use to specify layouts, which part is data-driven and which is spelled as code? How do you handle the case where the GUI is running on a remote machine? Which UI components are built-in? Do you have a table control with cell joining and stuff or just a list control? Does your edit box check spelling? How? I want to use my own dictionary! Which parts of the behavior of existing controls can be overridden and how? Do you use native widgets on each host, surprising users who switch platforms, or roll your own widgets, surprising the users who don't?

HTML and Qt are both essentially UI platforms. Counts as "different enough" for me. Inevitably, both suck in different ways which you find out after choosing the wrong one (well, it may be obvious with those two from the very beginning; Qt and gtk are probably a better example). Porting across them? Ha.

The fundamental issue is, for many lower-level problems there's The Right Answer (IEEE floating point). Occasionally The Wrong Answer gains some market share and you have to live with that (big endian; lost some market share recently). With higher-level things, it's virtually impossible to define which answer is right. This interacts badly with the ease of hacking up your own incompatible higher-level nightmare. Which brings us to…

Problem 6: everybody thinks high-level is easy, on the grounds that it's visibly faster. You sure can develop more high-level functionality in a given time slot compared to the lower-level kind. So what? You can drive faster than you can walk. But driving isn't easier; everybody can walk, but to drive, you need a license. Perhaps that was the thing mentioned by the Amazing (Ex-Web) Developer: at least truck drivers have licenses. But I'm not sure that's what he said. I'll tell you what I do know for sure: every second WordPress theme I tried was broken out of the box, in one of three ways: (1) PHP error, (2) SQL error and (3) a link pointing to a missing page. WordPress themes are written in CSS and PHP. Every moron can pick up CSS and PHP; apparently, every moron did pick them up. Couldn't they keep the secret at least from some of them? Whaaaam! The speedy 5-ton truck goes right into the tree. Pretty high-level leaves fall off, covering the driver's bleeding corpse under the tender rays of sunset. And don't get me started on the WordPress entry editing window.

Now, while every moron tries his luck with higher-level coding, it's not like everyone doing high-level coding is… you get the idea. The converse claim is not true. In fact, this entry is all about how the converse claim isn't true. There are lots of brilliant people working on high-level stuff. The problem is, they are not alone. The higher your abstraction level, the lower the quality of the average code snippet you bump into. Because it's easy to hack up by the copy-and-paste method, it sorta works, and if it doesn't, it seems to do, on average, and if it broke your stuff, it's quite likely your problem, remember?

Problem 7: it's not just the developers who think it's oh-so-easy. Each and every end user thinks he knows exactly what features you need. Each and every manager thinks so, too. Sometimes they disagree, and no, the manager doesn't always think that "the customer is always right". But that's another matter. The point here is that when you do something "easy", too many people will tell you how it sucks, and you have to just live with that (of course having 100 million users can comfort you, but that is still another matter, and there are times when you can't count on that).

I maintain low-level code, and people are sort of happy with it. Sometimes I think it has sucky bits, which get in your way. In these cases, I actually have to convince people that these things should be changed, because everybody is afraid to break something at that level. Hell, even bug fixes are treated like something nice you've done, as if you weren't supposed to fix your goddamn bugs. Low-level is intimidating. BITS! REGISTERS! HEXADECIMAL! HELP!!

Some people abuse their craft and actively intimidate others. I'm not saying you should do that; in fact, this entry is all about how you shouldn't do that. The people who do it are bastards. I've known such a developer; I call him The Bastard. I might publish the adventures of The Bastard some day, but I need to carefully consider this. I'm pretty sure that the Amazing Developer won't mind if someone recognizes him in a publicly available page, but I'm not so sure about The Bastard for some reason or other.

What I'm saying is, maintaining high-level code is extremely hard. Making a change is easy; making it right without breaking anything isn't. You can drive into a tree in no time. High-level code has a zillion requirements, and as time goes by, the chance that many of them are implicit and undocumented and nobody even remembers them grows. People don't get it. It's a big social problem. As a low-level programmer, you have to convince people not to be afraid when you give them something. As a high-level programmer, you have to convince them that you can't just give them more and more and MORE. Guess which is easier. It's like paying, which is always less hassle than getting paid. Even if you deal with a Large, Respectable Organization. Swallowing is easier than spitting, even for respectable organizations. Oops, there's an unintended connotation in there. Fuck that. I'm not editing this out. I want to be through with this. Let's push forward.

The most hilarious myth is that "software is easy to fix"; of course it refers to application software, not "system" software. Ever got an e-mail with a ">From" at the beginning of a line? I keep getting those once in a while. Originally, the line said "From" and then it got quoted by sendmail or a descendant. The bug has been around for decades. The original hardware running sendmail is dead. And that hardware had no bugs. The current hardware running sendmail has no bugs, either. Those bugs were fixed somewhere during the testing phase. Application software is never tested like hardware. I know, because I've spent about 9 months, the better part of 2007, writing hardware tests. Almost no features; testing, exclusively. And I was just one of the few people doing testing. You see, you can't fix a hardware bug; it will cost you $1M, at least. The result is that you test the hardware model before manufacturing, and you do fix the bug. But with software, you can always change it later, so you don't do testing. In hardware terms, the testing most good software undergoes would be called "no testing". And then there's already a large installed base, plus quick-and-dirty mailbox-parsing scripts people wrote, and all those mailboxes lying around, and no way to make any sense of them without maintaining bugward compatibility (the term belongs to a colleague of mine, who – guess what – maintains a high-level code base). So you never fix the bug. And most higher-level code is portable; its bugs can live forever.
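
For the curious, the mechanism behind ">From" is roughly this: in the mbox mailbox format, a line starting with "From " marks the beginning of a new message, so whatever appends a message to the mailbox has to escape body lines that happen to start that way. A sketch of that quoting step (not sendmail's actual code):

    #include <stdio.h>
    #include <string.h>

    /* Appends one body line to an mbox file, escaping "From " so the line
       isn't later mistaken for the start of a new message. This is a sketch
       of the quoting rule, not anybody's real delivery agent. */
    static void append_body_line(FILE *mbox, const char *line)
    {
        if (strncmp(line, "From ", 5) == 0)
            fputc('>', mbox);    /* and that's where ">From" comes from */
        fputs(line, mbox);
        fputc('\n', mbox);
    }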

And the deadlines. The number of versions of software you can release. 1.0, 1.1, 1.1.7, 1.1.7.3… The higher your abstraction level, the more changes they want, the more intermediate versions and branches you'll wind up with. And then you have to support all of them. Maybe they have to be able to read each other's data files. Maybe they need to load each other's modules. And they are high-level. Lots of stuff below each of them, lots of functionality in them. Lots of modules and data files. Lots of developers, some of whom mindlessly added features which grew important and must be supported. Damn…

I bet that you're convinced that "lower-level" correlates with "easier" by now. Unless you got tired and moved elsewhere, in which case I'm not even talking to you. QED.

Stay tuned for The Adventures of The Bastard.

Blogging is hard

I've already written some stuff here. I read it again and wiped it out. It was self-righteous. I hate self-righteous. I could talk about how I hate self-righteous, but I won't, because that would be self-righteous. See? It's hard to blog without being self-righteous.

I mostly wanted a technical blog, with an occasional sprinkle of life in it, like a picture or something. But mostly technical. Tech blogs I like fall into two overlapping categories: informative and entertaining. Sometimes both. Myself, I sure manage to deliver both in the physical world. "Could you look at that bug I bumped into?" "Yeah, let's look at that, aha, oh, not this code, SHIT, this thing sucks, it's a torrent of shit, man, it's a ShitTorrent we have here. Ewww, this is so disgusting, wait, what do you mean _next==-1, -1 MY ASS, what the hell… um… get_what?! Here's that stupid fucking bug! Have a nice day." See? Informative and entertaining. Sometimes I even gather little audiences looking over my shoulder when I debug, all because profanity is my number one debugging tool. Catches all the bugs in a snap. Trust me.

And that is loads of fun. Trouble is, it's not necessarily the kind of thing people want to get as a response to their next HTTP request. So I think I'll go for "informative" as first priority. If it works out at all. I have this problem with scaling communication. According to my estimations, the quality of my communication is inversely proportional to the number of people listening (or people that I think are listening). That is, you get 100% face-to-face with nobody else around, 50% if there's two of you and so on. All the way down to a whopping 1% of my exceptional rhetorical skills when I think I'm talking to an audience.

With this kind of personality, blogging is pretty hard. Seems like there's no reason to bother, then, but I think I will, because there's stuff in my brain that wants to come out. I recently spoke to a guy with lots of experience, in a broad sense. He's previously told me a couple of times how it would be wiser to keep my mouth shut in certain contexts. But this time, he said, "sometimes it's extremely hard to hold it when I hear something stupid". What I think happens is, our brains are really cells of a larger brain (shaped like the Internet, of course); when stuff wants to come out, they have to let it out.

And this is going to be in English, because writing about programming in Russian or Hebrew, which are my other options, is frigging ridiculous. Fellow Russians and/or Israelis, stop doing that! You have to put so many English words into your text, like "threads", "namespaces", "closures", that you end up switching languages twice per sentence. Or you can use those moronic translations of such words. That still counts as switching between languages – one good one and one stupid one. Just write the whole thing in English; makes it way more machine-readable, too (you know, vi). And if your English, like mine, is really just a first-order approximation rather than the real thing, it's not your problem – you won't notice. The native English speakers shouldn't have had their ancestors conquer that much land; now, they'll just have to put up with the consequences.