"High-level CPU": follow-up

February 2nd, 2008

This is a follow-up on the previous entry, the "high-level CPU" challenge. I'll try to summarize the replies and my opinion on the various proposals. But first, a summary of my original points:

  1. "Very" high-level languages have a cost. Attributing this cost to the underlying hardware architecture is wrong. You could move the cost from software to hardware, but that wouldn't eliminate it. I primarily referred to languages characterized by indirection levels and late binding of user-defined operations, such as Lisp and Python, and to a lesser extent/confidence to side-effect-free languages like Haskell. I didn't mean to say that high-level languages should not be used, in fact I think that their cost is wildly overestimated by many. However, denying the existence of any intrinsic cost guarantees that people will keep overestimating it, because if it weren't that high a cost, why would you lie to them? I mean it very seriously; horrible tech marketing is responsible for the death (or coma) of many great things.
  2. Of all systems with similar cost and features, the one that has the least stuff implemented in hardware is the best, because you can change more things. The idea that moving things to hardware is a sure way to make them efficient is a misconception. Hardware can't do "anything in one cycle"; there are many constraints involved. Therefore, it's better to let the software explicitly control a set of low-level components than build hardware logic implementing high-level interfaces to them. For example, to add 2 numbers on a RISC machine, you load them to registers, then add. You could have a command adding operands from memory; it wouldn't run faster, because the hardware would have to spend cycles on loading operands to (implicit) registers. Hardware doesn't have to be a RISC machine, but it's always better to move as much control to software as possible under the given system cost constraints.

I basically asked people to refute point 1 ("HLLs are costly"). What follows describes the attempts people made at it.

Computers you can't program

rainbow-mandala.png

Several readers managed to ignore my references to specific high-level languages and used the opportunity to pimp hardware architectures that can't run those languages. Or any other programming languages designed for human beings, for that matter. Example architectures:

It is my opinion that the fans of this family of hardware/vaporware, consistent advocates of The New Age of Computing, have serious AI problems. Here's a sample quote on cellular automata: "I guess they really are like us." Well, if you want to build a computing device in order to have a relationship with it, maybe a cellular automaton will do the trick. Although I'd recommend to first check the fine selection of Homo Sapiens we have here on Planet Earth. Because those come with lots of features you'd like in a friend, a foe, a spouse or an employee already built-in, while computer hardware has a certain gap to fill in this department.

cellular-automaton.png

Me, I want to build machines to do stuff that someone "like us" wouldn't want to do, for any of the several reasons (the job is hard/boring/stinky/whatever). And once I've built them, I want people to be able to use them. Please note this last point. People and other "nature's computers", like animals and fungi, aren't supposed to be "used". In fact, all those systems spend a huge amount of resources to avoid being used. Machines aren't supposed to be like that. Machines are supposed to do what you want. Which means that both the designer and the user need to control them. Now, a computer that can't even be tricked into parsing HTML in a straightforward way doesn't look like it's built to be controlled, does it?

Let me supply you with an example: Prolog. Prolog is an order of magnitude more tame than a neural net (and two orders of magnitude compared to a cellular automaton) when it comes to "control" – you can implement HTML parsing with it. But Prolog does show alarming signs of independence – it spends most of its time in its inference engine, an elaborate mechanism running lengthy non-trivial loops, which sometimes turn out to be infinite. You aren't supposed to single-step those loops; you're supposed to specify truths about your world, and Prolog will derive more truths for you. Prolog was supposed to be the wave of the future about 25 years ago. I think it can be safely called dead by now, despite the fair amount of money poured into it. I think it died because it's extremely frustrating to use – you just can't tell why the hell it worked that way in each particular case. I've never seen anything remotely as annoying as Prolog, with the notable exception of Makefiles, running on top of a wonderful inference engine of their own.

My current opinion is that neural networks rarely deserve a special hardware implementation – if you need them, build a more traditional computer and run them on top of that; and cellular automata are just stillborn. I might be wrong in the sense that a hardware implementation of these models is the optimal solution for some problem, hence we'll see those beasts in some corner of a successful real-world system. But the vast majority of computing, including AI apps, will run on machines that support basic bread-and-butter programmer things simply and straightforwardly. Here's a Computing Technology Acceptance Lower Bound for ya: if you can't parse a frigging log file with it, you can't do anything with it.

Self-assembly computers

Our next contestant is a machine that you surely can program, once you've built it from the pieces which came in the box. Some people mentioned "FPGA", others failed to call it by its name (one comment mentioned a "giant hypercube of gates", for example). In this part, I'm talking about the suggestions to use an FPGA without further advice on exactly how it should be used; that is, FPGA as the architecture, not FPGA used to prototype an architecture.

Maybe people think that with an FPGA, "everything is possible", so in particular, you could easily build a processor efficiently implementing a HLL. Well, FPGA is just a way to implement hardware allowing you to trade NRE for unit cost. And with hardware, some things are possible and some aren't, or so I claim – for example, you can't magically make the cost of HLLs go away. If you can't think of a way to reduce the overhead HLLs impose on the system cost, citing FPGA doesn't make your argument look any better. On the contrary – you've saved NRE, but you've raised the cost of the hardware by the factor of 5.

Another angle: can you build a compiler? Probably so. Would you like to start your project with building a compiler? Probably not. Now, what makes people think that they want to build hardware themselves? I really don't know. Building hardware is gnarly, FPGA or not – there are lots of constraints you have to think about to make the thing efficient, and it's extremely easy to err on the side of not having enough flexibility. The latter typically happens because you try to implement overly high-level interfaces; it then turns out that you need the same low-level components to do something slightly different.

And changing hardware isn't quite as easy as changing software, even with FPGA, because hardware description code, with its massive parallelism and underlying synthesis constraints, is fairly tricky. FPGA is a perfectly legitimate platform for hardware vendors, but an awful interface for application programmers. If you deliver FPGAs, make it your implementation detail; giving it to application programmers isn't very likely to make them happy in the long run.

At the other end of the spectrum, there's the kind of "self-assembly computer" that reassembles itself automatically, "adapting to the user's needs". Even if it made any sense (and it doesn't), it still wouldn't answer the question: how should this magical hardware adapt to handle HLLs, for example, indirect memory access?

Actual computers designed to run HLLs

Some people mentioned actual hardware which was built to run HLLs, including Reduceron, Tcl on Board, Lisp Machines, Rekursiv, and ARM's Jazelle instruction set. For some reason, nobody mentioned Intel's 432, an object-oriented microprocessor which was supposed to replace x86, but was, among other things, too slow. This illustrates that the existence of a "high-level processor" doesn't mean that it was a good idea (of course it doesn't mean the opposite, either).

I'll now talk about these machines in increasing order of my confidence that the architecture doesn't remove the overhead posed by the HLL it's supposed to run.

Stock computers with bells and whistles

Finally, there was a bunch of suggestions to add specific features to traditional processors.

The good stuff

At the bottom line, there were two hardware-related things which captured my intoxicated imagination: the Reduceron and content-addressable memories. If anything ever materializes around this, I'll send out some samples. In the meanwhile – thanks!