I'm not that good at functional programming and the related discussion of the benefits and drawbacks of side effects. Please forgive me if I'm utterly wrong – or if I'm trivially right because I'm saying something elementary.
So the thing is, very basic complexity analysis that we all use assumes a von Neumann machine with a random access memory. Random access means that you can read an arbitrary array element, and that you can write it.
Suppose we throw away the ability to write array elements, which is a side effect. Instead, we replace it with the "functional" ability to create a new array having the same elements as the old one at all indexes except for one – the one we'd write if we had side effects. So we create an array like this: {old,...,old,new,old,...,old}.
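In Haskell, the (//) operator on immutable Data.Array arrays is one concrete form of this operation. A minimal sketch (the wrapper name is mine):

    import Data.Array (Array, (//))

    -- A sketch of the "functional write": (//) builds a new immutable array
    -- identical to the old one except at the updated index. In the standard
    -- implementation this copies all N elements, so one nominal write costs O(N).
    functionalWrite :: Array Int Int -> Int -> Int -> Array Int Int
    functionalWrite old i new = old // [(i, new)]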
In this sense, aren't we getting a fundamentally inefficient system? We took an O(1) operation and replaced it with an O(N) operation. We don't have to use the new O(N) operation every time we previously used the old O(1) one, of course – we can use an entirely different algorithm instead. But we must be losing something.
For instance, how do you compute a histogram? With side effects, you have a loop over your elements doing histogram[array[i]]++; this is O(N), where N is the number of elements.
Without side effects, we can create a new histogram at every step – this is O(N*K), where K is the number of histogram bins. Or we can sort the array – O(N*log(N)) – and then have an O(N) pass over the sorted array, consing every time a new element is found. Both options are asymptotically slower, not just programming-language-shootout slower. Did I miss an O(N) option?
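For concreteness, here's roughly what the two pure options look like in Haskell – a sketch assuming bin indices lie in [0, K-1]; the function names are mine:

    import Data.List (foldl', group, sort)

    -- Option 1: a "new histogram at every step". The histogram is an immutable
    -- list of k counters, and every element forces rebuilding all k of them,
    -- hence O(N*K).
    histogramRebuild :: Int -> [Int] -> [Int]
    histogramRebuild k = foldl' bump (replicate k 0)
      where
        bump hist x = [ if i == x then n + 1 else n | (i, n) <- zip [0 ..] hist ]

    -- Option 2: sort first – O(N*log(N)) – then one O(N) pass over the sorted
    -- data, consing a (bin, count) pair whenever a new element is found.
    histogramSorted :: [Int] -> [(Int, Int)]
    histogramSorted xs = [ (head g, length g) | g <- group (sort xs) ]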
I assume that the people most likely to point out the drawbacks of side effects are also likely to care about asymptotic complexity, because both questions are very basic CS questions. What's the standard workaround for the loss of asymptotic efficiency, then?
One answer could be a smart compiler – smart enough to implement the logical creation of an updated histogram as an update to the memory keeping the old histogram.
But it seems that if we had a compiler that smart, then side effects wouldn't be a problem in the first place. An imperative program full of side effects can be trivially implemented as a recursive function taking a state (the entire state of the memory – one large array), creating a new state, and calling itself with this state.
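A toy illustration of that emulation (the names are mine): the whole memory is one immutable array, each "instruction" returns a new memory, and the program threads that state through recursion – nominally pure, yet every write is an O(memory size) copy unless the compiler elides it.

    import Data.Array (Array, (//))

    type Memory = Array Int Int

    -- One "instruction": write value v to address a. Nominally pure – the old
    -- memory is untouched – but it copies the entire memory unless elided.
    store :: Memory -> Int -> Int -> Memory
    store mem a v = mem // [(a, v)]

    -- "Run" a program given as a list of writes by threading the memory state
    -- through recursion, exactly as an imperative program would mutate it.
    run :: Memory -> [(Int, Int)] -> Memory
    run mem []            = mem
    run mem ((a, v) : ws) = run (store mem a v) ws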
The reason people object to side effects seems related to the fact that such a program is "functional" only in a pointlessly theoretical sense. The nominal lack of side effects gives you no actual added ability to analyze the data flow of such programs.
I haven't formally proved it with this example, of course, but it seems to me that the nominally functional operation of creating a new array just like the old one, except for a single modified element, is not a very good one. The copy can sometimes be hard or impossible to elide – as in the "imperative program emulation" example – and when it can't be elided, the need to fall back to actually copying is just too disastrous.
I guess this is why functional languages have cons – make a new list from a new element (the head) and an old list (the tail) – but don't have a similar "array consing" primitive for inserting a new element into a random position of an old array.
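The difference in a nutshell (a sketch, name is mine): cons shares the old list as the tail of the new one, so nothing is copied, whereas the flat-array update above has no such sharing and must copy.

    -- O(1): the new list physically shares 'old' as its tail; 'old' itself
    -- remains valid and unchanged.
    consExample :: Int -> [Int] -> [Int]
    consExample x old = x : old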
Now, in the absence of a compiler smart enough for efficient "array consing", people seem to be using mutable data structures, such as Haskell's mutable arrays. This seems to give up some of the correctness guarantees gained by not having side effects. For example:
The freezeArray action converts a mutable array into an immutable one by copying, whereas unsafeFreezeArray returns an immutable array that is effectively just the type cast version of the mutable array. Should you write to the mutable array after it has been (unsafely) frozen, you'll side-effect the immutable array in the process. Please don't :-)
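Concretely, the earlier histogram can be written against these mutable arrays in the ST monad – a sketch assuming bin indices in [0, K-1], with the name being mine. Here runSTArray hands back an immutable array at the end without the caller ever seeing the mutable one, so the freeze is safe by construction:

    import Control.Monad (forM_)
    import Data.Array (Array)
    import Data.Array.ST (newArray, readArray, writeArray, runSTArray)

    -- The histogram again, this time with in-place updates: O(N) writes to a
    -- mutable array inside ST, returned as an immutable Array at the end.
    histogramST :: Int -> [Int] -> Array Int Int
    histogramST k xs = runSTArray $ do
      hist <- newArray (0, k - 1) 0
      forM_ xs $ \x -> do
        n <- readArray hist x
        writeArray hist x (n + 1)
      return hist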
Bugs introduced by side effects coupled with aliasing – you modified something at the wrong time and someone else then references that something – are back.
So what's the point of shunning side effects if you're forced to have them anyway to do some of your computations asymptotically efficiently? Is the point having a clear separation between "pure" code and side-effecting code?
From a "social" more than a "formalistic" standpoint, such a separation indeed makes a lot of sense when "pure" code computes and side-effecting code does I/O. It's awesome to be able to tell pure computations from I/O, and it's actually a pity that imperative language typically allow people to litter computational code with I/O as easily as they do.
But if side effects are also necessary for computational code, then, again from a "social" standpoint, the pure/impure separation doesn't seem like a stable thing that is likely to correspond to how people think of boundaries between parts of the system. A piece of code that doesn't use histograms today might use them tomorrow, and then its purity will be gone.
Generally it seems like a whole lot of algorithms require writable RAM to be efficient. Hough transforms are "glorified histograms". A bunch of algorithms processing graphs or matrices update "rather random" indexes/counters iteratively, with no very clear way to somehow sort these updates so that they are sufficiently non-random to become list consing or similar.
I'm not saying pure functional languages aren't fun or productive for you – a very valid "social" argument in favor of purity. I just wonder how far purity can take you from a complexity analysis standpoint.
If it's not very far, or if it's not very clear how far, then why not have side effects "by default", and use data & control flow analysis to identify places where there are no side effects after all, or where their impact is limited enough (as in, i++ is a nominal side effect but a trivial one)?
In particular, how can purity be "the" way to safe parallelism if it robs you of asymptotic performance, and you're after parallelism to get performance?
[I'm not at all sure I get this right, so I'm asking a "real" question, not a rhetorical one.]